Hi folks,
I recently merged my vendor branch of Mahout with Mahout trunk and
found that Mahout now supports Hadoop 0.20. With Hadoop 0.20, we can
use the capacity scheduler instead of HOD. There are two ways to pass
the capacity scheduler queue name to a Mahout driver class such as
KMeansDriver:
1. Have KMeansDriver extend 'Configured' and implement the 'Tool'
interface, so the scheduler queue name can be given on the command line,
as in -Dmapred.job.queue.name=myqueuename
2. Add a jobConf.set() call for the queue name while setting up the drivers.
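
To make the difference between the two options concrete, here is a rough
sketch that runs without Hadoop on the classpath. It is not Mahout or
Hadoop code: QueueOptionSketch, parseGenericOptions and queueFor are
hypothetical names, and a plain Map stands in for the job configuration.
In the real change, option 1 would rely on ToolRunner to strip generic
options such as -D before the driver's run() method sees its arguments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch only: a Map plays the role of the Hadoop job configuration.
class QueueOptionSketch {

    static final String QUEUE_KEY = "mapred.job.queue.name";

    // Hypothetical helper imitating option 1: copy -Dkey=value (or -D key=value)
    // pairs into conf and return the remaining arguments in their original order.
    static List<String> parseGenericOptions(String[] args, Map<String, String> conf) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length && args[i + 1].indexOf('=') > 0) {
                pair = args[++i];            // form: -D key=value
            } else if (arg.startsWith("-D") && arg.indexOf('=') > 2) {
                pair = arg.substring(2);     // form: -Dkey=value
            }
            if (pair != null) {
                int eq = pair.indexOf('=');
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else {
                remaining.add(arg);          // left for the driver's own parsing
            }
        }
        return remaining;
    }

    // Queue the job would be submitted to; "default" is used when nothing is set.
    static String queueFor(Map<String, String> conf) {
        return conf.getOrDefault(QUEUE_KEY, "default");
    }

    public static void main(String[] args) {
        // Option 1: the queue name arrives on the command line.
        Map<String, String> fromCli = new LinkedHashMap<>();
        List<String> driverArgs = parseGenericOptions(
                new String[] {"-Dmapred.job.queue.name=myqueuename", "--input", "in"}, fromCli);
        System.out.println(queueFor(fromCli) + " " + driverArgs);

        // Option 2: the driver sets the queue name itself while building the job.
        Map<String, String> hardCoded = new LinkedHashMap<>();
        hardCoded.put(QUEUE_KEY, "myqueuename");
        System.out.println(queueFor(hardCoded));
    }
}
```

With option 1 the queue can change per invocation without touching the
driver; with option 2 every driver needs its own way to receive the name.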
Personally, I prefer the first solution. Are there any plans to update
the various driver classes to support capacity scheduler queues?
Either way, I can help out in the process.
Adil