Hi folks,

I recently merged my vendor branch of Mahout with Mahout trunk and found that Mahout now supports Hadoop 0.20. With Hadoop 0.20, we can use the capacity scheduler instead of HOD. There are two ways to pass the capacity scheduler queue name to a Mahout driver class such as KMeansDriver:

1. Have KMeansDriver extend 'Configured' and implement the 'Tool' interface, so that the scheduler queue name can be specified on the command line, as in -Dmapred.job.queue.name=myqueuename
2. Set the queue name with jobConf.set() while setting up the drivers.
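
To make the first option concrete, here is a minimal, dependency-free sketch of what happens to a -Dkey=value argument: it is peeled off the command line and stored in the job configuration, and the remaining arguments go on to the driver. This is not Hadoop or Mahout code. The class QueueOptionSketch, the ParsedArgs holder and the "--input points" arguments are made up for illustration, and only the -Dkey=value form is handled. In Hadoop itself, as far as I know, ToolRunner does this parsing through GenericOptionsParser before calling the Tool's run() method.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in only: this class is hypothetical and uses no Hadoop APIs. */
public class QueueOptionSketch {

    /** Hypothetical holder for the parsed configuration and the leftover driver arguments. */
    static final class ParsedArgs {
        final Map<String, String> conf = new HashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    /** Moves every -Dkey=value argument into conf; all other arguments are kept in order. */
    static ParsedArgs parse(String[] args) {
        ParsedArgs parsed = new ParsedArgs();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            // eq > 2 ensures there is a non-empty key between "-D" and "=".
            if (arg.startsWith("-D") && eq > 2) {
                parsed.conf.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                parsed.remaining.add(arg);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Hypothetical command line for a driver; only the -D option comes from the text above.
        String[] example = {"-Dmapred.job.queue.name=myqueuename", "--input", "points"};
        ParsedArgs parsed = parse(example);

        // Falls back to "default" here when no queue name was given (an assumption of this sketch).
        System.out.println(parsed.conf.getOrDefault("mapred.job.queue.name", "default"));
        System.out.println(parsed.remaining);
    }
}
```

The appeal of this approach is that the driver never needs to know about the queue property: any configuration key can be supplied the same way without touching the driver code.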

Personally, I prefer the first solution. Are there any plans to update the various driver classes to support capacity scheduler queues? Either way, I can help out in the process.

Adil
