Hi,

I'm having trouble running the KMeansDriver and I suspect that the problem is 
related to Adil's message. I'm in an environment which recently switched to 
Hadoop 0.20. I am no longer able to use hod as a scheduler. Furthermore, I'm 
forced to specify the queue (which unfortunately is not named default). This is 
normally done using -Dmapred.job.queue.name. Is there any way that I will be 
able to use Mahout, specifically the clustering code? When I run the 
KMeansDriver code with the -D option, it gives the following error message:

09/10/28 01:09:21 ERROR kmeans.KMeansDriver: Exception
org.apache.commons.cli2.OptionException: Unexpected -D while processing Options

On 9/14/09 3:19 PM, "Adil Aijaz" <[email protected]> wrote:

Hi folks,

I recently merged my vendor branch of Mahout with Mahout trunk and
found that Mahout now supports Hadoop 0.20. With Hadoop 0.20, we
can use the capacity scheduler instead of hod. There are
two ways to pass on the capacity scheduler queue name to a Mahout driver
class like KMeansDriver:

1. Have KMeansDriver extend 'Configured' and implement the 'Tool' interface
to allow command line specification of the scheduler queue name, as in
-Dmapred.job.queue.name=myqueuename
2. Add a jobConf.set() call while setting up the drivers.
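
To make the first option concrete: when a driver is launched through Hadoop's ToolRunner, the generic options such as -D are parsed into the job configuration first, and only the remaining arguments reach the driver's own option parser, so the parser never sees -D. The sketch below imitates that split in plain Java with no Hadoop dependency. GenericOptionSplitter and its fields are hypothetical names made up for illustration; they are not part of Hadoop or Mahout, and the driver arguments in the usage note are placeholders.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Hadoop or Mahout.
// It separates -Dkey=value settings from the arguments a driver parses itself.
final class GenericOptionSplitter {
    // Settings collected from -D options, in the order they were given.
    final Map<String, String> properties = new LinkedHashMap<>();
    // Everything else, in original order, for the driver's own option parser.
    final List<String> remaining = new ArrayList<>();

    static GenericOptionSplitter split(String[] args) {
        GenericOptionSplitter result = new GenericOptionSplitter();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[++i];            // form: -D key=value
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);     // form: -Dkey=value
            }
            if (pair == null) {
                result.remaining.add(arg);   // not a -D option, pass it through
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                        "Expected key=value after -D: " + pair);
            }
            result.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return result;
    }
}
```

Calling GenericOptionSplitter.split on the arguments -Dmapred.job.queue.name=myqueuename followed by some placeholder driver arguments would put mapred.job.queue.name in properties and leave only the driver arguments in remaining. A driver could then copy each property into its job configuration, which is also what the second option does by hand with a fixed key.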

Personally, I prefer the first solution. Are there any plans on updating
the various driver classes to support such capacity scheduler queues?
Either way, I can help out in the process.

Adil
