Canopy clustering fails if -Dmapred.job.queue.name=unfunded is specified on 
the mahout driver command line
-------------------------------------------------------------------------------------------------------

                 Key: MAHOUT-578
                 URL: https://issues.apache.org/jira/browse/MAHOUT-578
             Project: Mahout
          Issue Type: Bug
          Components: Clustering
    Affects Versions: 0.4
         Environment: Linux 2.6.18-164.el5 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Shige Takeda


Hi, I would like to demonstrate a problem with the -D option through one 
concrete example, and to propose a fix.

When I run canopy clustering through the mahout driver, the command line looks 
something like this. NOTE: -Dmapred.job.queue.name is a required job 
configuration property in the company's environment.

$MAHOUT_HOME/bin/mahout canopy \
        -i input \
        -o output \
        -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
        -t1 2.0 \
        -t2 0.05 \
        -cl \
        -Dmapred.job.queue.name=unfunded

and I get the error:

Running on hadoop, using HADOOP_HOME=/grid/0/gs/hadoop/current
HADOOP_CONF_DIR=/grid/0/gs/conf/current
11/01/05 20:19:15 ERROR common.AbstractJob: Unexpected 
-Dmapred.job.queue.name=unfunded while processing Job-Specific Options:

This is because the -D parameter is NOT parsed by ToolRunner.run but is 
instead passed through to CanopyDriver's command-line option parser.

ToolRunner.run(Tool,String[]) should be used rather than 
ToolRunner.run(Configuration,Tool,String[]) so that the -D parameter is 
processed.
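To make the expected behaviour concrete, here is a minimal sketch in plain Java 
of the separation step described above: -D key=value pairs are pulled out into 
job configuration properties, and only the remaining arguments reach the 
job-specific option parser. The class GenericOptionSketch and its split method 
are hypothetical names made up for this illustration. This is not Hadoop's or 
Mahout's actual code, and the real parser may treat argument order and other 
options differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from Hadoop or Mahout.
class GenericOptionSketch {

    // Holds the extracted -D properties and the arguments left over
    // for the job-specific option parser.
    static final class Result {
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    // Separates "-Dkey=value" and "-D key=value" from all other arguments.
    static Result split(String[] args) {
        Result result = new Result();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            boolean usesNextArg = false;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[i + 1];          // "-D key=value" form
                usesNextArg = true;
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);     // "-Dkey=value" form
            }
            int eq = (pair == null) ? -1 : pair.indexOf('=');
            if (eq > 0) {
                result.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
                if (usesNextArg) {
                    i++;                     // skip the consumed "key=value" token
                }
            } else {
                // Not a well-formed -D option: leave it for the job-specific parser.
                result.remaining.add(arg);
            }
        }
        return result;
    }

    public static void main(String[] unused) {
        // The same arguments as in the failing canopy command.
        String[] args = {
            "-i", "input", "-o", "output",
            "-dm", "org.apache.mahout.common.distance.EuclideanDistanceMeasure",
            "-t1", "2.0", "-t2", "0.05", "-cl",
            "-Dmapred.job.queue.name=unfunded"
        };
        Result r = split(args);
        System.out.println("job config properties: " + r.properties);
        System.out.println("job-specific options:  " + r.remaining);
    }
}
```

With this kind of separation in place, the canopy option parser would never 
see -Dmapred.job.queue.name=unfunded, so the "Unexpected ... while processing 
Job-Specific Options" error would not be raised.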
