canopy clustering fails if --Dmapred.job.queue.name=unfunded is specified to
mahout driver command line
-------------------------------------------------------------------------------------------------------
Key: MAHOUT-578
URL: https://issues.apache.org/jira/browse/MAHOUT-578
Project: Mahout
Issue Type: Bug
Components: Clustering
Affects Versions: 0.4
Environment: Linux 2.6.18-164.el5 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Shige Takeda
Hi, I would like to demonstrate a problem with the -D option using one concrete
example, and to propose a fix.
When I run canopy clustering through the mahout driver, the command line looks
something like this. NOTE: -Dmapred.job.queue.name is a required job
configuration setting in the company's environment.
$MAHOUT_HOME/bin/mahout canopy \
-i input \
-o output \
-dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
-t1 2.0 \
-t2 0.05 \
-cl \
-Dmapred.job.queue.name=unfunded
and I get the error:
Running on hadoop, using HADOOP_HOME=/grid/0/gs/hadoop/current
HADOOP_CONF_DIR=/grid/0/gs/conf/current
11/01/05 20:19:15 ERROR common.AbstractJob: Unexpected
-Dmapred.job.queue.name=unfunded while processing Job-Specific Options:
This is because the -D parameter is NOT parsed by ToolRunner.run but is
passed through to CanopyDriver's command line option parser.
To get the -D parameter processed, ToolRunner.run(Tool,String[]) should be used
rather than ToolRunner.run(Configuration,Tool,String[]).
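
To make the expected behaviour concrete, here is a minimal sketch of the split that should happen before CanopyDriver sees its arguments: -Dkey=value pairs go into the job configuration and everything else is left for the job-specific parser. The class GenericOptionSketch and its split method are hypothetical names used only for illustration. This is not Hadoop's GenericOptionsParser and it makes no claim about how Hadoop orders or handles its generic options.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical illustration only: NOT Hadoop's GenericOptionsParser.
 * Separates -Dkey=value arguments from the job-specific arguments.
 */
public class GenericOptionSketch {
    /** Properties collected from -Dkey=value arguments, in order of appearance. */
    public final Map<String, String> properties = new LinkedHashMap<>();
    /** Arguments left over for the job-specific option parser. */
    public final List<String> remaining = new ArrayList<>();

    public static GenericOptionSketch split(String[] args) {
        GenericOptionSketch result = new GenericOptionSketch();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            // Treat an argument as a property only if it looks like
            // -D<non-empty key>=<value>; note that "-dm" does not match.
            if (arg.startsWith("-D") && eq > 2) {
                result.properties.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                result.remaining.add(arg);
            }
        }
        return result;
    }
}
```

Applied to the command line above, the sketch would put mapred.job.queue.name=unfunded into properties and leave -i, -o, -dm, -t1, -t2 and -cl with their values in remaining, so the job-specific parser would no longer report an unexpected option.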