[ https://issues.apache.org/jira/browse/MAHOUT-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409560#comment-13409560 ]
jayghost edited comment on MAHOUT-1034 at 7/9/12 3:22 PM:
----------------------------------------------------------
I tried passing -D numLabels=500 as a generic option, but it produces a different error.
hadoop@master:~/program/mahout-distribution-0.7$ bin/mahout trainnb -D numLabels=5000 -i ~/Downloads/20news-bydate/20news-bydate-train-vectors/tfidf-vectors -o ~/Downloads/20news-bydate/model/ -el -li ~/Downloads/20news-bydate/labelindex -ow
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Warning: $HADOOP_HOME is deprecated.
Running on hadoop, using /home/hadoop/program/hadoop-1.0.1/bin/hadoop and HADOOP_CONF_DIR=/home/hadoop/program/hadoop-1.0.1/conf
MAHOUT-JOB: /home/hadoop/program/mahout-distribution-0.7/mahout-examples-0.7-job.jar
Warning: $HADOOP_HOME is deprecated.
12/07/09 23:18:27 WARN driver.MahoutDriver: No trainnb.props found on classpath, will use command-line arguments only
12/07/09 23:18:27 ERROR common.AbstractJob: Unexpected /home/hadoop/Downloads/20news-bydate/model/ while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
Generic Options:
 -archives <paths>              comma separated archives to be unarchived on the compute machines.
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -files <paths>                 comma separated files to be copied to the map reduce cluster
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -libjars <paths>               comma separated jar files to include in the classpath.
 -tokenCacheFile <tokensFile>   name of the file with the tokens
Unexpected /home/hadoop/Downloads/20news-bydate/model/ while processing Job-Specific Options:
Usage:
 [--input <input> --output <output> --labels <labels> --extractLabels --alphaI
 <alphaI> --trainComplementary --labelIndex <labelIndex> --overwrite --help
 --tempDir <tempDir> --startPhase <startPhase> --endPhase <endPhase>]
Job-Specific Options:
  --input (-i) input             Path to job input directory.
  --output (-o) output           The directory pathname for output.
  --labels (-l) labels           comma-separated list of labels to include in training
  --extractLabels (-el)          Extract the labels from the input
  --alphaI (-a) alphaI           smoothing parameter
  --trainComplementary (-c)      train complementary?
  --labelIndex (-li) labelIndex  The path to store the label index in
  --overwrite (-ow)              If present, overwrite the output directory before running job
  --help (-h)                    Print out help
  --tempDir tempDir              Intermediate output directory
  --startPhase startPhase        First phase to run
  --endPhase endPhase            Last phase to run
12/07/09 23:18:27 INFO driver.MahoutDriver: Program took 436 ms (Minutes: 0.007266666666666667)
How can I set the numLabels option? Please help. Thanks!
> ERROR in Naive Bayes Training(trainnb)
> --------------------------------------
>
> Key: MAHOUT-1034
> URL: https://issues.apache.org/jira/browse/MAHOUT-1034
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Affects Versions: 0.7
> Environment: Ubuntu 11.04
> Reporter: Leting Wu
> Priority: Critical
>
> When running either examples/classify-20newsgroups.sh or asf-email-examples.sh,
> trainnb always fails:
> {noformat}
> INFO mapred.JobClient: Task Id : attempt_201206281546_0003_m_000000_0, Status : FAILED
> java.lang.IllegalArgumentException
>     at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
>     at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>     at org.apache.hadoop.mapred.Child.main(Child.java:264)
> {noformat}