[ 
https://issues.apache.org/jira/browse/MAHOUT-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leon lee resolved MAHOUT-476.
-----------------------------

    Fix Version/s: 0.3
       Resolution: Not A Problem

I found the cause of the error when running the job.
Somebody had put a Lucene 2.4 jar in Hadoop's lib directory. Jars in Hadoop's 
lib directory take priority on the CLASSPATH, so when the Mahout example ran, 
it loaded the Lucene 2.4 jar instead of the newer Lucene version in Mahout's 
lib. That would explain the WikipediaTokenizer.addAttribute error in the log.
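
A conflict like this can be spotted by listing the jars from both lib 
directories in classpath order and checking for the same artifact at 
different versions. Below is a minimal Python sketch, assuming jars follow the 
usual <artifact>-<version>.jar naming. The function names are made up for 
illustration and are not part of Hadoop or Mahout.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: jars are named "<artifact>-<version>.jar", with the version
# starting at the first hyphen that is followed by a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def split_jar_name(filename):
    """Return (artifact, version), or None if the name does not match."""
    match = JAR_NAME.match(filename)
    if match is None:
        return None
    return match.group("artifact"), match.group("version")


def find_version_conflicts(jar_paths):
    """Return artifacts that appear with more than one version.

    jar_paths should be in classpath order, so the first entry listed for
    an artifact is the one a first-match class loader would pick.
    """
    seen = defaultdict(list)
    for path in jar_paths:
        parsed = split_jar_name(Path(path).name)
        if parsed is not None:
            artifact, version = parsed
            seen[artifact].append((version, str(path)))
    return {
        artifact: entries
        for artifact, entries in seen.items()
        if len({version for version, _ in entries}) > 1
    }


if __name__ == "__main__":
    import sys

    # Pass the lib directories in classpath order (highest priority first).
    jars = [jar for d in sys.argv[1:] for jar in sorted(Path(d).glob("*.jar"))]
    for artifact, entries in find_version_conflicts(jars).items():
        print(artifact)
        for version, path in entries:
            print(f"  {version}  {path}")
```

Saved under a hypothetical name such as find_jar_conflicts.py, it could be run 
with Hadoop's lib directory first and the directory holding Mahout's jars 
second. Any artifact printed with two versions is a candidate for the kind of 
shadowing described above.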





> bug when running 
> org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop
> -------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-476
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-476
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.3
>         Environment: hadoop 0.20.2
> mahout-0.3
> ubuntu
>            Reporter: leon lee
>             Fix For: 0.3
>
>
> When I follow the wiki instructions at 
> https://cwiki.apache.org/MAHOUT/wikipedia-bayes-example.html 
> (by the way, the Bayes example document in the wiki needs updating for 0.3) 
> to run step 5: 
> Create the countries based Split of wikipedia dataset. 
> I use the following command:
> $HADOOP_HOME/bin/hadoop jar 
> $MAHOUT_HOME/examples/target/mahout-examples-0.3.job  
> org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver -i 
> $MAHOUT_HOME/examples/work/wikipedia/chunks -o 
> $MAHOUT_HOME/examples/work/wikipediainput  -c  
> $MAHOUT_HOME/examples/src/test/resources/country.txt
> and it fails on Hadoop.
> The Hadoop log shows:
> Error: 
> org.apache.lucene.wikipedia.analysis.WikipediaTokenizer.addAttribute(Ljava/lang/Class;)Lorg/apache/lucene/util/Attribute

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
