Error: ... overrides final method tokenStream

Tristan Slominski Fri, 06 Apr 2012 06:22:08 -0700

Hello group,

I managed to get Mahout running, which is awesome! But I keep running into
issues that break the Hadoop jobs that Mahout launches.


For example, when I follow the Wikipedia Naive Bayes example, my Hadoop jobs
fail during the wikipediaDataSetCreator step with:

Error: class org.apache.lucene.analysis.ReusableAnalyzerBase overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;

So, I decided to try the examples in the example folder within Mahout.

The classify-20newsgroups.sh example works just fine.

Then I tried to run the cluster-reuters.sh example, and the Hadoop jobs broke with:

Error: class org.apache.mahout.vectorizer.DefaultAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;

I did this on the latest Mahout 0.7-SNAPSHOT built from source, and on the
packaged Mahout 0.6.

From reading about it, it appears that the problem stems from the Lucene
project enforcing a final restriction on the tokenStream method that returns
org.apache.lucene.analysis.TokenStream. So, to at least get things running
despite that restriction, I tried to find a way to build the lucene-analysis
project from scratch and generate a separate jar without the final
restriction, but I'm somewhat lost in the size of that project right now.

What are you doing to get around this issue? Am I doing something wrong?
Using a wrong version of something, perhaps? Again, I've built the latest
0.7-SNAPSHOT from source and I've used the packaged Mahout 0.6, with the same
problems in both.
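
One way to test the wrong-version theory is to ask the JVM which jar a class
was actually loaded from and whether it declares the method as final. The
following is only a minimal sketch under stated assumptions: ClasspathCheck is
a hypothetical helper name (not part of Mahout, Hadoop, or Lucene), and
defaulting to org.apache.lucene.analysis.Analyzer assumes that the final
tokenStream sits in the parent class of the two classes named in the errors.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Mahout, Hadoop, or Lucene.
public class ClasspathCheck {

    // Where a class was loaded from: a jar/directory URL, or "(bootstrap)"
    // when the JVM reports no code source (typical for JDK classes).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap)";
        }
        return src.getLocation().toString();
    }

    // True if the class itself declares a final method with this name.
    static boolean declaresFinalMethod(Class<?> cls, String methodName) {
        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().equals(methodName) && Modifier.isFinal(m.getModifiers())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed defaults; pass another class and method name as arguments.
        String className = args.length > 0 ? args[0] : "org.apache.lucene.analysis.Analyzer";
        String methodName = args.length > 1 ? args[1] : "tokenStream";
        try {
            // Load without initializing, using this class's own class loader.
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            System.out.println(className + " loaded from " + locationOf(cls));
            System.out.println("declares final " + methodName + ": "
                    + declaresFinalMethod(cls, methodName));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Running this with the same classpath that the failing Hadoop task uses should
show which Lucene jar wins. If the reported jar is a different Lucene version
than the one Mahout was built against, that would point to a classpath
conflict rather than something that needs a patched Lucene build.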

Cheers,

Tristan
