Grant, I'm trying to generate the sequence vectors using the SnowballAnalyzer instead of the StandardAnalyzer. I've already gone through this process with the StandardAnalyzer and plotted the output clusters from the k-means dump file, so I'm familiar with clustering in Mahout. I'd like to repeat the exercise with the SnowballAnalyzer by running the following command:
    ./mahout seq2sparse -s 2 -a org.apache.lucene.analysis.snowball.SnowballAnalyzer -chunk 100 -i /home/hadoop/tmp/trecdata-seqfiles/chunk-0 -o /home/hadoop/tmp/trecdata-vectors -md 1 -x 75 -wt TFIDF -n 0

Here is what I've done so far:

1) I placed the lucene-snowball jar in the m2 repository:
   /home/delroy/.m2/repository/org/apache/lucene/lucene-snowball/2.9.1

2) I updated Mahout_CORE/pom.xml to reflect the dependency:

    <!-- updated by Delroy to use Snowball Analyzer -->
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-snowball</artifactId>
      <version>2.9.1</version>
    </dependency>

3) I then ran mvn install on Mahout_CORE and on Mahout_ROOT, which downloaded the lucene-snowball pom and lucene-snowball pom sha1 to the m2 repository.

The error seems to stem from developer code, which incidentally notes that you should not instantiate the analyzer, at SparseVectorsFromSequenceFiles.java:176. Any suggestions here?

Output:

    Exception in thread "main" java.lang.InstantiationException: org.apache.lucene.analysis.snowball.SnowballAnalyzer
        at java.lang.Class.newInstance0(Class.java:357)
        at java.lang.Class.newInstance(Class.java:325)
        at org.apache.mahout.text.SparseVectorsFromSequenceFiles.main()
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)

PS: I just love the spam filter. It won't let me write too many variants of the word Analyzer because it contains the word anal.
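One possible explanation, not confirmed anywhere in this thread: the trace shows the exception coming out of java.lang.Class.newInstance(), and that method throws InstantiationException when the target class has no no-argument constructor. As far as I can tell the Snowball analyzer in Lucene 2.9.x only has constructors that take arguments (such as the stemmer name), which would fit. The sketch below reproduces the same behaviour without Lucene or Mahout on the classpath. NamedStemmer and EnglishStemmer are hypothetical stand-ins invented for illustration, not real Lucene or Mahout classes.

```java
public class ReflectiveInstantiationDemo {

    // Hypothetical stand-in for an analyzer whose only constructor takes an argument.
    public static class NamedStemmer {
        private final String language;

        public NamedStemmer(String language) {
            this.language = language;
        }

        public String language() {
            return language;
        }
    }

    // Hypothetical wrapper that adds a no-argument constructor with a fixed language.
    public static class EnglishStemmer extends NamedStemmer {
        public EnglishStemmer() {
            super("English");
        }
    }

    // Creates an instance through Class.newInstance(), the call seen in the stack trace.
    // Class.newInstance() is deprecated in newer JDKs but still behaves this way.
    @SuppressWarnings("deprecation")
    public static String tryInstantiate(Class<?> type) {
        try {
            Object instance = type.newInstance();
            return "created " + instance.getClass().getSimpleName();
        } catch (InstantiationException e) {
            // Thrown when the class has no no-argument constructor.
            return "InstantiationException";
        } catch (IllegalAccessException e) {
            return "IllegalAccessException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(NamedStemmer.class));
        System.out.println(tryInstantiate(EnglishStemmer.class));
    }
}
```

If this is indeed the cause, a small subclass of the analyzer with a no-argument constructor, passed to -a in place of the Snowball class, might be one way around it. I have not tested that against Mahout.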
--cheers
Delroy