Hi Sean,

In fact I was using Lucene version 3.6.0 (I saw that in the pom.xml),
but in my classpath I had Lucene version 4.0.0.

I changed the pom.xml to 4.0.0: <lucene.version>4.0.0</lucene.version>

But I still get the same error:
###
Exception in thread "main" java.lang.VerifyError: class org.apache.mahout.vectorizer.DefaultAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
###

Should I change something else? Or maybe Lucene 4.0 is too recent for Mahout?
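
To narrow this down, it may help to check which jar each class is really loaded from at runtime, since that can differ from what CLASSPATH or the pom.xml suggests. Below is a minimal sketch, assuming it is run with the same classpath as the failing job. ClassOrigin and describe are hypothetical names made up for this example and are not part of Mahout or Lucene. The version shown comes from the jar manifest, so it may print "unknown" for jars that do not declare one.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Mahout or Lucene): reports where a class
// was loaded from, which helps spot two Lucene versions on one classpath.
final class ClassOrigin {

    // Returns "<class> -> <jar or directory> (version: ...)", or a short note
    // when the class is missing, fails verification, or has no code source.
    static String describe(String className) {
        try {
            // initialize=false: load the class without running static initializers
            Class<?> c = Class.forName(className, false,
                    ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            String where = (src == null || src.getLocation() == null)
                    ? "no code source (JDK/bootstrap class)"
                    : src.getLocation().toString();
            Package p = c.getPackage();
            // Implementation-Version is read from the jar manifest, if present
            String version = (p == null) ? null : p.getImplementationVersion();
            return className + " -> " + where
                    + " (version: " + (version == null ? "unknown" : version) + ")";
        } catch (ClassNotFoundException | LinkageError e) {
            // VerifyError is a LinkageError, so the failure above lands here
            return className + " -> not found or not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("org.apache.lucene.analysis.Analyzer"));
        System.out.println(describe("org.apache.mahout.vectorizer.DefaultAnalyzer"));
    }
}
```

If the two lines point at jars built against different Lucene versions, that would be consistent with the VerifyError above.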



Thank you

-----Original Message-----
From: Sean Owen [mailto:[email protected]] 
Sent: Wednesday, 18 July 2012 22:52
To: [email protected]
Subject: Re: .txt to vector

This means you're using it with an incompatible version of Lucene. I think 
we're on 3.1. Check the version that Mahout depends upon and use at least that 
version or later.

On Wed, Jul 18, 2012 at 6:04 PM, Videnova, Svetlana < 
[email protected]> wrote:

> I'm working with Mahout. I'm trying to write a web service in Java 
> that takes the output of Solr and gives this file to Mahout. 
> So far the recommendation part works.
> Now I'm trying to cluster. For this I have to vectorize the output 
> of Solr.
> Do you have any idea how to do it, please? I was following 
> https://cwiki.apache.org/MAHOUT/creating-vectors-from-text.html
> but it doesn't work very well (or at all...).
>
> I'm trying to find out how to transform .txt files into vectors for Mahout 
> in order to cluster and categorize my information. Is it possible? I saw 
> that I have to use seqdirectory and seq2sparse.
>
> seqdirectory creates a file (with some numbers and everything...), so this 
> step is OK. But then when I use seq2sparse, it gives me this 
> error:
>
> csi@csi-SCENIC-W:/usr/local/apache-mahout-d6d6ee8$ ./bin/mahout seq2sparse --input ./examples/output/ --output ./toto/output/
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/local/apache-mahout-d6d6ee8/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/apache-mahout-d6d6ee8/examples/target/dependency/slf4j-jcl-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/apache-mahout-d6d6ee8/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 12/07/18 15:53:33 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 1
> 12/07/18 15:53:33 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 1.0
> 12/07/18 15:53:33 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
> Exception in thread "main" java.lang.VerifyError: class org.apache.mahout.vectorizer.DefaultAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
>     at java.lang.ClassLoader.defineClass1(Native Method)
>     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>     at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>     at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>     at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:199)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:55)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>
> I'm using only Lucene 4.0!
>
> CLASSPATH=/opt/lucene-4.0.0-ALPHA/demo/lucene-demo-4.0.0-ALPHA.jar:/opt/lucene-4.0.0-ALPHA/core/lucene-core-4.0.0-ALPHA.jar:/opt/lucene-4.0.0-ALPHA/analysis/common/lucene-analyzers-common-4.0.0-ALPHA.jar:/opt/lucene-4.0.0-ALPHA/queryparser/lucene-queryparser-4.0.0-ALPHA.jar:.
>
> Where am I going wrong, please?
>
>
> Thank you all
> Regards
>
>
>
>
>
>
> Think green - keep it on the screen.
>
> This e-mail and any attachment is for authorised use by the intended
> recipient(s) only. It may contain proprietary material, confidential 
> information and/or be subject to legal privilege. It should not be 
> copied, disclosed to, retained or used by, any other party. If you are 
> not an intended recipient then please promptly delete this e-mail and 
> any attachment and all copies and inform the sender. Thank you.
>
>
