[ 
https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384581#comment-15384581
 ] 

Suneel Marthi commented on MAHOUT-1876:
---------------------------------------

To add more context to the previous post: we tried moving to Lucene 4.10.x back 
in March 2015, but that completely broke the vectorization code in the legacy 
MapReduce module, and it failed all tests. 

If you are willing to take a stab at this and upgrade to Lucene 6.1.x, please 
reach out on dev@ and we can talk there.

> Mahout fails to read from lucene index of solr-6.1.0
> ----------------------------------------------------
>
>                 Key: MAHOUT-1876
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1876
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.12.2
>         Environment: Solr: 6.1.0
> JDK: 1.8.0_92
> Mahout: 0.12.2
> OS: Linux
>            Reporter: Raviteja Lokineni
>
> Command: {noformat}bin/mahout lucene.vector --dir 
> ~/softwares/solr-6.1.0/server/solr/nlp-core/data/index --output 
> /tmp/solr-nlp-core/out.vec --field rspns_val --dictOut 
> /tmp/solr-nlp-core/dictionary.txt --norm 2{noformat}
> Stacktrace:
> {noformat}
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running 
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-examples-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-mr-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/lib/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Exception in thread "main" 
> org.apache.lucene.index.IndexFormatTooNewException: Format version is not 
> supported (resource: 
> ChecksumIndexInput(MMapIndexInput(path="/home/lok268/softwares/solr-6.1.0/server/solr/nlp-core/data/index/segments_2"))):
>  6 (needs to be between 0 and 1)
>         at 
> org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:148)
>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:329)
>         at 
> org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
>         at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
>         at 
> org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
>         at 
> org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
>         at 
> org.apache.mahout.utils.vectors.lucene.Driver.dumpVectors(Driver.java:89)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:277)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>         at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
