[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384581#comment-15384581 ]
Suneel Marthi commented on MAHOUT-1876:
---------------------------------------
To add more context to the previous post: we tried moving to Lucene 4.10.x back
in March 2015, but that completely broke the vectorization code in the legacy
MapReduce module, and all tests were failing.
If you are willing to take a stab at this and upgrade to Lucene 6.1.x, please
reach out on dev@ and we can talk there.
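For anyone triaging similar reports, it can help to confirm which format version a segments_N file was written with before pointing Mahout at it. The following is only a rough, stdlib-only sketch and not part of Mahout or Lucene: the class and method names are hypothetical, and the header layout (big-endian magic int, a short length-prefixed codec name, then a version int) is an assumption based on how Lucene's CodecUtil writes headers in the 4.x to 6.x line.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: peeks at the header of a Lucene segments_N file
// without needing any Lucene jar on the classpath.
final class SegmentsHeaderPeek {

    // Magic number that starts a codec header (assumed: CodecUtil.CODEC_MAGIC).
    static final int CODEC_MAGIC = 0x3fd76c17;

    // Returns the format version stored in the header.
    // Assumed layout: int magic, vInt-prefixed UTF-8 codec name, int version.
    static int readFormatVersion(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw); // reads big-endian ints
        int magic = in.readInt();
        if (magic != CODEC_MAGIC) {
            throw new IOException("No codec header found (pre-4.0 index or not a segments file)");
        }
        // The name length is a vInt; for a short name like "segments" it is one byte.
        int nameLength = in.readUnsignedByte();
        if (nameLength >= 0x80) {
            throw new IOException("Unexpected multi-byte codec name length");
        }
        byte[] name = new byte[nameLength];
        in.readFully(name);
        String codec = new String(name, StandardCharsets.UTF_8);
        if (!"segments".equals(codec)) {
            throw new IOException("Unexpected codec name: " + codec);
        }
        return in.readInt();
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a segments_N file, e.g. .../data/index/segments_2
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("segments format version: " + readFormatVersion(in));
        }
    }
}
```

If the printed value is above the range the bundled Lucene accepts (the trace below reports 6 against an accepted range of 0 to 1), the index was written by a newer Lucene than the one Mahout ships with.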
> Mahout fails to read from lucene index of solr-6.1.0
> ----------------------------------------------------
>
> Key: MAHOUT-1876
> URL: https://issues.apache.org/jira/browse/MAHOUT-1876
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.12.2
> Environment: Solr: 6.1.0
> JDK: 1.8.0_92
> Mahout: 0.12.2
> OS: Linux
> Reporter: Raviteja Lokineni
>
> Command: {noformat}bin/mahout lucene.vector --dir
> ~/softwares/solr-6.1.0/server/solr/nlp-core/data/index --output
> /tmp/solr-nlp-core/out.vec --field rspns_val --dictOut
> /tmp/solr-nlp-core/dictionary.txt --norm 2{noformat}
> Stacktrace:
> {noformat}
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-examples-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-mr-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/lib/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Exception in thread "main" org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource: ChecksumIndexInput(MMapIndexInput(path="/home/lok268/softwares/solr-6.1.0/server/solr/nlp-core/data/index/segments_2"))): 6 (needs to be between 0 and 1)
> at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:148)
> at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:329)
> at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
> at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
> at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
> at org.apache.mahout.utils.vectors.lucene.Driver.dumpVectors(Driver.java:89)
> at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:277)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> {noformat}