Mahout 0.4 uses an older version of Lucene, and so it won't be able to read an index created by Lucene 3.1.0. Try using trunk, which uses 3.1.
-Grant

On May 24, 2011, at 3:20 AM, hailong.yang1115 wrote:

> Dear all,
>
> I am using mahout to convert the lucene index into the vectors needed by
> clustering algorithm. However, I got the following errors:
>
> [hailong@node125 benchmark]$ mahout lucene.vector --dir index/ --field body
> --dictOut ./dict.txt --output ./out.txt
> Running on hadoop, using HADOOP_HOME=/home/hailong/hadoop-0.20.2
> No HADOOP_CONF_DIR set, using /home/hailong/hadoop-0.20.2/conf
> Exception in thread "main" org.apache.lucene.index.CorruptIndexException:
> Unknown format version: -11
>     at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:249)
>     at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:73)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:677)
>     at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
>     at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>     at org.apache.lucene.index.IndexReader.open(IndexReader.java:202)
>     at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:157)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> The mahout version is 0.4 and the lucene version is 3.1.0. Any help will be
> appreciated.
>
>
> Hailong
>
> 2011-05-24
>
>
>
> ***********************************************
> * Hailong Yang, PhD. Candidate
> * Sino-German Joint Software Institute,
> * School of Computer Science&Engineering, Beihang University
> * Phone: (86-010)82315908
> * Email: [email protected]
> * Address: G413, New Main Building in Beihang University,
> * No.37 XueYuan Road,HaiDian District,
> * Beijing,P.R.China,100191
> ***********************************************

--------------------------
Grant Ingersoll
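
For anyone who wants to confirm which format an index was written in before pointing Mahout at it, here is a minimal sketch. It rests on an assumption that is not stated in this thread: that a pre-4.0 Lucene index stores its format version as a 4-byte big-endian signed integer at the very start of its segments_N file. The helper name read_segments_format and the example path are hypothetical. If the assumption holds, the value printed should match the number in the "Unknown format version" message (-11 in the trace above).

```python
import struct
import sys


def read_segments_format(path):
    """Return the first four bytes of `path` as a signed big-endian 32-bit int.

    Assumption: for a pre-4.0 Lucene index, those bytes of the segments_N
    file hold the index format version. This is a hypothetical helper for
    illustration, not part of Mahout or Lucene.
    """
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        raise ValueError("file too short to hold a format header: %s" % path)
    # ">i" = big-endian, signed, 4 bytes
    return struct.unpack(">i", header)[0]


if __name__ == "__main__":
    # Hypothetical usage: python read_format.py index/segments_1
    if len(sys.argv) == 2:
        print(read_segments_format(sys.argv[1]))
```

If the number printed is one that the Lucene jar bundled with your Mahout release rejects, the index was most likely written by a newer Lucene, and the fix is the one given above: use a Mahout build whose Lucene version matches the index.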
