Hi Patrick,

4.10.1 will fix this, so you can read your 3.x indices again. See
https://issues.apache.org/jira/browse/LUCENE-5975 for details...

Mike McCandless

http://blog.mikemccandless.com

On Tue, Sep 23, 2014 at 9:18 PM, Robert Muir <rcm...@gmail.com> wrote:
> As reported in the issue, since 4.8 we do better checks when reading
> this stuff in.
>
> Unfortunately, 3.0-3.3 indexes had bugs in the way they encode the
> deleted documents.
>
> So for those indexes, we have to ignore the trailing garbage at the
> end of the file.
>
> On Tue, Sep 23, 2014 at 9:15 PM, Patrick Mi <patrick...@touchpoint.co.nz>
> wrote:
>> Hi Robert/Uwe,
>>
>> I have tried v4.8 and v4.9; they do not work either.
>>
>> v4.7.0, v4.7.1 and v4.7.2 are good.
>>
>> Regards,
>> Patrick
>>
>> -----Original Message-----
>> From: Patrick Mi [mailto:patrick...@touchpoint.co.nz]
>> Sent: Wednesday, 24 September 2014 12:24 p.m.
>> To: 'java-user@lucene.apache.org'
>> Subject: RE: How to configure lucene 4.x to read 3.x index files
>>
>> Hi Robert/Uwe,
>>
>> Thanks very much for the quick response.
>>
>> I have tried again with a different index (28k documents), also
>> generated from V3, and that worked.
>>
>> But the one (30k documents) I tried originally does work with V3 but not
>> with V4.10. Maybe something in that index causes a problem in V4 but not
>> in V3.
>>
>> I have also tried an earlier version, v4.7, as Uwe suggested, and it
>> works on the V3 index that V4.10 failed to open.
>>
>> Regards,
>>
>> Patrick
>>
>> -----Original Message-----
>> From: Robert Muir [mailto:rcm...@gmail.com]
>> Sent: Tuesday, 23 September 2014 11:52 p.m.
>> To: java-user
>> Subject: Re: How to configure lucene 4.x to read 3.x index files
>>
>> You should not have to configure anything.
>>
>> The exception should not happen: can I have this index to debug the issue?
>>
>> On Mon, Sep 22, 2014 at 11:07 PM, Patrick Mi
>> <patrick...@touchpoint.co.nz> wrote:
>>> Hi there,
>>>
>>> I understand that Lucene V4 can read 3.x index files by configuring
>>> Lucene3xCodec, but what exactly needs to be done here?
>>>
>>> I used the demo code from V4.10.0 to generate v4 index files and could
>>> read them without problems. When I tried to read index files generated
>>> by V3, I got the following error:
>>>
>>> Exception in thread "main" org.apache.lucene.index.CorruptIndexException:
>>> did not read all bytes from file: read 65 vs size 66 (resource:
>>> BufferedChecksumIndexInput(MMapIndexInput(path="C:\indexes\v3\_1os1_5.del")))
>>>     at org.apache.lucene.codecs.CodecUtil.checkEOF(CodecUtil.java:252)
>>>     at org.apache.lucene.codecs.lucene40.BitVector.<init>(BitVector.java:363)
>>>     at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:91)
>>>     at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
>>>     at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
>>>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:913)
>>>     at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:53)
>>>     at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:67)
>>>     at org.apache.lucene.demo.SearchFiles.main(SearchFiles.java:95)
>>>
>>> My classpath includes the following jars from V4:
>>> lucene-core-4.10.0.jar
>>> lucene-analyzers-common-4.10.0.jar
>>> lucene-queries-4.10.0.jar
>>> lucene-queryparser-4.10.0.jar
>>> lucene-facet-4.10.0.jar
>>> lucene-expressions-4.10.0.jar
>>>
>>> I noticed that META-INF/services/org.apache.lucene.codecs.Codec (part of
>>> lucene-core-4.10.0.jar) contains the following lines:
>>> org.apache.lucene.codecs.lucene40.Lucene40Codec
>>> org.apache.lucene.codecs.lucene3x.Lucene3xCodec
>>> org.apache.lucene.codecs.lucene41.Lucene41Codec
>>> org.apache.lucene.codecs.lucene42.Lucene42Codec
>>> org.apache.lucene.codecs.lucene45.Lucene45Codec
>>> org.apache.lucene.codecs.lucene46.Lucene46Codec
>>> org.apache.lucene.codecs.lucene49.Lucene49Codec
>>> org.apache.lucene.codecs.lucene410.Lucene410Codec
>>>
>>> Does that mean Lucene3xCodec will be picked up automatically based on the
>>> index files themselves?
>>>
>>> Where is the API with which I could force the code to use the V3 setting?
>>> IndexReader and IndexSearcher don’t seem to have anywhere I can pass that
>>> in.
>>>
>>> I did some searching but couldn't find any useful resources covering
>>> that. I would much appreciate it if someone could point me in the right
>>> direction.
>>>
>>> Regards,
>>> Patrick
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
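
Robert's explanation in this thread (stricter end-of-file checks since 4.8, and the need to ignore trailing garbage for 3.0-3.3 indexes) can be illustrated with a small standalone sketch. This is not Lucene's code and does not show how 4.10.1 implements the fix; the class TrailingBytesCheck, the method readPayload, the exception TrailingBytesException and the ignoreTrailingBytes flag are all hypothetical names made up for illustration. Only the 65 and 66 byte counts come from the reported exception.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch (not Lucene's implementation): read a fixed-size
 * payload, then decide whether leftover bytes count as corruption.
 */
final class TrailingBytesCheck {

    /** Hypothetical exception raised when the strict check finds unread bytes. */
    static final class TrailingBytesException extends RuntimeException {
        TrailingBytesException(String message) {
            super(message);
        }
    }

    /**
     * Reads payloadLength bytes from the start of fileBytes.
     * With ignoreTrailingBytes == false, any unread bytes are an error
     * (the strict behaviour described for 4.8 and later).
     * With ignoreTrailingBytes == true, unread bytes are skipped
     * (the leniency Robert says old 3.0-3.3 files need).
     */
    static byte[] readPayload(byte[] fileBytes, int payloadLength, boolean ignoreTrailingBytes) {
        ByteBuffer in = ByteBuffer.wrap(fileBytes);
        byte[] payload = new byte[payloadLength];
        in.get(payload); // throws BufferUnderflowException if the file is too short
        if (in.hasRemaining() && !ignoreTrailingBytes) {
            throw new TrailingBytesException(
                "did not read all bytes from file: read " + in.position()
                    + " vs size " + in.limit());
        }
        return payload;
    }

    public static void main(String[] args) {
        // 65 payload bytes plus 1 trailing byte, mirroring "read 65 vs size 66".
        byte[] fileBytes = new byte[66];

        // Lenient mode: the trailing byte is ignored.
        System.out.println(readPayload(fileBytes, 65, true).length);

        // Strict mode: the same file is rejected.
        try {
            readPayload(fileBytes, 65, false);
        } catch (TrailingBytesException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions the same 66-byte file is accepted or rejected purely depending on the flag, which is consistent with Patrick's observation that v4.7 opens the index while v4.8 and later do not.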