Oh, sorry — it's our bug; we found it. (CheckIndex is the tool for checking index correctness, right?) We verify our modification by generating random documents and randomly exercising the 3 main methods of SegmentTermDocs: next, read and skipTo. But the data was not "random" enough — some extreme positions were never exercised, so our modification corrupted the index without the tests catching it.
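For context, the kind of round-trip test meant above can be sketched in plain Java — simple delta coding of a sorted doc list, not Lucene's actual codec code; the class and method names are illustrative. Doc lists in the .frq file are stored as gaps from the previous docID, so a writer bug that emits a gap <= 0 decodes into a docID that is not strictly increasing:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch, not Lucene source: encode/decode a sorted docID
// list as deltas, then round-trip randomized data through it.
public class DeltaRoundTrip {

    // Encode a sorted docID list as gaps from the previous docID.
    static int[] encode(int[] docIDs) {
        int[] deltas = new int[docIDs.length];
        int prev = 0;
        for (int i = 0; i < docIDs.length; i++) {
            deltas[i] = docIDs[i] - prev;
            prev = docIDs[i];
        }
        return deltas;
    }

    // Decode gaps back into absolute docIDs.
    static int[] decode(int[] deltas) {
        int[] docIDs = new int[deltas.length];
        int prev = 0;
        for (int i = 0; i < deltas.length; i++) {
            prev += deltas[i];
            docIDs[i] = prev;
        }
        return docIDs;
    }

    public static void main(String[] args) {
        // Randomized round-trip test; deliberately force "extreme"
        // positions (gaps of exactly 1) that naive random data can miss.
        Random rnd = new Random(42);
        int[] docIDs = new int[1000];
        int doc = -1;
        for (int i = 0; i < docIDs.length; i++) {
            doc += 1 + rnd.nextInt(i % 100 == 0 ? 1 : 1000);
            docIDs[i] = doc;
        }
        if (!Arrays.equals(docIDs, decode(encode(docIDs)))) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("ok");
    }
}
```

The lesson from the bug report applies here too: a generator that only draws large random gaps never tests adjacent docIDs or other boundary cases, which is why the corruption slipped through.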
2011/1/15 Michael McCandless <luc...@mikemccandless.com>:
> Different ramBufferSizeMB during indexing should never cause corruption!
>
> Can you try setting the ram buffer to 256 MB in your test env and see
> if that makes the corruption go away?
>
> This could also be a hardware issue in your test env. If you run
> CheckIndex on the corrupt index does it always fail in the same way?
>
> Mike
>
> On Fri, Jan 14, 2011 at 6:43 AM, Li Li <fancye...@gmail.com> wrote:
>> hi all,
>>     We have run into this problem 3 times during testing.
>>     The exception stack is:
>>
>> Exception in thread "Lucene Merge Thread #2"
>> org.apache.lucene.index.MergePolicy$MergeException:
>> org.apache.lucene.index.CorruptIndexException: docs out of order (7286 <= 7286)
>>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
>> Caused by: org.apache.lucene.index.CorruptIndexException: docs out of order (7286 <= 7286)
>>         at org.apache.lucene.index.FormatPostingsDocsWriter.addDoc(FormatPostingsDocsWriter.java:75)
>>         at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:880)
>>         at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
>>         at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
>>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)
>>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4937)
>>
>> or:
>>
>> Exception in thread "Lucene Merge Thread #0"
>> org.apache.lucene.index.MergePolicy$MergeException:
>> java.lang.ArrayIndexOutOfBoundsException: 330
>>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 330
>>         at org.apache.lucene.util.BitVector.get(BitVector.java:102)
>>         at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:238)
>>         at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:168)
>>         at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
>>         at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:870)
>>         at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
>>         at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
>>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)
>>
>> We made some minor modifications based on Lucene 2.9.1 and Solr 1.4.0:
>> we changed the frq file to also store 4 bytes for the positions where the
>> term occurs in each document (accessing the full positions in the prx
>> file is too time-consuming for our needs). I can't tell whether it's our
>> bug or a bug in Lucene itself.
>> I searched the mailing list and found the thread "problem during index
>> merge" posted on 2010-10-21; it is similar to our case.
>> It seems the doc list in the frq file is stored incorrectly. When it is
>> decoded during a merge, either the wrong docID is larger than maxDoc (the
>> size of the deletedDocs BitVector), which causes the second exception, or
>> the docID delta read back is <= 0, which causes the first exception.
>> We are still testing with our modification turned off and with
>> infoStream enabled in solrconfig.xml.
>>
>> We found a strange phenomenon: during testing it sometimes hits these
>> exceptions, but in our production environment it never does, although the
>> hardware and software environments are the same. We checked carefully and
>> the only difference is this line in solrconfig.xml:
>> <ramBufferSizeMB>32</ramBufferSizeMB> in the testing environment,
>> <ramBufferSizeMB>256</ramBufferSizeMB> in the production environment.
>> The number of indexed documents per machine is also roughly the same,
>> 10M+ documents.
>> I can't be sure the indices in the production environment are correct:
>> even if some terms' doc lists are wrong, as long as every doc delta is > 0
>> and no documents are deleted, neither of the two exceptions will be hit.
>> We have not seen any strange search results in production, though.
>>
>> Can too small a ramBufferSizeMB cause index corruption?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
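For reference, the "docs out of order" check in the first trace boils down to an invariant during postings append: within one term, docIDs must be strictly increasing. A minimal self-contained sketch of that invariant (illustrative class and method names, not Lucene's actual FormatPostingsDocsWriter code):

```java
// Hypothetical sketch of the strictly-increasing docID invariant that a
// postings writer enforces while appending merged doc lists for a term.
// A duplicated or regressed docID (i.e. a delta <= 0 read from the .frq
// file) produces exactly the "docs out of order (7286 <= 7286)" failure
// mode reported in the thread.
public class PostingsAppender {
    private int lastDocID = -1;

    // Append the next docID for the current term; reject regressions.
    public void addDoc(int docID) {
        if (docID <= lastDocID) {
            throw new IllegalStateException(
                "docs out of order (" + docID + " <= " + lastDocID + ")");
        }
        lastDocID = docID;
    }

    public static void main(String[] args) {
        PostingsAppender a = new PostingsAppender();
        a.addDoc(7285);
        a.addDoc(7286);
        boolean threw = false;
        try {
            a.addDoc(7286); // duplicate docID, as in the reported trace
        } catch (IllegalStateException e) {
            threw = true;
        }
        System.out.println(threw ? "out-of-order detected" : "BUG");
    }
}
```

This also shows why the poster's production indices could be silently wrong: a corrupt doc list whose deltas all happen to stay > 0 passes this check, and without deleted documents the BitVector bounds check never fires either.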