Phew!  Thanks for bringing closure!

Yes, CheckIndex checks for index corruption.  Best to run it with
assertions enabled...
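For example, CheckIndex can be run standalone with JVM assertions enabled for the Lucene packages (the jar name and index path below are illustrative; adjust them to your setup):

```shell
# Run Lucene's CheckIndex with assertions enabled for org.apache.lucene
# and its subpackages (the trailing "..." is literal -ea syntax).
# Jar name and index path are illustrative; adjust for your installation.
java -ea:org.apache.lucene... \
     -cp lucene-core-2.9.1.jar \
     org.apache.lucene.index.CheckIndex /path/to/index
```

Note that adding the -fix option will drop any corrupt segments (losing their documents), so run it without -fix first to just diagnose.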

Unfortunately, random testing has limitations... I've seen similar
cases where stuff passes the random tests but then something only goes
wrong when building/searching a "real" index (10M Wikipedia docs)...

Biasing the random generation towards the "extrema" might sometimes help.
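As a minimal sketch of that idea (the function name and the particular "extreme" values are illustrative, not from any actual Lucene test), one could bias randomly generated docID sequences toward boundary values:

```python
import random

def biased_doc_ids(max_doc, n, extreme_prob=0.3):
    """Return n strictly increasing docIDs in [0, max_doc), biased
    toward boundary values (0, max_doc - 1, powers of two) that tend
    to expose off-by-one and block-boundary bugs in postings code."""
    extremes = [e for e in ({0, 1, max_doc - 2, max_doc - 1} |
                            {1 << k for k in range(max_doc.bit_length())})
                if 0 <= e < max_doc]
    picks = set()
    while len(picks) < n:
        if random.random() < extreme_prob:
            # Sometimes pick a boundary value instead of a uniform draw.
            picks.add(random.choice(extremes))
        else:
            picks.add(random.randrange(max_doc))
    return sorted(picks)
```

Powers of two show up in the extremes because buffer sizes and skip
intervals tend to be powers of two, so off-by-one errors often hide
exactly at those block boundaries.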

Mike

On Sun, Jan 16, 2011 at 6:48 AM, Li Li <fancye...@gmail.com> wrote:
> oh, sorry. It's our bug; we found it.
> CheckIndex is a tool to check the correctness of an index?
> We tested our modification by generating random documents and randomly
> exercising the 3 main methods of SegmentTermDocs: next, read and skipTo.
> But the data was not "random" enough, so some extreme positions were never
> tested and our modification corrupted the index.
>
> 2011/1/15 Michael McCandless <luc...@mikemccandless.com>:
>> Different ramBufferSizeMB during indexing should never cause corruption!
>>
>> Can you try setting the ram buffer to 256 MB in your test env and see
>> if that makes the corruption go away?
>>
>> This could also be a hardware issue in your test env.  If you run
>> CheckIndex on the corrupt index does it always fail in the same way?
>>
>> Mike
>>
>> On Fri, Jan 14, 2011 at 6:43 AM, Li Li <fancye...@gmail.com> wrote:
>>> hi all,
>>>   we have run into this problem 3 times while testing.
>>>   The exception stack is:
>>> Exception in thread "Lucene Merge Thread #2"
>>> org.apache.lucene.index.MergePolicy$MergeException:
>>> org.apache.lucene.index.CorruptIndexException: docs out of order (7286
>>> <= 7286 )
>>>        at 
>>> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
>>>        at 
>>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
>>> Caused by: org.apache.lucene.index.CorruptIndexException: docs out of
>>> order (7286 <= 7286 )
>>>        at 
>>> org.apache.lucene.index.FormatPostingsDocsWriter.addDoc(FormatPostingsDocsWriter.java:75)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:880)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
>>>        at 
>>> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)
>>>        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4937)
>>>
>>>    Or
>>> Exception in thread "Lucene Merge Thread #0"
>>> org.apache.lucene.index.MergePolicy$MergeException:
>>> java.lang.ArrayIndexOutOfBoundsException: 330
>>>        at 
>>> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
>>>        at 
>>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
>>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 330
>>>        at org.apache.lucene.util.BitVector.get(BitVector.java:102)
>>>        at 
>>> org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:238)
>>>        at 
>>> org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:168)
>>>        at 
>>> org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:870)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
>>>        at 
>>> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
>>>        at 
>>> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)
>>>
>>>
>>>   We made some minor modifications based on Lucene 2.9.1 and Solr
>>> 1.4.0: we changed the .frq file to store 4 bytes for the positions of the
>>> term's occurrences in each document (accessing full positions in the .prx
>>> file is too time-consuming to meet our needs). I can't tell whether it's
>>> our bug or Lucene's own bug.
>>>   I searched the mailing list and found the mail "problem during index
>>> merge" posted on 2010-10-21. It's similar to our case.
>>>   It seems the doc list in the .frq file is stored wrongly. When merging,
>>> as it's decoded, the wrong docID may be larger than maxDoc (of the
>>> BitVector deletedDocs), which causes the second exception; or the docID
>>> delta is less than 0 (read wrongly), which causes the first exception.
>>>   We are still testing, with our modification turned off and
>>> infoStream enabled in solrconfig.xml.
>>>
>>>   We found a strange phenomenon: when we test, we sometimes hit these
>>> exceptions, but in our production environment we never hit any.
>>>   The hardware and software environments are the same. We checked
>>> carefully, and the only difference is this line in solrconfig.xml:
>>>  <ramBufferSizeMB>32</ramBufferSizeMB>  in the testing environment
>>>  <ramBufferSizeMB>256</ramBufferSizeMB> in the production environment
>>>  The number of indexed documents per machine is also roughly the
>>> same: 10M+ documents.
>>>  I can't be sure the indices in the production env are correct: even if
>>> some terms' doc lists are stored wrongly, the two exceptions won't be hit
>>> as long as every doc delta is > 0 and there are no deleted documents.
>>>  The search results in the production env look fine; we haven't found
>>> any strange results.
>>>
>>>  Could a too-small ramBufferSizeMB result in index corruption?
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>
