Hi all,
   We have hit this problem three times during testing.
   The exception stack is one of the following:
Exception in thread "Lucene Merge Thread #2" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: docs out of order (7286 <= 7286 )
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
Caused by: org.apache.lucene.index.CorruptIndexException: docs out of order (7286 <= 7286 )
        at org.apache.lucene.index.FormatPostingsDocsWriter.addDoc(FormatPostingsDocsWriter.java:75)
        at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:880)
        at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
        at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4937)

    Or
Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArrayIndexOutOfBoundsException: 330
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 330
        at org.apache.lucene.util.BitVector.get(BitVector.java:102)
        at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:238)
        at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:168)
        at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
        at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:870)
        at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
        at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)


   We made some minor modifications on top of Lucene 2.9.1 and Solr
1.4.0: we changed the frq file to store 4 bytes of position information
for the term's occurrences in each document (reading the full positions
from prx is too slow for our needs). I can't tell whether this is our
bug or a bug in Lucene itself.
   I searched the mailing list and found the thread "problem during
index merge", posted on 2010-10-21. It looks similar to our case.
   It seems the doc list in the frq file is stored incorrectly. When it
is decoded during a merge, the wrong doc ID may be larger than maxDocs
(the size of the deletedDocs BitVector), which causes the second
exception. Or the doc ID delta is read wrongly as zero or negative,
which causes the first exception (the trace shows 7286 following 7286).
   We are continuing to test with our modification turned off and with
infoStream enabled in solr-config.xml.

   We found something strange. In testing we sometimes hit these
exceptions, but in our production environment we have never hit any.
   The hardware and software environments are the same. We checked
carefully, and the only difference we found is this line in
solr-config.xml:
  <ramBufferSizeMB>32</ramBufferSizeMB>   in the testing environment
  <ramBufferSizeMB>256</ramBufferSizeMB>  in the production environment
  The number of indexed documents per machine is also roughly the
same: 10M+ documents.
  I can't be sure the indexes in production are correct. Even if some
terms' doc lists are wrong, neither of the two exceptions is hit as
long as the doc delta is > 0 and the segment has no deleted documents.
  So far we have not found any strange search results in production.
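
  To make the reasoning above concrete, here is a toy Java model of it.
It is not Lucene's code: the class DocDeltaSketch and the method decode
are made up for illustration, and the boolean[] only stands in for the
deletedDocs BitVector. It assumes doc IDs are stored as deltas from the
previous doc ID, and it applies the same two checks that appear in our
stack traces.

```java
import java.util.Arrays;

/** Toy model of a delta-coded doc list; hypothetical, not Lucene's classes. */
final class DocDeltaSketch {

    /**
     * Rebuilds absolute doc IDs from deltas, the way a merge would read them.
     * deleted == null means the segment has no deletions, so no lookup is done.
     */
    static int[] decode(int[] deltas, boolean[] deleted) {
        int[] docs = new int[deltas.length];
        int count = 0;
        int doc = 0;
        for (int delta : deltas) {
            doc += delta;
            // A doc ID beyond the deletions array fails here, but only when
            // the segment actually has deletions (second exception).
            if (deleted != null && deleted[doc]) {
                continue; // deleted documents are dropped by the merge
            }
            // A delta that is zero or negative is caught when the doc is
            // written to the merged segment (first exception).
            if (count > 0 && doc <= docs[count - 1]) {
                throw new IllegalStateException(
                    "docs out of order (" + doc + " <= " + docs[count - 1] + " )");
            }
            docs[count++] = doc;
        }
        return Arrays.copyOf(docs, count);
    }

    public static void main(String[] args) {
        // Healthy list: strictly increasing doc IDs.
        System.out.println(Arrays.toString(decode(new int[] {3, 2, 5}, null)));

        // Wrong delta of 0: the same doc ID twice.
        try {
            decode(new int[] {7286, 0}, null);
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }

        // Wrong delta that jumps past a 100-doc deletions array.
        try {
            decode(new int[] {3, 327}, new boolean[100]);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e);
        }

        // Same wrong delta, but no deletions: nothing complains.
        System.out.println(Arrays.toString(decode(new int[] {3, 327}, null)));
    }
}
```

  The last call is the case that worries us for production: the doc ID
is out of range, but because the delta is positive and there is no
deletions array to index into, the bad value passes through silently.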

  Can a ramBufferSizeMB that is too small result in index corruption?
