Corrupt index: term out of order after forced stop during indexing
------------------------------------------------------------------

                 Key: LUCENE-1037
                 URL: https://issues.apache.org/jira/browse/LUCENE-1037
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 2.0.1
         Environment: Windows Server 2003
            Reporter: Chuck Williams


While testing a reboot during active indexing, the following exception 
occurred upon restart:

Caused by: java.io.IOException: term out of order ("ancestorForwarders:".compareTo("descendantMoneyAmounts:$0.351") <= 0)
        at org.apache.lucene.index.TermInfosWriter.add(TermInfosWriter.java:96)
        at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:322)
        at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:289)
        at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:253)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1398)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:835)
        at ...   (application code)

The "ancestorForwarders:" term has no text.  The application never creates such 
a term.  It seems the reboot occurred while this term was being written, but a 
segment in that state should not have been linked into the index and so should 
not be visible after restart.
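
For readers unfamiliar with the check that fails here, the following is a 
minimal sketch of a "strictly increasing terms" test.  It assumes terms are 
ordered by field name first and then by term text, and are printed as 
"field:text", which is how the message in the trace above reads.  SketchTerm 
and TermOrderSketch are hypothetical stand-ins written for illustration; they 
are not Lucene's Term or TermInfosWriter classes.

```java
import java.io.IOException;

// Simplified stand-in for a (field, text) term; NOT Lucene's Term class.
final class SketchTerm implements Comparable<SketchTerm> {
    final String field;
    final String text;

    SketchTerm(String field, String text) {
        this.field = field;
        this.text = text;
    }

    // Assumed ordering: by field name first, then by term text.
    @Override
    public int compareTo(SketchTerm other) {
        int byField = field.compareTo(other.field);
        return byField != 0 ? byField : text.compareTo(other.text);
    }

    @Override
    public String toString() {
        return field + ":" + text;
    }
}

// Hypothetical writer that accepts terms only in strictly increasing order.
final class TermOrderSketch {
    private SketchTerm last = new SketchTerm("", "");

    void add(SketchTerm term) throws IOException {
        // A term that does not sort after the previous one is rejected.
        if (term.compareTo(last) <= 0) {
            throw new IOException("term out of order (\"" + term
                    + "\".compareTo(\"" + last + "\") <= 0)");
        }
        last = term;
    }

    public static void main(String[] args) {
        TermOrderSketch writer = new TermOrderSketch();
        try {
            writer.add(new SketchTerm("descendantMoneyAmounts", "$0.351"));
            // An empty-text term from an earlier-sorting field arrives late.
            writer.add(new SketchTerm("ancestorForwarders", ""));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a term with field "ancestorForwarders" and empty text 
sorts before every "descendantMoneyAmounts" term, so seeing it after one of 
them during a merge trips the check.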

The application uses parallel subindexes accessed with ParallelReader.  This 
reboot caught the system in a state where the indexes were out of sync, i.e. a 
new document had parts indexed in one subindex but not yet indexed in another.  
The application detects this condition upon restart, uses 
IndexReader.deleteDocument() to delete the parts that were indexed from those 
subindexes, and then calls optimize() on all the subindexes to bring the 
docids back into sync.  The optimize() failed, presumably on a subindex that 
was being written at the time of the reboot.  This subindex would not have 
completed its document part and so no deleteDocument() would have been 
performed on it prior to the optimize().
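
The restart check described above can be sketched roughly as follows.  This is 
only an illustration under stated assumptions: it models each subindex by its 
document count alone, assumes documents are appended to every subindex in the 
same order, and treats any docid beyond the shortest subindex as a partially 
indexed document.  The class name, method name and subindex names are 
hypothetical, and no Lucene calls are made; in the real application the 
returned docids would be passed to IndexReader.deleteDocument() before calling 
optimize().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the out-of-sync check; not the application's code.
final class SubindexSyncSketch {

    // For each subindex, list the docids that lie past the end of the
    // shortest subindex. These are the document parts to delete.
    static Map<String, List<Integer>> orphanedDocIds(Map<String, Integer> docCounts) {
        int common = Integer.MAX_VALUE;
        for (int count : docCounts.values()) {
            common = Math.min(common, count);
        }
        Map<String, List<Integer>> orphans = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : docCounts.entrySet()) {
            List<Integer> ids = new ArrayList<>();
            for (int id = common; id < entry.getValue(); id++) {
                ids.add(id);
            }
            orphans.put(entry.getKey(), ids);
        }
        return orphans;
    }

    public static void main(String[] args) {
        // Hypothetical subindex names and counts: the third subindex had not
        // yet received its part of the newest document when the reboot hit.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("subindexA", 101);
        counts.put("subindexB", 101);
        counts.put("subindexC", 100);
        System.out.println(orphanedDocIds(counts));
    }
}
```

In this model the subindex that was mid-write reports no docids to delete, 
which matches the observation that no deleteDocument() call would have been 
made on it before the optimize().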

The version of Lucene here is from January 2007.  I see one other reference to 
this exception in LUCENE-848.  There is a note there that the exception is 
likely a core problem, but I don't see any follow-up to track it down.

Any ideas how this could happen?

