Corrupt index: term out of order after forced stop during indexing
-------------------------------------------------------------------
Key: LUCENE-1037
URL: https://issues.apache.org/jira/browse/LUCENE-1037
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: 2.0.1
Environment: Windows Server 2003
Reporter: Chuck Williams
While testing a reboot during active indexing, this exception occurred upon
restart:
Caused by: java.io.IOException: term out of order ("ancestorForwarders:".compareTo("descendantMoneyAmounts:$0.351") <= 0)
at org.apache.lucene.index.TermInfosWriter.add(TermInfosWriter.java:96)
at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:322)
at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:289)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:253)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1398)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:835)
at ... (application code)
The "ancestorForwarders:" term has no text. The application never creates such
a term. It seems the reboot occurred while this term was being written, but
such a segment should not be linked into the index and so should not be visible
after restart.
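To make the failing check concrete, here is a minimal sketch of the ordering
invariant that the exception message describes. It assumes terms are ordered
by field name first and term text second, and that the writer rejects any term
that is not strictly greater than the previous one. SketchTerm,
SketchTermWriter and TermOrderDemo are hypothetical stand-ins written for this
illustration, not the actual Lucene classes.

```java
import java.io.IOException;

// Simplified stand-in for a Lucene term: ordered by field name, then by text.
final class SketchTerm implements Comparable<SketchTerm> {
    final String field;
    final String text;

    SketchTerm(String field, String text) {
        this.field = field;
        this.text = text;
    }

    @Override
    public int compareTo(SketchTerm other) {
        int byField = field.compareTo(other.field);
        return byField != 0 ? byField : text.compareTo(other.text);
    }

    @Override
    public String toString() {
        return field + ":" + text;
    }
}

// Hypothetical writer that enforces strictly increasing term order.
final class SketchTermWriter {
    private SketchTerm last = new SketchTerm("", "");

    void add(SketchTerm term) throws IOException {
        if (term.compareTo(last) <= 0) {
            throw new IOException("term out of order (\"" + term
                    + "\".compareTo(\"" + last + "\") <= 0)");
        }
        last = term;
    }
}

final class TermOrderDemo {
    public static void main(String[] args) {
        SketchTermWriter writer = new SketchTermWriter();
        try {
            writer.add(new SketchTerm("descendantMoneyAmounts", "$0.351"));
            // A term with an empty text in an alphabetically earlier field
            // sorts before the previous term, so the writer rejects it.
            writer.add(new SketchTerm("ancestorForwarders", ""));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a term with field "ancestorForwarders" and empty text
arriving after "descendantMoneyAmounts:$0.351" trips the check, which matches
the shape of the message in the stack trace.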
The application uses parallel subindexes accessed with ParallelReader. This
reboot caught the system in a state where the indexes were out of sync, i.e. a
new document had parts indexed in one subindex but not yet indexed in another.
The application detects this condition upon restart, uses
IndexReader.deleteDocument() to delete the parts that were indexed from those
subindexes, and then calls optimize() on all the subindexes to bring the
doc IDs back into sync. The optimize() failed, presumably on a subindex that
was being written at the time of the reboot. This subindex would not have
completed its document part and so no deleteDocument() would have been
performed on it prior to the optimize().
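The recovery procedure described above can be sketched with a toy model. This
is not the application's code and does not use the Lucene API: ToySubindex and
ToyResync are hypothetical classes that only mimic the effect of
deleteDocument() and optimize() on doc IDs. The sketch also assumes that
document parts are appended to every subindex in the same order, so any
incomplete documents are the trailing ones.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one subindex: a list of document parts plus deletion marks.
final class ToySubindex {
    private final List<String> parts = new ArrayList<>();
    private final List<Boolean> deleted = new ArrayList<>();

    void addPart(String part) {
        parts.add(part);
        deleted.add(false);
    }

    int maxDoc() {
        return parts.size();
    }

    // Marks a document as deleted; its doc ID stays occupied until optimize().
    void deleteDocument(int docId) {
        deleted.set(docId, true);
    }

    // Drops deleted documents and renumbers the survivors from 0.
    void optimize() {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < parts.size(); i++) {
            if (!deleted.get(i)) {
                kept.add(parts.get(i));
            }
        }
        parts.clear();
        parts.addAll(kept);
        deleted.clear();
        for (int i = 0; i < kept.size(); i++) {
            deleted.add(false);
        }
    }
}

final class ToyResync {
    // Deletes document parts that exist in some subindexes but not in all,
    // then optimizes every subindex so that doc IDs line up again.
    static void resync(List<ToySubindex> subindexes) {
        int complete = Integer.MAX_VALUE;
        for (ToySubindex s : subindexes) {
            complete = Math.min(complete, s.maxDoc());
        }
        for (ToySubindex s : subindexes) {
            for (int id = complete; id < s.maxDoc(); id++) {
                s.deleteDocument(id);
            }
        }
        for (ToySubindex s : subindexes) {
            s.optimize();
        }
    }
}
```

Note that in this model, as in the report, the subindex that lagged behind
receives no deletions but is still optimized along with the others.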
The version of Lucene here is from January 2007. I see one other reference to
this exception in LUCENE-848. There is a note there that the exception is
likely a core problem, but I don't see any follow-up to track it down.
Any ideas how this could happen?