Daniel Noll wrote:

I have filtered out the lines in the log that indicated an exception while adding a document; these occur when our Reader throws an IOException, and there were so many that they bloated the file.

OK, I think this is very likely the issue: when IndexWriter hits an exception while processing a document, the portion of the document already indexed is left in the index, and its docID is then marked for deletion. You can see these deletions in your infoStream:

  flush 0 buffered deleted terms and 30 deleted docIDs on 20 segments

This means your index contains documents deleted by docID, so when you optimize, those documents are dropped and the remaining docIDs are compacted (renumbered to close the gaps).
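
To illustrate the effect, here is a toy sketch in plain Java of how docIDs shift once deleted documents are dropped. It is not Lucene code and does not use any Lucene classes; the class name DocIdCompaction and the method compact are made up for this example, and it assumes a single segment whose docIDs run from 0 to maxDoc - 1.

```java
import java.util.Arrays;
import java.util.BitSet;

// Toy model only: NOT Lucene's implementation; all names here are hypothetical.
final class DocIdCompaction {

    /**
     * Maps each old docID to its new docID once deleted documents are removed.
     * Deleted documents map to -1; surviving documents keep their relative order.
     */
    static int[] compact(int maxDoc, BitSet deleted) {
        int[] oldToNew = new int[maxDoc];
        int next = 0; // next free docID in the compacted numbering
        for (int oldId = 0; oldId < maxDoc; oldId++) {
            oldToNew[oldId] = deleted.get(oldId) ? -1 : next++;
        }
        return oldToNew;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(2); // e.g. a document whose Reader threw an IOException
        deleted.set(5);
        // prints [0, 1, -1, 2, 3, -1, 4, 5]
        System.out.println(Arrays.toString(compact(8, deleted)));
    }
}
```

Every document after the first deleted one ends up with a smaller docID than before, which is why docIDs recorded before the optimize no longer line up afterwards.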

Mike
