Daniel Noll wrote:
I have filtered out lines in the log which indicated an exception adding the document; these occur when our Reader throws an IOException, and there were so many of them that they bloated the file.
OK, I think this is very likely the issue: when IndexWriter hits an exception while processing a document, the portion of the document that was already indexed is left in the index, and its docID is then marked for deletion. You can see these deletions in your infoStream:
flush 0 buffered deleted terms and 30 deleted docIDs on 20 segments
This means your index contains deletions, by docID, so when you optimize, the remaining docIDs are compacted (shifted down to fill the gaps left by the deleted ones).
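Here's a minimal sketch of how you could reproduce and watch this happen, assuming a Lucene 2.x-era API (the no-arg StandardAnalyzer constructor, IndexWriter.setInfoStream and optimize() have changed or gone away in later versions); FailingReader is a hypothetical Reader that fails mid-stream, standing in for your own Reader:

import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;

public class PartialDocDeletion {

  // Hypothetical Reader that throws partway through, like the one described above.
  static class FailingReader extends Reader {
    public int read(char[] buf, int off, int len) throws IOException {
      throw new IOException("simulated failure while reading document text");
    }
    public void close() {}
  }

  public static void main(String[] args) throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true,
                                         IndexWriter.MaxFieldLength.UNLIMITED);
    writer.setInfoStream(System.out);  // prints the "flush ... deleted docIDs" lines

    Document doc = new Document();
    doc.add(new Field("body", new FailingReader()));  // tokenized from a Reader

    try {
      writer.addDocument(doc);
    } catch (IOException e) {
      // Whatever was indexed before the exception stays in the index,
      // and the docID consumed for this document is marked deleted.
    }

    writer.commit();
    writer.optimize();  // merges segments; deleted docIDs are dropped and the rest compacted
    writer.close();
  }
}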
Mike