I know that the size of a Lucene index can double while optimization is underway, but it's supposed to settle back down to roughly its original size afterward, correct? We have a Lucene index of about 100K documents that is normally about 12GB in size. It is split across 10 sub-indexes, which we search using MultiSearcher. It takes our system about 7 hours to traverse the file system and update the index, which typically adds, updates, or deletes anywhere from a dozen to a few hundred documents. We optimize each sub-index at the end (although this is configurable). The system seems to run fine for several days, with the total size of the index staying fairly consistent, then all of a sudden the index size doubles to about 25GB and stays there. I'm assuming this happens after an optimization; there is certainly not a doubling of the data being added.
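In case it helps, here is roughly what one pass over a sub-index does (a minimal sketch, not our actual code; it assumes the pre-4.0 Lucene API implied by MultiSearcher, and the class and method names here are illustrative):

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Illustrative sketch of one sub-index update pass, not our actual code.
public class SubIndexPass {
    public void updateAndOptimize(File subIndexDir, boolean optimizeAtEnd)
            throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(subIndexDir),
                new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);
        try {
            // ... addDocument / updateDocument / deleteDocuments for the
            // dozen-to-few-hundred files that changed since the last pass ...
            if (optimizeAtEnd) {      // configurable, as noted above
                writer.optimize();    // merge the sub-index down to minimal segments
            }
        } finally {
            writer.close();           // replaced segment files should be freed here
        }
    }
}
```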
Is this expected or known behavior, or a bug of some kind? I've read various postings on the 'net regarding optimization and when to do it, if at all, and I'm certainly open to other strategies; search time is critical for our users. FWIW, we have the following tunable parameters configured for our index (a sketch of how they're applied follows below):

- mergeFactor: 5
- maxMergeDocs: 1000
- maxBufferedDocs: 200
- RAMBufferSizeMB: 16

Any advice or help is appreciated.
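For concreteness, those tunables map onto the 2.x/3.x-era IndexWriter setters roughly as follows (a sketch under that assumption; in later Lucene versions these settings moved to IndexWriterConfig and LogMergePolicy):

```java
import org.apache.lucene.index.IndexWriter;

// Sketch only: applying the tunables listed above via the old
// IndexWriter setters.
public final class WriterTuning {
    static void applyTunables(IndexWriter writer) {
        writer.setMergeFactor(5);        // merge once 5 segments pile up at a level
        writer.setMaxMergeDocs(1000);    // segments over 1000 docs skip normal merges
        writer.setMaxBufferedDocs(200);  // flush after 200 buffered docs...
        writer.setRAMBufferSizeMB(16);   // ...or 16MB of buffered RAM, whichever first
    }
}
```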