I also don't know if there are any negative performance implications of merging segments with sizes an order of magnitude apart. It should be relatively easy to test different scenarios by manipulating mergeFactor and maxBufferedDocs at the right time.
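If it helps, here is a rough sketch of one way such a test could be set up (this assumes the Lucene 2.x IndexWriter API; the document counts are made up for illustration, and RAMDirectory is only used to keep the sketch self-contained -- a real timing test would use an FSDirectory): build some larger flushed segments first, then drop maxBufferedDocs so the newly flushed segments are an order of magnitude or more smaller than what is already on disk, and time/inspect the merges that result.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class MergeSizeSkewTest {

    public static void main(String[] args) throws Exception {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);

        // Phase 1: larger flushed segments (counts chosen only for illustration).
        writer.setMergeFactor(10);
        writer.setMaxBufferedDocs(1000);
        addDocs(writer, 10000);

        // Phase 2: much smaller flushes, so new segments are an order of
        // magnitude (or more) smaller than the segments already on disk.
        writer.setMaxBufferedDocs(10);
        addDocs(writer, 100);

        // Closing triggers a final flush; time or inspect the merges here.
        writer.close();
    }

    private static void addDocs(IndexWriter writer, int count) throws Exception {
        for (int i = 0; i < count; i++) {
            Document doc = new Document();
            doc.add(new Field("body", "text " + i, Field.Store.NO, Field.Index.TOKENIZED));
            writer.addDocument(doc);
        }
    }
}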
I agree. In addition, it's not clear to me how much improvement this will make for a large data set. Flushing RAM segments separately allows an application to see new documents sooner: those documents don't have to be merged with the documents already on disk before becoming available for search. Flushing RAM segments separately also has the benefits Yonik noted before committing LUCENE-672 (flushing RAM segments separately is one part of that commit):

- Flushing all RAM segments separately from disk segments allows more efficient implementations of combination reader/writers (like buffered deletes), because docids won't change from the flush alone (a merge is needed to change ids).

- Flushing all buffered docs together leaves more optimization possibilities: something other than single-doc segments could be used to buffer in-memory docs in the future.
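To make the "see new documents sooner" point concrete, here is a minimal sketch (again assuming the Lucene 2.x API, and using close() as the portable way to force the flush in that API): a buffered doc becomes searchable as soon as it is flushed as its own segment, without first being merged into the existing on-disk segments.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class FlushVisibilityDemo {

    public static void main(String[] args) throws Exception {
        Directory dir = new RAMDirectory();

        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.add(new Field("body", "hello world", Field.Store.NO, Field.Index.TOKENIZED));
        writer.addDocument(doc);

        // The buffered (RAM) doc is written out as its own segment here; it
        // does not have to be merged into larger on-disk segments before a
        // reader can see it.
        writer.close();

        IndexReader reader = IndexReader.open(dir);
        System.out.println("numDocs visible after flush: " + reader.numDocs());
        reader.close();
    }
}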