I also don't know whether merging segments whose sizes are an order of
magnitude apart has any negative performance implications.
It should be relatively easy to test different scenarios by
adjusting mergeFactor and maxBufferedDocs at the right times.
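
To get a feel for how those two knobs shape the segment layout before running
real benchmarks, here is a toy Python simulation of a classic logarithmic
merge policy. It tracks segment sizes only; it is not Lucene code, and the
parameter names merely mirror mergeFactor and maxBufferedDocs:

```python
def simulate_indexing(num_docs, max_buffered_docs, merge_factor):
    """Toy model: buffer max_buffered_docs docs per flushed segment,
    and merge whenever merge_factor segments of the same size exist.
    (Any remainder of docs smaller than one buffer is ignored.)"""
    segments = []  # each entry is a segment size, in docs
    for _ in range(num_docs // max_buffered_docs):
        segments.append(max_buffered_docs)  # flush one buffered segment
        while True:
            level = segments[-1]
            if sum(1 for s in segments if s == level) < merge_factor:
                break
            # merge the merge_factor equal-sized segments into one larger one
            segments = [s for s in segments if s != level]
            segments.append(level * merge_factor)
    return sorted(segments, reverse=True)
```

For example, simulate_indexing(70, 10, 2) yields segments of 40, 20, and 10
docs, while a larger mergeFactor leaves more small segments around between
merges; comparing layouts like these for different parameter choices is one
cheap way to pick scenarios worth benchmarking for the size-disparity
question above.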

I agree. In addition, it's not clear to me how much improvement this
will yield on a large data set.

Flushing ram segments separately allows an application to see new
documents sooner. Those documents don't have to be merged with other
on-disk documents before being available for search.

Flushing ram segments separately also has the following benefits as
Yonik put down before committing LUCENE-672 (flushing ram segments
separately is one part of that commit):

- flushing all ram segments separately from disk segments allows more
efficient implementations of combination reader/writers (like buffered
deletes) because docids won't change from the flush alone (a merge is
needed to change ids)
- flushing all buffered docs together leaves more optimization
possibilities open... something other than single-doc segments could
be used to buffer in-memory docs in the future.
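
The first bullet (docids stay stable across a flush, and only change on a
merge) can be illustrated with a toy model. This is an illustrative sketch,
not Lucene's actual data structures: docids here are just global ordinals
across segments, and deleted slots keep their position until a merge
squeezes them out:

```python
class ToyIndex:
    """Toy model of why buffered deletes can rely on flush not changing ids."""

    def __init__(self):
        self.segments = []  # each segment is a list of doc names (None = deleted)

    def flush(self, buffered_docs):
        # flushing only appends a new segment; existing docids are untouched
        self.segments.append(list(buffered_docs))

    def delete(self, name):
        # buffered-style delete: mark the slot but keep it, so ids stay stable
        for seg in self.segments:
            for i, d in enumerate(seg):
                if d == name:
                    seg[i] = None

    def docid(self, name):
        # global docid = ordinal position across segments (deleted slots count)
        i = 0
        for seg in self.segments:
            for d in seg:
                if d == name:
                    return i
                i += 1
        return -1

    def merge_all(self):
        # merging drops deleted slots, so this is where docids may change
        merged = [d for seg in self.segments for d in seg if d is not None]
        self.segments = [merged]
```

In this model, deleting a doc and then flushing more segments never moves any
existing docid; only merge_all() renumbers, which is the invariant that makes
combination reader/writers with buffered deletes simpler to implement.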
