Given a large segment and a bunch of small segments, how does the
ConcurrentMergeScheduler (CMS) work? Does it always merge the smaller
segments into the bigger one, or does it merge the smaller segments
together?
Something I've been thinking about: given a high-update environment
(with near-real-time search constraints, i.e. under 1 minute) and/or a
very bursty environment, we've always said to keep the merge factor
small for search reasons, at least in the high-update case. However,
I've seen a couple of cases where this causes problems because merges
can take over and cause pauses, even with CMS. So I am wondering if it
makes sense to use a larger merge factor (>10), knowing that I may
end up with a few large segments and a bunch of small ones, and that
CMS will, in the background, keep merging the smaller segments
together and in most cases avoid ever having to merge into the large
segments (because I can optimize down at slower times, or merge the
larger segments later). It seems like this would let one avoid larger
merges, or at least reduce the chances of them happening.
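To make the trade-off a bit more concrete, here is a toy sketch in
Java. It is not Lucene code and does not model CMS or concurrency at
all: the class ToyMergeSim, the Result type, and the simplified rule
("merge the last mergeFactor segments whenever they are all the same
size") are assumptions made purely for illustration. Every flush adds
one segment of size 1, and the sketch reports how many segments remain,
the largest merge that ran, and the total amount of data rewritten.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Lucene's merge policy or ConcurrentMergeScheduler.
class ToyMergeSim {

    /** Summary of one simulated run (hypothetical type for this sketch). */
    static final class Result {
        final int segmentCount;   // segments left in the index at the end
        final long largestMerge;  // size of the biggest single merge
        final long docsCopied;    // sum of all merge sizes (rewrite cost)

        Result(int segmentCount, long largestMerge, long docsCopied) {
            this.segmentCount = segmentCount;
            this.largestMerge = largestMerge;
            this.docsCopied = docsCopied;
        }
    }

    /** Simulates `flushes` unit-size flushes with the given merge factor. */
    static Result simulate(int flushes, int mergeFactor) {
        List<Long> segments = new ArrayList<>();
        long largest = 0;
        long copied = 0;
        for (int i = 0; i < flushes; i++) {
            segments.add(1L); // each flush creates one small segment
            // Cascade: keep merging while the newest mergeFactor
            // segments all have the same size.
            while (tailIsMergeable(segments, mergeFactor)) {
                long merged = 0;
                for (int j = 0; j < mergeFactor; j++) {
                    merged += segments.remove(segments.size() - 1);
                }
                segments.add(merged);
                largest = Math.max(largest, merged);
                copied += merged;
            }
        }
        return new Result(segments.size(), largest, copied);
    }

    /** True if the last mergeFactor segments exist and are equal in size. */
    private static boolean tailIsMergeable(List<Long> segments, int mergeFactor) {
        int n = segments.size();
        if (n < mergeFactor) {
            return false;
        }
        long size = segments.get(n - 1);
        for (int j = n - mergeFactor; j < n; j++) {
            if (segments.get(j) != size) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {10, 30}) {
            Result r = simulate(500, mergeFactor);
            System.out.println("mergeFactor=" + mergeFactor
                    + " segments=" + r.segmentCount
                    + " largestMerge=" + r.largestMerge
                    + " docsCopied=" + r.docsCopied);
        }
    }
}
```

In this toy model, 500 flushes with a merge factor of 10 end with 5
segments and a largest merge of 100 flush units, while a merge factor
of 30 ends with 36 segments and a largest merge of only 30. That is the
trade I am asking about: more small segments for searches to visit, in
exchange for fewer and smaller big merges.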
Not sure if I worded that correctly.
Thanks,
Grant