Re: ConcurrentMergeScheduler and MergePolicy question

Mark Miller Thu, 30 Jul 2009 07:55:48 -0700

bq. we've always said to keep the merge factor small for search reasons, at
least in the high-update case.
I think we have been wrong. Searching a bunch of segments is about the
same speed as searching an optimized index, I think. I'd always read that
too, but Mike once said it didn't make sense, and some simple testing
seemed to bear him out. Which means you probably want a little tail of
segments usually. Reopen likes segments (as opposed to an optimized index),
and you won't be merging into the largest segments as often. And search
speed shouldn't suffer. What did suffer was opening a FieldCache on a
multi-segment index - that was a major speed trap - but now, with
per-segment searching, it's not really an issue anymore. I think Yonik may
have alleviated that issue as well with a patch?
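
To make the "you won't be merging into the largest segments as often" point
concrete, here is a toy simulation. It is not Lucene code and not how
LogMergePolicy is actually implemented: the class MergeFactorSketch, its
Result type, and the rule "merge the newest mergeFactor segments once they
are all the same size" are assumptions made up for this sketch. Every flush
is assumed to write one segment of size 1.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of level-based segment merging; NOT Lucene's actual merge policy. */
final class MergeFactorSketch {

    /** Outcome of one simulated indexing run (hypothetical type for this sketch). */
    static final class Result {
        final List<Long> segments; // segment sizes at the end, oldest first
        final long docsCopied;     // total size rewritten by all merges
        final long largestMerge;   // size of the biggest merged segment produced

        Result(List<Long> segments, long docsCopied, long largestMerge) {
            this.segments = segments;
            this.docsCopied = docsCopied;
            this.largestMerge = largestMerge;
        }
    }

    /** Simulates {@code flushes} flushes of one unit-sized segment each. */
    static Result simulate(int mergeFactor, int flushes) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<Long> segments = new ArrayList<>();
        long docsCopied = 0;
        long largestMerge = 0;
        for (int i = 0; i < flushes; i++) {
            segments.add(1L); // assumed: every flush produces a segment of size 1
            // Cascade: while the newest mergeFactor segments share one size, merge them.
            while (segments.size() >= mergeFactor) {
                int from = segments.size() - mergeFactor;
                long size = segments.get(from);
                boolean sameSize = true;
                for (int j = from; j < segments.size(); j++) {
                    if (segments.get(j) != size) {
                        sameSize = false;
                        break;
                    }
                }
                if (!sameSize) {
                    break;
                }
                long merged = size * mergeFactor;
                segments.subList(from, segments.size()).clear();
                segments.add(merged);
                docsCopied += merged; // a merge rewrites everything it merges
                largestMerge = Math.max(largestMerge, merged);
            }
        }
        return new Result(segments, docsCopied, largestMerge);
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {2, 10}) {
            Result r = simulate(mergeFactor, 99);
            System.out.println("mergeFactor=" + mergeFactor
                    + " segments=" + r.segments
                    + " docsCopied=" + r.docsCopied
                    + " largestMerge=" + r.largestMerge);
        }
    }
}
```

In this model the larger merge factor leaves a longer tail of small segments
but rewrites far less data and reaches the big segments less often. Real
merge policies bucket segments by size range rather than exact size, and
with CMS the merges run in background threads, so treat the numbers as
illustrative only.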


-- 
- Mark

http://www.lucidimagination.com

On Thu, Jul 30, 2009 at 2:48 PM, Grant Ingersoll <gsing...@apache.org> wrote:

> Note also response from Mike that talks a little bit about something along
> these lines:
> http://www.lucidimagination.com/search/document/fa990adba4d2572b/is_there_a_way_to_control_when_merges_happen#f6f0bfeef4bf9a39
>
> -Grant
>
>
> On Jul 30, 2009, at 10:35 AM, Grant Ingersoll wrote:
>
>> Given a large segment and a bunch of small segments, how does the
>> ConcurrentMergeScheduler (CMS) work?  Does it always merge the smaller
>> segments into the bigger one, or does it merge the smaller segments
>> together?
>>
>> Something I've been thinking about:  Given a high update environment (and
>> near real time, less than 1 minute, search constraints) and/or a very bursty
>> environment, we've always said to keep the merge factor small for search
>> reasons, at least in the high-update case.  However, I've seen a couple of
>> times where this causes problems because merges can take over and cause
>> pauses, even with CMS, so I am wondering if it makes sense to have a larger
>> merge factor (>10), knowing that I may have a few large segments and then a
>> bunch of small ones and that the CMS will, in the background, be able to
>> keep merging the smaller segments together and in most cases avoid ever
>> having to merge into the large segments (b/c maybe I can just optimize down
>> at slower times or even merge larger segments later).  Seems like this
>> would allow one to make sure larger merges need not take place, or at least
>> reduce the chances of that happening.
>>
>> Not sure if I worded that correctly.
>>
>> Thanks,
>> Grant
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
