Steve, Mike,

Thanks for the explanation! I meant cascading but wrote optimizing. So
it still cascades merges.

> It would merge based on size (not # docs), would be free to merge
> adjacent segments (not just rightmost segments), and would merge N
> (configurable) at a time.  The part that's still unclear is how it
> chooses when to "trigger" a merge and how specifically it picks which
> N segments to merge (maybe: the series of N adjacent segments that are
> "most similar" in size, but favoring smaller segments over larger
> ones).
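
To make the "most similar in size, favoring smaller" idea concrete, here is
a minimal sketch of one way such a selection could work. It is not Lucene
code: the class AdjacentMergePicker, the method pickWindow, the maxSkew
trigger threshold, and the skew-times-total score are all assumptions made
up for illustration.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of Lucene: picks which n adjacent segments to merge.
final class AdjacentMergePicker {

    /**
     * @param sizes   segment sizes in bytes, in index order
     * @param n       number of adjacent segments to merge at a time
     * @param maxSkew assumed trigger: a window qualifies only if largest/smallest <= maxSkew
     * @return start index of the chosen window, or empty if no merge should be triggered
     */
    static OptionalInt pickWindow(long[] sizes, int n, double maxSkew) {
        if (n < 2 || sizes.length < n) {
            return OptionalInt.empty();
        }
        int best = -1;
        double bestScore = Double.MAX_VALUE;
        for (int start = 0; start + n <= sizes.length; start++) {
            long min = Long.MAX_VALUE;
            long max = 0;
            long total = 0;
            for (int i = start; i < start + n; i++) {
                min = Math.min(min, sizes[i]);
                max = Math.max(max, sizes[i]);
                total += sizes[i];
            }
            // Skew of 1.0 means all segments in the window have the same size.
            double skew = (double) max / Math.max(1, min);
            if (skew > maxSkew) {
                continue; // too dissimilar: this window never triggers a merge
            }
            // Lower is better: similar sizes and a small total both reduce the score.
            double score = skew * total;
            if (score < bestScore) {
                bestScore = score;
                best = start;
            }
        }
        return best < 0 ? OptionalInt.empty() : OptionalInt.of(best);
    }
}
```

With sizes {1000, 900, 10, 12, 11, 500}, n = 3 and maxSkew = 4.0, only the
window starting at index 2 qualifies. A small segment between two large
ones, such as {100, 1, 100} with n = 2, never qualifies under this rule,
which shows why a pure similarity test is not enough on its own.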

Those are two very good questions. It's a challenge to make it work in
all cases. One example is the sandwich case, where two large segments
sandwich a small one. I'll think about it... It would be even better if
we could take deletes into consideration: it's more beneficial to merge
a segment with more deletes. Right now, we have to open an IndexReader
to get the number of deletes. We could store that in the segments file
if we decide IndexWriter/MergePolicy needs it...
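
As a rough illustration of the deletes idea, here is a sketch of per-segment
metadata that a merge policy could read without opening an IndexReader. The
class SegmentStats and its fields are hypothetical, not an existing Lucene
structure, and scaling the byte size by the live-document fraction is just
one assumed way to favor segments with more deletes.

```java
// Hypothetical per-segment record, assuming the delete count were stored
// alongside the other segment metadata.
final class SegmentStats {
    final long sizeBytes;
    final int docCount;
    final int delCount;

    SegmentStats(long sizeBytes, int docCount, int delCount) {
        if (sizeBytes < 0 || docCount < 0 || delCount < 0 || delCount > docCount) {
            throw new IllegalArgumentException("inconsistent segment stats");
        }
        this.sizeBytes = sizeBytes;
        this.docCount = docCount;
        this.delCount = delCount;
    }

    /** Fraction of documents marked deleted, in [0, 1]. */
    double deletedFraction() {
        return docCount == 0 ? 0.0 : (double) delCount / docCount;
    }

    /**
     * Approximate size after deletes are reclaimed. Feeding this into a
     * size-based policy makes a heavily deleted segment look smaller, so it
     * becomes a more attractive merge candidate.
     */
    long liveSizeBytes() {
        return Math.round(sizeBytes * (1.0 - deletedFraction()));
    }
}
```

For example, a 1000-byte segment with 100 documents, 25 of them deleted,
would be treated as 750 bytes under this assumption.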

Ning
