On Sep 6, 2006, at 4:23 PM, Ning Li wrote:

> When do you add "merge-worthy" segments? I'd guess at the end of a
> session, when it's easy to decide which segments are "merge-worthy".

Right. KS sorts the segments by size, then tries to merge away the smallest ones. The calculation uses the Fibonacci series, the idea being to perform the fewest merges while keeping the number of segments manageable.
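
To make the idea concrete, here is a rough Python sketch of one way a Fibonacci-based cutoff could work. It is illustrative only and not lifted from the KinoSearch source: the function names, the `offset` of 5, and the exact comparison are assumptions. Segments are sorted by doc count, and the smallest ones keep getting folded into the merge for as long as their running total stays under a Fibonacci number tied to how many segments would remain.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number (fibonacci(0) == 0, fibonacci(1) == 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def choose_segments_to_merge(doc_counts, offset=5):
    """Pick the smallest segments to merge away.

    doc_counts: number of docs in each existing segment.
    offset: hypothetical tuning constant, not a documented KinoSearch value.
    Returns the doc counts of the segments selected for merging.
    """
    sizes = sorted(doc_counts)            # smallest segments first
    threshold = 0                         # how many segments get merged
    running_total = 0
    for i, size in enumerate(sizes):
        # Segments left if everything chosen so far collapses into one.
        segs_when_done = len(sizes) - threshold + 1
        running_total += size
        # Keep absorbing segments while the combined size is still small
        # relative to the number of segments that would remain.
        if running_total < fibonacci(segs_when_done + offset):
            threshold = i + 1
    return sizes[:threshold]


selected = choose_segments_to_merge([1000, 10, 3, 500, 1])
print(selected)   # the three tiny segments are chosen; the big ones are left alone
```

With a rule of this shape, lots of tiny segments get swept up in one pass, while a large segment is only rewritten once the index has accumulated enough other material to justify it.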

> If so, however, a newer doc could get a smaller docid than an older
> doc, right? It's a nice property of Lucene that an older doc always
> has a smaller docid. I think some applications use this to decide
> newer/older versions of a document.

Correct.  That information is not preserved with this algorithm.
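
A toy Python illustration of how the ordering gets lost, under the assumption that the merged segment is written after the segments that survive the merge (the doc names and helper function are made up for the example):

```python
def assign_doc_ids(segments):
    """Number docs consecutively across segments, in segment order."""
    doc_ids = {}
    next_id = 0
    for segment in segments:
        for doc in segment:
            doc_ids[doc] = next_id
            next_id += 1
    return doc_ids


# Docs are named in arrival order: d0 is the oldest, d5 the newest.
segments = [["d0"], ["d1", "d2", "d3", "d4"], ["d5"]]
before = assign_doc_ids(segments)

# Merge the two smallest segments (first and last) into a new segment,
# assumed here to land after the surviving large segment.
merged = segments[0] + segments[2]
after = assign_doc_ids([segments[1], merged])

print(before["d0"], before["d1"])   # d0 sorts ahead of d1 before the merge
print(after["d0"], after["d1"])     # after the merge, the older d0 has the larger id
```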

> This means no new documents are visible to IndexReader until a session
> is over. In some sense, "1 segment/commit per session" lets an
> application decide when a "merge" happens.

Yes. And since there's only one class in KinoSearch which modifies the index (InvIndexer), all adds and deletes are committed at the same time.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


