On Sep 6, 2006, at 4:23 PM, Ning Li wrote:
When do you add "merge-worthy" segments? I'd guess at the end of a session, when it's easy to decide which segments are "merge-worthy".
Right. KS sorts the segments by size, then tries to merge the smallest away. The calculation uses the Fibonacci series, the idea being to perform the fewest merges while keeping the number of segments manageable.
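To make the idea concrete, here is a rough sketch of one way a Fibonacci-based rule could pick the segments to merge away. It is an illustration only, not KinoSearch's code: the function names, the `offset` parameter, and the exact threshold comparison are all assumptions.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number (fibonacci(0) == 0, fibonacci(1) == 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def choose_merge_count(doc_counts, offset=5):
    """Return how many of the smallest segments look "merge-worthy".

    doc_counts: number of documents in each existing segment, in any order.
    offset:     hypothetical tuning knob that shifts the Fibonacci thresholds.
    """
    sizes = sorted(doc_counts)   # smallest segments first
    chosen = 0                   # how many of the smallest segments to merge
    total_docs = 0               # running doc count of sizes[0..i]
    for i, size in enumerate(sizes):
        # Segment count if the `chosen` smallest segments became one segment.
        segs_when_done = len(sizes) - chosen + 1
        total_docs += size
        # Small running totals pass the threshold; a large segment stops the
        # growth because the threshold shrinks as more segments are absorbed.
        if total_docs < fibonacci(segs_when_done + offset):
            chosen = i + 1
    return chosen
```

With segment sizes of 1000, 10, 5 and 1 documents, this sketch selects the three small segments and leaves the 1000-document segment alone. With two large segments (1000 and 2000 documents) it selects nothing, so no merge is performed.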
If so, however, a newer doc could get a smaller docid than an older doc, right? It's a nice property of Lucene that an older doc always has a smaller docid. I think some applications use this to decide newer/older versions of a document.
Correct. That information is not preserved with this algorithm.
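A toy illustration of how the ordering can be lost, under the assumption (not confirmed above) that a session's fresh documents are written into the new segment ahead of the documents copied in from merged segments. The `assign_doc_ids` helper and the document names are hypothetical.

```python
def assign_doc_ids(segments):
    """Number documents consecutively in segment order, then in-segment order."""
    ids = {}
    next_id = 0
    for segment in segments:
        for doc in segment:
            ids[doc] = next_id
            next_id += 1
    return ids


# Before the session: one large segment and one small, older segment.
before = [["old_a", "old_b"], ["old_c"]]

# The session adds "new_d" and merges the small segment into its new segment.
# Assumed layout: fresh documents first, merged-in documents after them.
after = [["old_a", "old_b"], ["new_d", "old_c"]]

ids_before = assign_doc_ids(before)
ids_after = assign_doc_ids(after)
```

Here `old_c` moves from docid 2 to docid 3 while the newer `new_d` receives docid 2, so an application can no longer infer document age from the docid.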
This means no new documents are visible to IndexReader until a session is over. In some sense, "1 segment/commit per session" lets an application decide when a "merge" happens.
Yes. And since there's only one class in KinoSearch which modifies the index (InvIndexer), all adds and deletes are committed at the same time.
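The session behaviour can be modelled roughly as follows. This is a toy model in Python, not the InvIndexer API: `ToyIndex`, `ToySession`, and their method names are invented for illustration, and a plain dict stands in for the index.

```python
class ToyIndex:
    """Holds the committed state that readers see."""

    def __init__(self):
        self.docs = {}  # doc key -> content

    def reader_view(self):
        # A reader gets a snapshot of whatever has been committed so far.
        return dict(self.docs)


class ToySession:
    """Buffers adds and deletes, then applies them together in finish()."""

    def __init__(self, index):
        self.index = index
        self.pending_adds = {}
        self.pending_deletes = set()

    def add_doc(self, key, content):
        self.pending_adds[key] = content

    def delete_doc(self, key):
        self.pending_deletes.add(key)

    def finish(self):
        # Build the new state off to the side, then swap it in at once,
        # so readers never observe the adds without the deletes.
        new_docs = {k: v for k, v in self.index.docs.items()
                    if k not in self.pending_deletes}
        new_docs.update(self.pending_adds)
        self.index.docs = new_docs


index = ToyIndex()
index.docs = {"a": "old"}

session = ToySession(index)
session.add_doc("b", "new")
session.delete_doc("a")
view_during = index.reader_view()   # session still open: nothing visible yet
session.finish()
view_after = index.reader_view()    # add and delete appear together
```

Until `finish()` runs, a reader still sees only the old document; afterwards the add and the delete become visible in the same step.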
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/