Just brainstorming a little... Assuming B=1000, M=10 (I think better with concrete examples)
It seems like we should avoid unnecessary merging, allowing up to 9 segments of 1000 documents or less w/o merging. When we reach 10 segments, they should be merged into a single segment.

Let's assume a segment of size 8500 is created by the merge. Assume we write another 10 full segments that are merged into a bigger segment of size 10,000.

It *feels* like:
1) we should be able to write full segments of 1000 docs, or less than that if closing the writer
2) we should be able to write a full segment of 1000 docs *after* a non-full segment w/o having to merge
3) 10,000 and 8,500 should be at the same index level, not different levels
4) 1000 and 999 docs should be at the same index level

So, I *think* most of our hypothetical problems go away with a simple adjustment to f(n):

  f(n) = floor(log_M((n-1)/B))

Right? That allows us to write all buffered docs separately (necessary for easy deletions), allows us to only merge M segments at a time (decreases number of merges), and allows us to maintain a monotonically decreasing f(n).

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
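
P.S. A quick way to sanity-check the adjusted f(n) against the numbers in this example (B=1000, M=10). This is only a sketch, not actual Lucene code: the function name `segment_level` is made up for illustration, it uses integer arithmetic rather than a floating-point log so that exact powers of M don't get rounded the wrong way, and rejecting n < 2 is my own assumption, since the formula takes the log of zero at n = 1.

```python
def segment_level(n, B=1000, M=10):
    """Return floor(log_M((n - 1) / B)) using integer arithmetic only.

    n is the number of docs in a segment; B and M are used as in the
    example above (B=1000 docs per full segment, M=10 segments per merge).
    """
    if n < 2:
        # Assumption: (n - 1) / B is zero or negative here, so the log is undefined.
        raise ValueError("level is undefined for n < 2")
    q = n - 1
    level = 0
    if q >= B:
        # Find the largest level k >= 0 with B * M**k <= q.
        bound = B
        while q >= bound * M:
            bound *= M
            level += 1
    else:
        # Negative levels: scale q up by M until it reaches B.
        while q < B:
            q *= M
            level -= 1
    return level


for docs in (999, 1000, 8500, 10000):
    print(docs, segment_level(docs))
```

With these values, 999 and 1000 both come out at level -1, and 8,500 and 10,000 both come out at level 0, which matches points 3) and 4).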