I am just going to wax philosophical for a minute. I am trying to understand lucene's merging algorithm in depth.
Let's say I create an index of 25M web pages on a single machine. While creating this index I am doing both search and indexing / re-indexing at the same time, a bit like Technorati (http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg12709.htm l), and so I assume a merge factor of 2, as Doug references in the Technorati post. According to the algorithm described in Doug's pisa talk: http://lucene.sourceforge.net/talks/pisa/ the average number of indices when I get near 25M docs is: 2 * log2(25,000,000) / 2 = 24.6 indices. According to the FAQ, http://wiki.apache.org/jakarta-lucene/LuceneFAQ, there is no way to set a hard limit on the number of indices. It seems to me that it might be nice to be able to impose a hard limit on the number of segments, in this case, perhaps 5 or 10 indices. Does anyone have any experience with this? Has anyone tried imposing a hard limit? Thanks, - Jeff --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]