I was looking at the Javadocs for the optimize() call on IndexWriter, which contain a great deal of detail about what happens but very little guidance on when to call it. I would like to add more on the "when". I generally call optimize() after I finish indexing, which is straightforward to determine when one has a more or less static collection. What isn't so clear to me, because I haven't dealt with it much, is when optimize() should be called in environments that are frequently updated.

Here's what I have for text so far:
*
* <p>It is recommended that this method be called upon completion of indexing. In
   * environments with frequent updates optimize is best FILL IN HERE
   * </p>

Essentially, I am wondering what the best practices are for calling optimize(), especially in a frequent-update environment. My gut feeling is that it should just be scheduled on a regular basis, ideally during a lull. The docs allude to search performance being better on an optimized index, but has anyone quantified it? The mergeFactor docs say that a smaller merge factor results in faster searches on an unoptimized index (I presume that means faster relative to higher merge factors, but still not as fast as an optimized index, correct?). If it hasn't been quantified, maybe I will try to whip up a benchmark for it.
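
For the benchmark idea, a rough timing harness might look like the sketch below. It only measures the median wall-clock time of an arbitrary Runnable; the class name SearchTimer and the idea of wrapping each search in a Runnable are my own assumptions, not anything from Lucene. One would run the same queries against an unoptimized copy and an optimized copy of the same index and compare the two medians.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: times any Runnable, e.g. one that
// runs a fixed query against a searcher opened on the index under test.
class SearchTimer {

    /** Returns the median wall-clock time in nanoseconds of one call to query. */
    static long medianNanos(Runnable query, int warmupRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        // Warm-up calls let the JIT and OS caches settle; they are not timed.
        for (int i = 0; i < warmupRuns; i++) {
            query.run();
        }
        long[] samples = new long[timedRuns];
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            query.run();
            samples[i] = System.nanoTime() - start;
        }
        // The median is less sensitive to GC pauses than the mean.
        Arrays.sort(samples);
        return samples[timedRuns / 2];
    }
}
```

The numbers would only be comparable if both runs use the same documents, the same queries, and the same JVM settings.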

So, do people in these types of environments typically schedule optimize() to run at night, or every few hours, or what? I know, "It depends...", but I am wondering if there is a general consensus that would be useful to pass along to readers.
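
To make the "wait for a lull" idea concrete, here is a minimal sketch of what I have in mind. It is not a recommendation, and everything in it is an assumption on my part: the class name LullOptimizer, the quiet-period threshold, and the Runnable that would wrap the real IndexWriter.optimize() call are all hypothetical. The indexing code reports each update, and a periodic check runs the optimize task only if there are pending changes and no update has arrived for the quiet period.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Lucene.
class LullOptimizer {
    private final Runnable optimizeTask; // assumed to wrap the actual optimize() call
    private final long quietMillis;      // how long the index must be idle first
    private long lastUpdateMillis;
    private boolean dirty = false;       // true if updates arrived since the last optimize

    LullOptimizer(Runnable optimizeTask, long quietMillis) {
        this.optimizeTask = optimizeTask;
        this.quietMillis = quietMillis;
    }

    /** Called by the indexing code after each add or delete. */
    synchronized void recordUpdate(long nowMillis) {
        lastUpdateMillis = nowMillis;
        dirty = true;
    }

    /** Runs the optimize task if the index changed and has been quiet long enough. */
    synchronized boolean optimizeIfIdle(long nowMillis) {
        if (!dirty || nowMillis - lastUpdateMillis < quietMillis) {
            return false;
        }
        // Holding the lock here blocks recordUpdate() until the task finishes.
        optimizeTask.run();
        dirty = false;
        return true;
    }

    /** Checks for a lull every checkEveryMillis on a background thread. */
    ScheduledExecutorService start(long checkEveryMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(
                () -> optimizeIfIdle(System.currentTimeMillis()),
                checkEveryMillis, checkEveryMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A fixed nightly schedule would be the same thing with the idle check removed; whether the extra bookkeeping is worth it is exactly what I am asking about.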

--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ


