IndexWriter.addIndexes & optimization

Benjamin Stein Wed, 07 Jun 2006 12:44:14 -0700

I have a very large corpus that I am storing in many indexes: 200 indexes
of ~500MB each, with 10^6 very tiny documents in each.  (I could look into
optimizing this later, of course, but it seems OK for now.)


During indexing, I have been using a RAMDirectory to store many thousands of
documents in memory before flushing the buffer to disk using
IndexWriter.addIndexes.  For the most part this works very well, except that
performance degrades tremendously over time due to the implicit call (or
two!) to optimize() inside the addIndexes function.
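
To make the slowdown concrete, here is a rough back-of-envelope model. It is a sketch, not Lucene code: it assumes that every flush is followed by a full merge that rewrites all documents already on disk, and it uses a hypothetical batch size of 10,000 documents per flush. The class and method names are made up for illustration.

```java
public class FlushCostModel {

    // Total documents rewritten if every flush triggers a full merge,
    // i.e. the whole on-disk index is rewritten into one segment each time.
    static long docsRewrittenWithFullMerge(long totalDocs, long batchSize) {
        long rewritten = 0;
        long onDisk = 0;
        for (long added = 0; added < totalDocs; added += batchSize) {
            long batch = Math.min(batchSize, totalDocs - added);
            onDisk += batch;      // the new batch lands on disk
            rewritten += onDisk;  // the full merge rewrites everything on disk
        }
        return rewritten;
    }

    // Total documents written if each batch were only appended as a new
    // segment, with no merge at flush time.
    static long docsWrittenAppendOnly(long totalDocs) {
        return totalDocs;
    }

    public static void main(String[] args) {
        long totalDocs = 1000000L; // 10^6 documents per index, as above
        long batchSize = 10000L;   // hypothetical flush size

        long merged = docsRewrittenWithFullMerge(totalDocs, batchSize);
        long appended = docsWrittenAppendOnly(totalDocs);

        System.out.println("full merge per flush: " + merged + " docs written");
        System.out.println("append only:          " + appended + " docs written");
        System.out.println("write amplification:  " + (merged / (double) appended));
    }
}
```

Under these assumptions the work per flush grows linearly with the size of the index, so the total grows quadratically with the number of flushes, which would match the "degrades over time" behavior I am seeing. The real cost depends on what optimize() actually does on each call.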

I searched the archives and found that this topic has come up a number of
times over the years, but with very few answers (except for one amusing one
from Doug: "I don't recall exactly why this was done. (I should have
written a comment!)").  :)

Is there a way to accomplish what I'm trying to do without all the calls to
optimize()?

-Ben
