I believe adding a new document to an index takes constant time, because each added document is written as a new segment on disk, separate from the existing index segments. The size of the index may come into play when this new segment has to be merged with the existing segments, which happens roughly every mergeFactor documents. I have built indices with several hundred thousand documents, but never noticed an increase in the time it takes to add a new document. Maybe the difference was too small to notice. I don't know Lucene well enough to stand behind this 100%, and I could certainly be wrong :(.
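To make the two views in this thread easier to compare, here is a rough sketch of a toy model. It is not Lucene code and makes no claim about how Lucene actually schedules merges. It assumes that every add writes a one-document segment, and that whenever merge_factor segments of the same size pile up they are rewritten into one larger segment. The names simulate_adds and merge_factor are made up for the illustration, and "cost" is simply the number of documents written.

```python
def simulate_adds(num_docs, merge_factor=10):
    """Return the cost of each add under a toy merge model.

    Cost is counted as documents written: 1 for the new single-document
    segment, plus the size of every merged segment the add triggers.
    This is an illustrative assumption, not Lucene's real merge logic.
    """
    levels = [0]  # levels[i] = number of segments holding merge_factor**i docs
    costs = []
    for _ in range(num_docs):
        cost = 1          # write the new one-document segment
        levels[0] += 1
        level = 0
        # Cascade: a full level is merged into one segment on the next level.
        while levels[level] == merge_factor:
            levels[level] = 0
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            cost += merge_factor ** (level + 1)  # docs rewritten by this merge
            level += 1
        costs.append(cost)
    return costs


costs = simulate_adds(1000, merge_factor=10)
print(costs.count(1))            # 900 adds trigger no merge at all
print(max(costs))                # 1111, the add that cascades through every level
print(sum(costs) / len(costs))   # 4.0, i.e. 1 + log_10(1000)
```

Under these assumptions most adds really do cost the same small amount, which would match what I saw in practice, while the average over all adds grows slowly with the logarithm of the index size because of the occasional large merge.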
Otis

--- Leo Galambos <[EMAIL PROTECTED]> wrote:
> > Adding a new document does not immediately modify an index, so the time
> > it takes to add a new document to an existing index is not proportional
> > to the index size. It is constant. The execution time of optimize()
> > is proportional to the index size, so you want to do that only if you
> > really need it. The Lucene article on http://www.onjava.com/ from
> > March 5th describes this in more detail.
>
> Otis,
>
> I am not sure, if anything about constants is constant in non-constant IR
> systems :-)
>
> I think, that the correct answer is O(t/k*(1+log_m(k))), where t is a time
> you need to create&write one monolithic segment of k documents, m is
> merge factor you use, and k is the number of documents which are already
> in index. As you can see, the function grows with k.
>
> Can you explain me, why addition of one document takes constant time?
>
> Thank you
>
> -g-
