Re: optimize()

Leo Galambos Wed, 27 Nov 2002 11:21:13 -0800

> Unoptimized index is not a problem for document additions, they take
> constant time, regardless of the size of the index and regardless of
> whether the index is optimized or not.


IMHO that is not true. It would mean that O(log(n/M)) = O(1), where n is
the number of documents in the index and M is the maximum number of
segments per level. I think that if you were right, we could sort an
array in O(n) rather than O(n log n).
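
A toy cost model may make the argument concrete. The sketch below is not
Lucene code. It assumes a simplified merge policy: every added document
starts as a one-document segment, and whenever M segments exist at one
level they are merged into a single segment at the next level. The names
MergeCostModel and totalCopies are made up for illustration.

```java
public class MergeCostModel {

    /**
     * Counts how many document copies merges perform while n documents
     * are added one by one, under the simplified policy described above.
     * mergeFactor plays the role of M and must be at least 2.
     */
    static long totalCopies(long n, int mergeFactor) {
        int[] segmentsAtLevel = new int[64]; // segment count per level
        long copies = 0;
        for (long i = 0; i < n; i++) {
            segmentsAtLevel[0]++;            // new one-document segment
            long segmentSize = 1;
            // Cascade: M segments at a level collapse into one segment
            // at the next level, copying every document they contain.
            for (int level = 0; segmentsAtLevel[level] == mergeFactor; level++) {
                segmentsAtLevel[level] = 0;
                segmentSize *= mergeFactor;
                copies += segmentSize;
                segmentsAtLevel[level + 1]++;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int m = 10; // assumed merge factor, for illustration only
        for (long n : new long[] {10, 100, 1000, 10000}) {
            double perDoc = (double) totalCopies(n, m) / n;
            System.out.println("n=" + n + " copies per document=" + perDoc);
        }
    }
}
```

In this model, with M = 10, the average number of copies per added
document is 1, 2, 3 and 4 for n = 10, 100, 1000 and 10000. The amortized
cost of an addition grows roughly like log_M(n) instead of staying
constant.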

> Searches of unoptimized index take longer than searches of an optimized
> index.

Is there any limitation in the Lucene architecture that prevents using a
multithreaded algorithm to compute hit lists? I think it would boost
performance. Otis, thank you for your proof that Lucene does not have it
now (you got me :-)). But what about future releases?

> Then do a search against one, and against the other index, and time it.
> Then let us know which one is faster and by how much.

OK, I will.

I would like to compare Lucene to another engine. The test has to be
precise, because I want to use it in an academic paper.

The aim of my question was how to configure Lucene to get maximum
performance for the test. It looks pretty hard, because:

- if I do not call optimize(), I can build the index at maximum speed, but
searches are slow, so it is not a configuration for a dynamic environment

- if I call optimize() regularly (as a real application would do), indexing
gets slower and slower as I add more documents to the collection

IMHO the second option describes a "real environment", so we get:

loop:
  K-times indexDoc()
  optimize()
end-of-loop

Which *K* should I use? 10, 100, 1000 or 10000? Folks, what *K* do you use
in your applications? Thank you.
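
To show why the choice of K matters, here is a rough cost model of the
loop above. It is only a sketch under a loud assumption: each optimize()
call rewrites every document currently in the index exactly once, and the
cost of indexDoc() itself is ignored. The names OptimizeLoopModel and
optimizeCopies are hypothetical and nothing here calls Lucene.

```java
public class OptimizeLoopModel {

    /**
     * Returns the number of document copies caused by optimize() when
     * totalDocs documents are indexed in batches of k, with one
     * optimize() after every batch (including a final partial batch).
     */
    static long optimizeCopies(long totalDocs, long k) {
        long copies = 0;
        long indexed = 0;
        while (indexed < totalDocs) {
            indexed += Math.min(k, totalDocs - indexed); // K-times indexDoc()
            copies += indexed; // assumed: optimize() rewrites the whole index
        }
        return copies;
    }

    public static void main(String[] args) {
        long totalDocs = 100_000; // assumed collection size
        for (long k : new long[] {10, 100, 1000, 10000}) {
            System.out.println("K=" + k + " copies=" + optimizeCopies(totalDocs, k));
        }
    }
}
```

Under this assumption the total optimize() work is about N*N/(2K) copies
for N documents, so a tenfold increase in K cuts it roughly tenfold, at
the price of searching a less optimized index between calls.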

-g-


