Try measuring the time up to the moment you close the writer, without optimizing. Optimizing is time-consuming and not necessary: it forces Lucene to merge the index segments and renumber all documents. In your case the difference in query speed after optimizing is probably not noticeable (I think; you can check this by comparing query times on the optimized and the non-optimized index).
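
A rough sketch of how you could take that measurement, timing each phase separately so you can see how much of the total goes into optimize. PhaseTimer and Step are names I made up for illustration, and the sleeps are only stand-ins; in your code you would put your own addDocument, optimize and close calls on the writer in their place.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: accumulates wall-clock time per named phase. */
class PhaseTimer {
    /** A unit of work that may throw a checked exception, as index I/O can. */
    interface Step {
        void run() throws Exception;
    }

    // Insertion-ordered, so phases are reported in the order they first ran.
    private final Map<String, Long> nanos = new LinkedHashMap<>();

    /** Runs the step and adds its elapsed time to the total for this phase. */
    void time(String phase, Step step) {
        long start = System.nanoTime();
        try {
            step.run();
        } catch (Exception e) {
            throw new RuntimeException("phase '" + phase + "' failed", e);
        } finally {
            nanos.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Accumulated time for a phase in milliseconds; 0 if it never ran. */
    long millis(String phase) {
        return nanos.getOrDefault(phase, 0L) / 1_000_000L;
    }

    /** Phase names in the order they were first timed. */
    List<String> phases() {
        return new ArrayList<>(nanos.keySet());
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        for (int i = 0; i < 3; i++) {
            // Stand-in: replace with adding one document to your writer.
            timer.time("add", () -> Thread.sleep(5));
        }
        // Stand-in: replace with the optimize call, or drop it to compare runs.
        timer.time("optimize", () -> Thread.sleep(20));
        // Stand-in: replace with closing the writer.
        timer.time("close", () -> Thread.sleep(5));

        for (String phase : timer.phases()) {
            System.out.println(phase + ": " + timer.millis(phase) + " ms");
        }
    }
}
```

Run it once with the optimize phase and once without, then time the same set of queries against both indexes to see whether optimizing buys you anything.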

> -----Original Message-----
> From: Sairaj Sunil [mailto:[EMAIL PROTECTED]
> Sent: Monday, 12 February 2007 13:44
> To: [email protected]
> Subject: Re: Merge factor question
>
> Hi
> I have not traced the memory usage.
> i have one question. what is the difference between batch indexing
> and interactive indexing. may be this is too silly to ask, but
> nevertheless i want to make it clear. because if i reduce the merge
> factor below 10 (for example 5), the performance has improved slightly.
> i am indexing the documents all at once. i.e., I open the writer and
> add the documents in the end optimize and then close.
>
> On 2/12/07, Jokin Cuadrado <[EMAIL PROTECTED]> wrote:
> >
> > the document number don't matter, the merge factor is the max number
> > of documents that will be maintained in memory, so both 1000 documents
> > and 200 documents will have a maximum of 50 documents (with theirs
> > terms vectors etc.) in memory, losing performance as i said if you hit
> > the virtual memory.
> >
> > Have you traced the memory usage, the page faults, memory used and so on?
> >
> > another thing that could help in performance, is the usage of
> > stop-words. have you take a look to the resultant index information
> > with luke, to watch if in the top terms you have common words as "and"
> > "the" "is". these words are very common, and if you get rid of them,
> > the performance will also be increased.
> >
> > hope it helps you.
> >
> > --
> > Jokin.
>
> --
> Sairaj Sunil
> II Mtech(CS)
