Opened LUCENE-2086.

Mike

On Fri, Nov 20, 2009 at 9:43 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
> +1
>
> I'll open an issue.
>
> Mike
>
> On Fri, Nov 20, 2009 at 8:11 AM, Yonik Seeley
> <yo...@lucidimagination.com> wrote:
>> Thanks Bogdan, I've been meaning to bring this up.
>> Solr used a TreeMap in the past (when it handled its own deletes) for
>> exactly the same reason. In my profiling, I've also seen applyDeletes()
>> taking the bulk of the time with small/simple document indexing.
>>
>> So we should definitely go in sorted order (either via a TreeMap or by
>> sorting the HashMap).
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> On Fri, Nov 20, 2009 at 7:21 AM, Bogdan Ghidireac <bog...@ecstend.com> wrote:
>>> Hi,
>>>
>>> One of the use cases of my application involves updating the index with
>>> 10 to 10k docs every few minutes. Because we maintain a PK for each
>>> doc, we have to use IndexWriter.updateDocument to be consistent.
>>>
>>> The average time for an update when we commit every 10k docs is around
>>> 17ms (the IndexWriter buffer is 100MB). I profiled the application for
>>> several hours and noticed that most of the time is spent in
>>> IndexWriter.applyDeletes()->TermDocs.seek(). I changed
>>> BufferedDeletes.terms from a HashMap to a TreeMap so that the terms are
>>> ordered, which reduces the number of random seeks on the disk.
>>>
>>> I ran my tests again with the patched Lucene 2.9.1 and the time
>>> dropped from 17ms to 2ms. The index is 18GB and has 70 million docs.
>>>
>>> I cannot send a patch because my company has some strict and
>>> time-consuming policies about open source, but the change is small and
>>> can be applied easily.
>>>
>>> Regards,
>>> Bogdan
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
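
For anyone following along, here is a rough sketch of why iteration order matters in the change Bogdan describes. It is not the actual patch and uses no Lucene classes: plain String keys stand in for Term objects, and the class name SortedDeletesSketch, the helper names, and the "pk:" key format are all made up for illustration. It only counts how often a pass over the buffered terms would have to step backwards in sorted term order, as a crude proxy for random seeks in the term dictionary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative stand-in only: plain String keys replace Lucene Term objects,
// and no Lucene classes are used.
final class SortedDeletesSketch {

    // Returns the buffered delete terms in the order the map iterates them,
    // which is the order a delete pass over the map would look them up.
    static List<String> applyOrder(Map<String, Integer> bufferedTerms) {
        return new ArrayList<>(bufferedTerms.keySet());
    }

    // Counts lookups whose term sorts before the previous one, a rough proxy
    // for a seek that has to jump backwards in a sorted term dictionary.
    static int backwardSeeks(List<String> order) {
        int count = 0;
        for (int i = 1; i < order.size(); i++) {
            if (order.get(i).compareTo(order.get(i - 1)) < 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Integer> hashed = new HashMap<>();
        Map<String, Integer> sorted = new TreeMap<>();
        for (int i = 0; i < 10_000; i++) {
            // Hypothetical primary-key term; the value is just a placeholder.
            String pk = "pk:" + ((i * 7919L) % 10_007L);
            hashed.put(pk, i);
            sorted.put(pk, i);
        }
        System.out.println("HashMap backward seeks: " + backwardSeeks(applyOrder(hashed)));
        System.out.println("TreeMap backward seeks: " + backwardSeeks(applyOrder(sorted)));
    }
}
```

The TreeMap pass always reports zero backward steps because its keys iterate in ascending order; the HashMap figure depends on hash bucket layout. How much this matters on a real index will depend on index size and I/O caching, so treat it as an illustration of the ordering effect rather than a benchmark.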