addIndexes doesn't call maybeMerge, so I think you are just ending up with too many segments, which makes applying deletes slow.
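To illustrate why the segment count matters, here is a toy model in plain Java. It is not Lucene code, and the class and method names are made up for illustration. It assumes that each buffered delete term has to be looked up once in every segment, which is consistent with the applyTermDeletes -> SegmentTermDocs.seek frames in the quoted stack trace. It also models a merge as a single simplified pass over adjacent groups of segments, which is much cruder than a real merge policy.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only, not Lucene code. All names here are hypothetical. */
class SegmentDeleteCost {
    /** Each entry is the document count of one segment. */
    private final List<Integer> segments = new ArrayList<>();

    /** Models addIndexes: incoming segments are appended as-is, nothing is merged. */
    void addIndexes(int... incomingSegmentSizes) {
        for (int size : incomingSegmentSizes) {
            segments.add(size);
        }
    }

    /** Assumed cost model: every delete term is looked up once in every segment. */
    long seeksToApplyDeletes(int deleteTerms) {
        return (long) deleteTerms * segments.size();
    }

    /** One simplified merge pass: combine each run of groupSize adjacent segments. */
    void mergePass(int groupSize) {
        List<Integer> merged = new ArrayList<>();
        for (int i = 0; i < segments.size(); i += groupSize) {
            int end = Math.min(i + groupSize, segments.size());
            int docs = 0;
            for (int j = i; j < end; j++) {
                docs += segments.get(j);
            }
            merged.add(docs);
        }
        segments.clear();
        segments.addAll(merged);
    }

    int segmentCount() {
        return segments.size();
    }

    public static void main(String[] args) {
        SegmentDeleteCost index = new SegmentDeleteCost();
        // 1000 mini-indexes of one segment each, added without any merging.
        for (int i = 0; i < 1000; i++) {
            index.addIndexes(500);
        }
        System.out.println(index.segmentCount() + " segments, "
                + index.seeksToApplyDeletes(200) + " lookups for 200 delete terms");

        // After one merge pass the same deletes touch far fewer segments.
        index.mergePass(10);
        System.out.println(index.segmentCount() + " segments, "
                + index.seeksToApplyDeletes(200) + " lookups for 200 delete terms");
    }
}
```

In this model, 1000 unmerged segments and 200 delete terms cost 200,000 lookups. One pass with a group size of 10 leaves 100 segments and 20,000 lookups. The real numbers depend on the merge policy and scheduler, but the direction is the same.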
Can you try calling IndexWriter.maybeMerge() after you call addIndexes? (It won't have an immediate impact; you have to do some merges to get your index healthy again.)

On Wed, May 14, 2014 at 11:46 AM, danielv <dani...@exlibris.co.il> wrote:
> Hi,
>
> We have an index of about 550M records (~800GB), and we merge thousands of mini
> indexes once a week using Hadoop: 45 mappers on 2 Hadoop nodes.
> After upgrading to Lucene 3.6.1 we noticed that the merge process
> continuously slows down.
> After testing a couple of options, it looks like we found the source of the
> problem but have no idea how to fix it.
> What we do: first we merge all mini-indexes into one intermediate mini-index,
> and then this one into the big (final) one.
> The difference is the existence of deleted_records in the mini-index:
> In case we have no deleted_records from merged mini-indexes, the merger runs
> about 2h with about 05s-2s per mini-index.
> If we have deleted_records, after about 10 minutes we see dramatic
> degradation in the time of merging mini-indexes into the intermediate one (if the
> first 100-200 mini-indexes each merge in less than a second, after 10 minutes it
> takes more than 10s for one mini-index, and after an hour or two it is a couple of
> minutes!)
>
> This one is from jstack of a mapper:
>
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.Thread.isAlive(Native Method)
>         at org.apache.lucene.util.CloseableThreadLocal.purge(CloseableThreadLocal.java:115)
>         - locked <0x00000007db0d6140> (a java.util.WeakHashMap)
>         at org.apache.lucene.util.CloseableThreadLocal.maybePurge(CloseableThreadLocal.java:105)
>         at org.apache.lucene.util.CloseableThreadLocal.get(CloseableThreadLocal.java:88)
>         at org.apache.lucene.index.TermInfosReader.getThreadResources(TermInfosReader.java:160)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:184)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
>         at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:66)
>         at org.apache.lucene.index.BufferedDeletesStream.applyTermDeletes(BufferedDeletesStream.java:346)
>         - locked <0x00000007805766f0> (a org.apache.lucene.index.BufferedDeletesStream)
>         at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:248)
>         - locked <0x00000007805766f0> (a org.apache.lucene.index.BufferedDeletesStream)
>         at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3615)
>         - locked <0x00000007805739a0> (a org.apache.lucene.index.IndexWriter)
>         at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3552)
>         at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:3120)
>         at org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:3064)
>
> We tried to use org.apache.lucene.index.IndexWriter.addIndexes instead of
> org.apache.lucene.index.IndexWriter.addIndexesNoOptimize, with the same behavior.
>
> How can we eliminate this behavior and improve the performance of our
> merge?
>
> Thanks!
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Merger-performance-degradation-on-3-6-1-tp4135593.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org