It appears that whenever I see a merge failure, I also apparently have a corrupt index (I get arrayindexoutofbounds exceptions when searching for certain things). So that may be the underlying cause of the merge infinite loop.
I've blown away the indexes repeatedly and tried to rebuild. I am now committing every 1000 records, and I can make this happen utterly reliably. The data contains quite a lot of unicode, and I've noted recent posts about tests failing in this area. Perhaps this is related? If this guess is correct, then there are two bugs. First bug is that index corruption causes merge to spin indefinitely, rather than error out. Second bug is that certain characters cause index corruption. With a bit of work I should be able to isolate the record that is the proximate cause of the index corruption. I will post it when I have it. Karl --- original message --- From: "Wright Karl (Nokia-MS/Cambridge)" <karl.wri...@nokia.com> Subject: RE: busywait hang using extracting update handler on trunk Date: July 27, 2010 Time: 11:57:49 AM Happened again. The thing that caused it seems to have been an autocommit. Here's part of the first thread dump: at java.lang.Object.wait(Native Method) - waiting on <0x29642f00> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(Timer.java:509) - locked <0x29642f00> (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:462) "25615...@qtp-20051738-9 - Acceptor0 socketconnec...@0.0.0.0:8983" prio=6 tid=0x 03076800 nid=0x19ac runnable [0x034df000] java.lang.Thread.State: RUNNABLE at java.net.PlainSocketImpl.socketAccept(Native Method) at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390) - locked <0x29770448> (a java.net.SocksSocketImpl) at java.net.ServerSocket.implAccept(ServerSocket.java:453) at java.net.ServerSocket.accept(ServerSocket.java:421) at org.mortbay.jetty.bio.SocketConnector.accept(SocketConnector.java:99) at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.ja va:707) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:582) "11245...@qtp-20051738-8" prio=6 tid=0x03075000 nid=0x10b4 waiting on condition [0x0348e000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x2970bad8> (a java.util.concurrent.locks.Reentr antReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInt errupt(AbstractQueuedSynchronizer.java:747) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared (AbstractQueuedSynchronizer.java:877) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(A bstractQueuedSynchronizer.java:1197) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(Reent rantReadWriteLock.java:594) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandle r2.java:211) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpd ateProcessorFactory.java:61) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(Ext ractingDocumentLoader.java:120) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(Ex tractingDocumentLoader.java:125) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(Extr actingDocumentLoader.java:195) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(Co ntentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandl erBase.java:131) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handle Request(RequestHandlers.java:237) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter .java:337) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilte r.java:240) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(Servlet Handler.java:1157) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:3 88) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.jav a:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:1 82) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:7 65) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHand lerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection. java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:1 52) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:54 2) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpCo nnection.java:923) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector. java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:582) "25377...@qtp-20051738-7" prio=6 tid=0x03073800 nid=0x11c0 in Object.wait() [0x0 343e000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x2c802ee0> (a org.apache.solr.update.SolrIndexWriter) at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4270) - locked <0x2c802ee0> (a org.apache.solr.update.SolrIndexWriter) at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:27 81) - locked <0x2c802ee0> (a org.apache.solr.update.SolrIndexWriter) at org.apache.lucene.index.IndexWriter.finishMerges(IndexWriter.java:276 9) - locked <0x2c802ee0> (a org.apache.solr.update.SolrIndexWriter) at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:17 42) at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1695) at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1659) at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:231 ) at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateH andler2.java:181) at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandle r2.java:409) at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(Run UpdateProcessorFactory.java:85) at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandl erUtils.java:107) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(Co ntentStreamHandlerBase.java:48) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandl erBase.java:131) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handle Request(RequestHandlers.java:237) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter .java:337) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilte r.java:240) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(Servlet Handler.java:1157) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:3 88) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.jav a:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:1 82) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:7 65) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHand lerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection. java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:1 52) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:54 2) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpCo nnection.java:923) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector. java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:582) "15232...@qtp-20051738-6" prio=6 tid=0x0309ac00 nid=0x1458 in Object.wait() [0x0 33ef000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ece00> (a org.mortbay.thread.QueuedThreadPool$PoolThr ead) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:625) - locked <0x295ece00> (a org.mortbay.thread.QueuedThreadPool$PoolThread) "381...@qtp-20051738-5" prio=6 tid=0x03099800 nid=0x19ec in Object.wait() [0x033 9f000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ece70> (a org.mortbay.thread.QueuedThreadPool$PoolThr ead) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:625) - locked <0x295ece70> (a org.mortbay.thread.QueuedThreadPool$PoolThread) "2583...@qtp-20051738-4" prio=6 tid=0x03095800 nid=0x18dc in Object.wait() [0x03 34f000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ecee0> (a org.mortbay.thread.QueuedThreadPool$PoolThr ead) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:625) - locked <0x295ecee0> (a org.mortbay.thread.QueuedThreadPool$PoolThread) "31289...@qtp-20051738-3" prio=6 tid=0x030a2800 nid=0x10c4 in Object.wait() [0x0 32ff000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ecf50> (a org.mortbay.thread.QueuedThreadPool$PoolThr ead) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:625) - locked <0x295ecf50> (a org.mortbay.thread.QueuedThreadPool$PoolThread) "9004...@qtp-20051738-2" prio=6 tid=0x03077400 nid=0x1a38 in Object.wait() [0x03 2af000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ecfc0> (a org.mortbay.thread.QueuedThreadPool$PoolThr ead) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:625) - locked <0x295ecfc0> (a org.mortbay.thread.QueuedThreadPool$PoolThread) "30971...@qtp-20051738-1" prio=6 tid=0x0308f800 nid=0x1ae0 in Object.wait() [0x0 325f000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ed030> (a org.mortbay.thread.QueuedThreadPool$PoolThr ead) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:625) - locked <0x295ed030> (a org.mortbay.thread.QueuedThreadPool$PoolThread) "17758...@qtp-20051738-0" prio=6 tid=0x03094800 nid=0x1d58 in Object.wait() [0x0 320f000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ed0a0> (a org.mortbay.thread.QueuedThreadPool$PoolThr ead) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:625) - locked <0x295ed0a0> (a org.mortbay.thread.QueuedThreadPool$PoolThread) "Timer-0" daemon prio=6 tid=0x0308a400 nid=0x1d68 in Object.wait() [0x031bf000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295ed180> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(Timer.java:509) - locked <0x295ed180> (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:462) "Low Memory Detector" daemon prio=6 tid=0x02c1d400 nid=0x122c runnable [0x000000 00] java.lang.Thread.State: RUNNABLE "CompilerThread0" daemon prio=10 tid=0x02c17400 nid=0x1f94 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "Attach Listener" daemon prio=10 tid=0x02c15c00 nid=0x11b8 runnable [0x00000000] java.lang.Thread.State: RUNNABLE "Signal Dispatcher" daemon prio=10 tid=0x02c14800 nid=0xf38 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "Finalizer" daemon prio=8 tid=0x02c0cc00 nid=0x1408 in Object.wait() [0x02ddf000 ] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x29590468> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) - locked <0x29590468> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) "Reference Handler" daemon prio=10 tid=0x02c0b400 nid=0x1614 in Object.wait() [0 x02d8f000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x295904f0> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:485) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) - locked <0x295904f0> (a java.lang.ref.Reference$Lock) "VM Thread" prio=10 tid=0x02c09c00 nid=0x1cc8 runnable "VM Periodic Task Thread" prio=10 tid=0x02c20000 nid=0x90c waiting on condition JNI global references: 1236 Heap def new generation total 44992K, used 20928K [0x24040000, 0x27110000, 0x29590 000) eden space 40000K, 52% used [0x24040000, 0x254b0278, 0x26750000) from space 4992K, 0% used [0x26c30000, 0x26c30038, 0x27110000) to space 4992K, 0% used [0x26750000, 0x26750000, 0x26c30000) tenured generation total 99780K, used 74989K [0x29590000, 0x2f701000, 0x34040 000) the space 99780K, 75% used [0x29590000, 0x2decb7a8, 0x2decb800, 0x2f701000) compacting perm gen total 12288K, used 8833K [0x34040000, 0x34c40000, 0x380400 00) the space 12288K, 71% used [0x34040000, 0x348e0758, 0x348e0800, 0x34c40000) ro space 10240K, 54% used [0x38040000, 0x385bb760, 0x385bb800, 0x38a40000) rw space 12288K, 55% used [0x38a40000, 0x390df798, 0x390df800, 0x39640000) Here's the top part of a second thread dump (it doesn't all fit in one screen buffer, unfortunately): JNI global references: 1236 Heap def new generation total 44992K, used 20928K [0x24040000, 0x27110000, 0x29590 000) eden space 40000K, 52% used [0x24040000, 0x254b0278, 0x26750000) from space 4992K, 0% used [0x26c30000, 0x26c30038, 0x27110000) to space 4992K, 0% used [0x26750000, 0x26750000, 0x26c30000) tenured generation total 99780K, used 74989K [0x29590000, 0x2f701000, 0x34040 000) the space 99780K, 75% used [0x29590000, 0x2decb7a8, 0x2decb800, 0x2f701000) compacting perm gen total 12288K, used 8833K [0x34040000, 0x34c40000, 0x380400 00) the space 12288K, 71% used [0x34040000, 0x348e0758, 0x348e0800, 0x34c40000) ro space 10240K, 54% used [0x38040000, 0x385bb760, 0x385bb800, 0x38a40000) rw space 12288K, 55% used [0x38a40000, 0x390df798, 0x390df800, 0x39640000) 2010-07-27 11:55:19 Full thread dump Java HotSpot(TM) Client VM (16.2-b04 mixed mode, sharing): "Lucene Merge Thread #0" daemon prio=6 tid=0x02ffc400 nid=0xba0 runnable [0x042e f000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileChannelImpl.size0(Native Method) at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:309) - locked <0x2de616d8> (a java.lang.Object) at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:597) at org.apache.lucene.store.FSDirectory.copy(FSDirectory.java:476) at org.apache.lucene.store.FSDirectory$FSIndexOutput.copyBytes(FSDirecto ry.java:513) at org.apache.lucene.index.FieldsWriter.addRawDocuments(FieldsWriter.jav a:210) at org.apache.lucene.index.SegmentMerger.copyFieldsWithDeletions(Segment Merger.java:392) at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java: 329) at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153) at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4024 ) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3618) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMe rgeScheduler.java:339) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(Conc urrentMergeScheduler.java:407) "DestroyJavaVM" prio=6 tid=0x003b6800 nid=0x1e34 waiting on condition [0x0000000 0] java.lang.Thread.State: RUNNABLE "Timer-2" daemon prio=6 tid=0x0363b400 nid=0x1e2c in Object.wait() [0x03a9f000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x297703f0> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(Timer.java:509) - locked <0x297703f0> (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:462) "pool-1-thread-1" prio=6 tid=0x02cb1400 nid=0x1760 waiting on condition [0x03a3f 000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x296acc88> (a java.util.concurrent.locks.Abstra ctQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject .await(AbstractQueuedSynchronizer.java:1925) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.jav a:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.ja va:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor .java:907) at java.lang.Thread.run(Thread.java:619) "Timer-1" daemon prio=6 tid=0x036de000 nid=0x1130 in Object.wait() [0x039df000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x29642f00> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(Timer.java:509) - locked <0x29642f00> (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:462) "25615...@qtp-20051738-9 - Acceptor0 socketconnec...@0.0.0.0:8983" prio=6 tid=0x 03076800 nid=0x19ac runnable [0x034df000] java.lang.Thread.State: RUNNABLE at java.net.PlainSocketImpl.socketAccept(Native Method) at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390) - locked <0x29770448> (a java.net.SocksSocketImpl) at java.net.ServerSocket.implAccept(ServerSocket.java:453) at java.net.ServerSocket.accept(ServerSocket.java:421) at org.mortbay.jetty.bio.SocketConnector.accept(SocketConnector.java:99) at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.ja va:707) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j ava:582) "11245...@qtp-20051738-8" prio=6 tid=0x03075000 nid=0x10b4 waiting on condition [0x0348e000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x2970bad8> (a java.util.concurrent.locks.Reentr antReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInt errupt(AbstractQueuedSynchronizer.java:747) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared (AbstractQueuedSynchronizer.java:877) ... Karl -----Original Message----- From: Wright Karl (Nokia-MS/Cambridge) Sent: Tuesday, July 27, 2010 9:25 AM To: dev@lucene.apache.org Subject: busywait hang using extracting update handler on trunk Hi, While using extracting update request handler on trunk to index some 20 million records, about 1/3 theway through solr just hung, using 100% cpu. the same post code had been successfully used with the last release solr version, so this may be a regrssion. If it happens again I cantry to get a thread dump. Any thoughts? Karl