[ https://issues.apache.org/jira/browse/SOLR-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529094#comment-13529094 ]
Mark Miller commented on SOLR-4161: ----------------------------------- Okay, sorry - the commit above solved a different issue. This issue still exists. > deadlock if commit+newSearcher occurs during core close, can happen as a > result of snappuller (occured in TestReplicationHandler) > --------------------------------------------------------------------------------------------------------------------------------- > > Key: SOLR-4161 > URL: https://issues.apache.org/jira/browse/SOLR-4161 > Project: Solr > Issue Type: Bug > Reporter: Hoss Man > Attachments: dump2.txt, dump3.txt, dump4.txt, dump5.txt > > > There appears to be a lock related bug in the > DefaultSolrCoreState/DirectUpdateHandler2 interactions. It appears that if > CoreContainer is shutting down the core at the same time that some other > thread attempts to do a commit which triggers a newSearcher, then > DefaultSolrCoreState.closeIndexWriter and DefaultSolrCoreState.getIndexWriter > get into deadlock. > This has been observed in TestReplicationHandler, but doesn't appear to be > related to any bugs in thta testcase, so it seems like it could easily affect > real life users as well. > Summary of the deadlock stacks, see attachments for full details... > {noformat} > Found one Java-level deadlock: > ============================= > "snapPuller-422-thread-1": > waiting to lock monitor 0x00007f5a2011a9e0 (object 0x00000000f5f485a0, a > org.apache.solr.update.DefaultSolrCoreState), > which is held by "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]" > "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]": > waiting for ownable synchronizer 0x00000000f60fe5c8, (a > java.util.concurrent.locks.ReentrantLock$NonfairSync), > which is held by "snapPuller-422-thread-1" > Java stack information for the threads listed above: > =================================================== > "snapPuller-422-thread-1": > at > org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:77) > - waiting to lock <0x00000000f5f485a0> (a > org.apache.solr.update.DefaultSolrCoreState) > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1358) > at > org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:561) > - locked <0x00000000f5f485d0> (a java.lang.Object) > at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:655) > at > org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:454) > at > org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:273) > at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:222) > ... > "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]": > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00000000f60fe5c8> (a > java.util.concurrent.locks.ReentrantLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:871) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1201) > at > java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214) > at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290) > at > org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:668) > at > org.apache.solr.update.DefaultSolrCoreState.closeIndexWriter(DefaultSolrCoreState.java:64) > - locked <0x00000000f5f485a0> (a > org.apache.solr.update.DefaultSolrCoreState) > at > org.apache.solr.update.DefaultSolrCoreState.close(DefaultSolrCoreState.java:259) > - locked <0x00000000f5f485a0> (a > org.apache.solr.update.DefaultSolrCoreState) > at org.apache.solr.core.SolrCore.decrefSolrCoreState(SolrCore.java:879) > - locked <0x00000000f5f485a0> (a > org.apache.solr.update.DefaultSolrCoreState) > at org.apache.solr.core.SolrCore.close(SolrCore.java:971) > at org.apache.solr.core.CoreContainer.shutdown(CoreContainer.java:723) > {noformat} > Original Report... > {quote} > while testing out another patch i noticed "stalled" heartbeat messages > getting logged by TestReplicationHandler.test and started taking some stack > traces to see if it was in the code i was working on. > it's not, so i suspect it's unrelated to the changes i'm looking at, but i > did notice that there was a full on deadlock reported, so i wanted to make > sure it got tracked. > {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org