[
https://issues.apache.org/jira/browse/SOLR-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man resolved SOLR-4161.
----------------------------
Resolution: Fixed
> deadlock if commit+newSearcher occurs during core close, can happen as a
> result of snappuller (occured in TestReplicationHandler)
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-4161
> URL: https://issues.apache.org/jira/browse/SOLR-4161
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
> Attachments: dump2.txt, dump3.txt, dump4.txt, dump5.txt
>
>
> There appears to be a lock related bug in the
> DefaultSolrCoreState/DirectUpdateHandler2 interactions. It appears that if
> CoreContainer is shutting down the core at the same time that some other
> thread attempts to do a commit which triggers a newSearcher, then
> DefaultSolrCoreState.closeIndexWriter and DefaultSolrCoreState.getIndexWriter
> get into deadlock.
> This has been observed in TestReplicationHandler, but doesn't appear to be
> related to any bugs in thta testcase, so it seems like it could easily affect
> real life users as well.
> Summary of the deadlock stacks, see attachments for full details...
> {noformat}
> Found one Java-level deadlock:
> =============================
> "snapPuller-422-thread-1":
> waiting to lock monitor 0x00007f5a2011a9e0 (object 0x00000000f5f485a0, a
> org.apache.solr.update.DefaultSolrCoreState),
> which is held by "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]"
> "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]":
> waiting for ownable synchronizer 0x00000000f60fe5c8, (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync),
> which is held by "snapPuller-422-thread-1"
> Java stack information for the threads listed above:
> ===================================================
> "snapPuller-422-thread-1":
> at
> org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:77)
> - waiting to lock <0x00000000f5f485a0> (a
> org.apache.solr.update.DefaultSolrCoreState)
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1358)
> at
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:561)
> - locked <0x00000000f5f485d0> (a java.lang.Object)
> at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:655)
> at
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:454)
> at
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:273)
> at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:222)
> ...
> "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]":
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000f60fe5c8> (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:871)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1201)
> at
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> at
> org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:668)
> at
> org.apache.solr.update.DefaultSolrCoreState.closeIndexWriter(DefaultSolrCoreState.java:64)
> - locked <0x00000000f5f485a0> (a
> org.apache.solr.update.DefaultSolrCoreState)
> at
> org.apache.solr.update.DefaultSolrCoreState.close(DefaultSolrCoreState.java:259)
> - locked <0x00000000f5f485a0> (a
> org.apache.solr.update.DefaultSolrCoreState)
> at org.apache.solr.core.SolrCore.decrefSolrCoreState(SolrCore.java:879)
> - locked <0x00000000f5f485a0> (a
> org.apache.solr.update.DefaultSolrCoreState)
> at org.apache.solr.core.SolrCore.close(SolrCore.java:971)
> at org.apache.solr.core.CoreContainer.shutdown(CoreContainer.java:723)
> {noformat}
> Original Report...
> {quote}
> while testing out another patch i noticed "stalled" heartbeat messages
> getting logged by TestReplicationHandler.test and started taking some stack
> traces to see if it was in the code i was working on.
> it's not, so i suspect it's unrelated to the changes i'm looking at, but i
> did notice that there was a full on deadlock reported, so i wanted to make
> sure it got tracked.
> {quote}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]