[ https://issues.apache.org/jira/browse/SOLR-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529094#comment-13529094 ]

Mark Miller commented on SOLR-4161:
-----------------------------------

Okay, sorry - the commit above solved a different issue. This issue still 
exists.
                
> deadlock if commit+newSearcher occurs during core close, can happen as a 
> result of snappuller (occurred in TestReplicationHandler)
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4161
>                 URL: https://issues.apache.org/jira/browse/SOLR-4161
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>         Attachments: dump2.txt, dump3.txt, dump4.txt, dump5.txt
>
>
> There appears to be a lock-related bug in the 
> DefaultSolrCoreState/DirectUpdateHandler2 interactions. It appears that if 
> CoreContainer is shutting down the core at the same time that some other 
> thread attempts to do a commit which triggers a newSearcher, then 
> DefaultSolrCoreState.closeIndexWriter and DefaultSolrCoreState.getIndexWriter 
> get into deadlock.
> This has been observed in TestReplicationHandler, but doesn't appear to be 
> related to any bugs in that testcase, so it seems like it could easily affect 
> real-life users as well.
> Summary of the deadlock stacks, see attachments for full details...
> {noformat}
> Found one Java-level deadlock:
> =============================
> "snapPuller-422-thread-1":
>   waiting to lock monitor 0x00007f5a2011a9e0 (object 0x00000000f5f485a0, a org.apache.solr.update.DefaultSolrCoreState),
>   which is held by "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]"
> "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]":
>   waiting for ownable synchronizer 0x00000000f60fe5c8, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
>   which is held by "snapPuller-422-thread-1"
> Java stack information for the threads listed above:
> ===================================================
> "snapPuller-422-thread-1":
>       at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:77)
>       - waiting to lock <0x00000000f5f485a0> (a org.apache.solr.update.DefaultSolrCoreState)
>       at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1358)
>       at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:561)
>       - locked <0x00000000f5f485d0> (a java.lang.Object)
>       at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:655)
>       at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:454)
>       at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:273)
>       at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:222)
> ...
> "TEST-TestReplicationHandler.test-seed#[1B46F52130C14E03]":
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000000f60fe5c8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:871)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1201)
>       at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
>       at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
>       at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:668)
>       at org.apache.solr.update.DefaultSolrCoreState.closeIndexWriter(DefaultSolrCoreState.java:64)
>       - locked <0x00000000f5f485a0> (a org.apache.solr.update.DefaultSolrCoreState)
>       at org.apache.solr.update.DefaultSolrCoreState.close(DefaultSolrCoreState.java:259)
>       - locked <0x00000000f5f485a0> (a org.apache.solr.update.DefaultSolrCoreState)
>       at org.apache.solr.core.SolrCore.decrefSolrCoreState(SolrCore.java:879)
>       - locked <0x00000000f5f485a0> (a org.apache.solr.update.DefaultSolrCoreState)
>       at org.apache.solr.core.SolrCore.close(SolrCore.java:971)
>       at org.apache.solr.core.CoreContainer.shutdown(CoreContainer.java:723)
> {noformat}
> Original Report...
> {quote}
> while testing out another patch i noticed "stalled" heartbeat messages 
> getting logged by TestReplicationHandler.test and started taking some stack 
> traces to see if it was in the code i was working on.
> it's not, so i suspect it's unrelated to the changes i'm looking at, but i 
> did notice that there was a full on deadlock reported, so i wanted to make 
> sure it got tracked.
> {quote}
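
For anyone reading the stacks above, the pattern looks like a classic lock-order inversion: the closing thread takes the DefaultSolrCoreState monitor and then waits for a ReentrantLock, while the committing thread holds that ReentrantLock and then waits for the monitor. The following is a minimal standalone Java sketch of that pattern, not Solr code. The class name, `coreStateMonitor`, `commitLock`, the thread names and the 200 ms timeout are all hypothetical, and the closer uses `tryLock` with a timeout so the demo reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {

    /** Returns true if the "closer" could not get the commit lock while holding the monitor. */
    public static boolean closerTimedOut() throws InterruptedException {
        // Hypothetical stand-ins: a plain monitor for the core state object,
        // and a ReentrantLock for the lock taken during commit.
        final Object coreStateMonitor = new Object();
        final ReentrantLock commitLock = new ReentrantLock();
        // Ensures each thread holds its first lock before trying the second.
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        final AtomicBoolean timedOut = new AtomicBoolean(false);

        // Order: commitLock, then monitor (like commit -> getIndexWriter).
        Thread committer = new Thread(() -> {
            commitLock.lock();
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                synchronized (coreStateMonitor) {
                    // A new searcher would be opened here.
                }
            } finally {
                commitLock.unlock();
            }
        }, "committer");

        // Order: monitor, then commitLock (like close -> closeWriter).
        Thread closer = new Thread(() -> {
            synchronized (coreStateMonitor) {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                try {
                    // A plain lock() here would deadlock; the timeout lets the demo finish.
                    if (commitLock.tryLock(200, TimeUnit.MILLISECONDS)) {
                        commitLock.unlock();
                    } else {
                        timedOut.set(true);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "closer");

        committer.start();
        closer.start();
        committer.join();
        closer.join();
        return timedOut.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("closer timed out waiting for commit lock: " + closerTimedOut());
    }
}
```

In this sketch, taking the two locks in the same order on both paths, or not holding the monitor while waiting for the lock, would remove the cycle. Whether either is the right fix for Solr is a separate question.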
