[ https://issues.apache.org/jira/browse/SOLR-9470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477568#comment-15477568 ]
Michael Braun commented on SOLR-9470:
-------------------------------------
I dug further into this, and only two threads are actually part of the core deadlock.
The first is:
"recoveryExecutor-3-thread-1-processing-n:x.x.x.166:8983_solr x:mycollection_shard1_replica2 s:shard1 c:mycollection r:core_node97":
{code}
- parking to wait for <0x00007fc1b0a97250> (a java.util.concurrent.locks.ReentrantLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1804)
at org.apache.solr.handler.IndexFetcher.openNewSearcherAndUpdateCommitPoint(IndexFetcher.java:746)
at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:523)
at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:254)
at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
{code}
This thread first acquires the iwLock (0x00007fc1b0a96fe0) through this call path:
org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:210)
org.apache.solr.update.DirectUpdateHandler2.newIndexWriter(DirectUpdateHandler2.java:698)
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:520)
Then, as the stack trace above shows, it waits on the openSearcherLock
(0x00007fc1b0a97250), which is held by the thread below:
"qtp1879034789-189":
{code}
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007fc1b0a96fe0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
at org.apache.solr.update.DefaultSolrCoreState.lock(DefaultSolrCoreState.java:159)
at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:104)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1601)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1806)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1552)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1487)
at org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:115)
at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:130)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)
{code}
This thread is already holding the openSearcherLock (0x00007fc1b0a97250) and wants the
iwLock, so each of the two threads holds the lock the other is waiting for.
It acquires the openSearcherLock at this frame:
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1804)
which is where the openSearcherLock is actually locked. That frame is called by:
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1552)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1487)
at org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:115)
at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:130)
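To make the lock-order inversion described above easier to see, here is a minimal standalone sketch. It is not Solr code: the class, method, and thread names are hypothetical, the two locks are only stand-ins for iwLock and openSearcherLock, and it assumes the recovery thread holds the write side of iwLock. It uses timed tryLock calls instead of blocking lock() calls so that it terminates rather than hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockOrderInversionSketch {

    /**
     * Locks `first`, waits until the other thread also holds its first lock,
     * then tries `second` with a timeout. Keeps `first` until both threads
     * have finished trying. Returns true if `second` was acquired.
     */
    static boolean holdThenTry(Lock first, Lock second, CountDownLatch bothHolding,
                               CountDownLatch bothTried, long timeoutMillis) {
        first.lock();
        try {
            bothHolding.countDown();
            bothHolding.await();
            boolean got = second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    /** Returns {recoveryLikeGotSecondLock, queryLikeGotSecondLock}. */
    static boolean[] simulate(long timeoutMillis) throws InterruptedException {
        // Stand-ins for the two locks named in the comment (assumed shapes).
        ReentrantReadWriteLock iwLock = new ReentrantReadWriteLock();
        ReentrantLock openSearcherLock = new ReentrantLock(true); // fair, as in the trace
        CountDownLatch bothHolding = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] got = new boolean[2];

        // Order 1: iwLock (write side, assumed), then openSearcherLock.
        Thread recoveryLike = new Thread(() -> got[0] = holdThenTry(
                iwLock.writeLock(), openSearcherLock, bothHolding, bothTried, timeoutMillis),
                "recovery-like");
        // Order 2: openSearcherLock, then iwLock (read side).
        Thread queryLike = new Thread(() -> got[1] = holdThenTry(
                openSearcherLock, iwLock.readLock(), bothHolding, bothTried, timeoutMillis),
                "query-like");

        recoveryLike.start();
        queryLike.start();
        recoveryLike.join();
        queryLike.join();
        return got;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] got = simulate(200);
        System.out.println("recovery-like thread got openSearcherLock: " + got[0]);
        System.out.println("query-like thread got iwLock read lock: " + got[1]);
    }
}
```

In this sketch neither thread obtains its second lock while the other still holds its first one. With an untimed lock() in place of tryLock, or a tryLock retried in a loop, the same ordering would leave both threads parked indefinitely, which matches the shape of the two stack traces.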
> Deadlocked threads in recovery
> ------------------------------
>
> Key: SOLR-9470
> URL: https://issues.apache.org/jira/browse/SOLR-9470
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 6.2
> Reporter: Michael Braun
> Attachments: solr-deadlock.txt
>
>
> Background: Booted up a cluster and replicas were in recovery. All replicas
> recovered except one, which was hanging on HTTP requests. Issued a shutdown and
> Solr would not shut down. Examined with jstack and found that a deadlock had
> occurred. The relevant thread information is attached. Some information
> (some custom URPs, IPs) has been redacted from the stack traces.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)