[ https://issues.apache.org/jira/browse/SOLR-9470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Braun updated SOLR-9470:
--------------------------------
    Attachment: solr-deadlock-2-r.txt

Reproduced the deadlock again; redacted thread dumps for the relevant threads 
are attached. I also confirmed that we see some of the same log lines reported 
in the related deadlock ticket [SOLR-9278], where the index files can't be 
deleted, as shown below:

{code}
09-22 17:24:42.317  - we started the process

09-22 17:25:43.716 org.apache.solr.handler.IndexFetcher 
(recoveryExecutor-3-thread-1-processing-n:x.x.x.75:8983_solr 
x:collection_shard1_replica1 s:shard1 c:collection) [s:shard1] IndexFetcher 
unable to cleanup unused lucene index files so we must do a full copy instead 
globalRequestId: 
09-22 17:25:43.716 org.apache.solr.handler.IndexFetcher 
(recoveryExecutor-3-thread-1-processing-n:x.x.x.75:8983_solr 
x:collection_shard1_replica1 s:shard1 c:collection) [s:shard1] IndexFetcher 
slept for 30000ms for unused lucene index files 
to be delete-able globalRequestId: 
INFO  09-22 17:25:43.864 org.apache.solr.update.DefaultSolrCoreState 
(recoveryExecutor-3-thread-1-processing-n:x.x.x.75:8983_solr 
x:collection_shard1_replica1 s:shard1 c:collection) [s:shard1] Rollback old 
IndexWriter... core=collection_shard1_replica1
 globalRequestId: 
 {code}

I'm hoping the patch in SOLR-9278 is valid; would it fix this problem?

> Deadlocked threads in recovery
> ------------------------------
>
>                 Key: SOLR-9470
>                 URL: https://issues.apache.org/jira/browse/SOLR-9470
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 6.2
>            Reporter: Michael Braun
>         Attachments: solr-deadlock-2-r.txt, solr-deadlock.txt
>
>
> Background: Booted up a cluster and the replicas were in recovery. All 
> replicas recovered except one, which was hanging on HTTP requests. Issued a 
> shutdown, but Solr would not shut down. Examined the process with jstack and 
> found that a deadlock had occurred. The relevant thread information is 
> attached. Some information (some custom URPs, IPs) has been redacted from 
> the stack traces.
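
For anyone trying to confirm a similar hang, here is a minimal sketch of the kind of deadlock check that jstack reports, done from inside a JVM with the standard java.lang.management API. This is not what was run for this ticket (the attached dumps came from jstack), and the class name DeadlockProbe and its method are hypothetical names chosen for illustration. It only inspects the JVM it runs in, so for Solr it would have to be invoked from code loaded into the Solr process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: reports deadlocked threads in the current JVM.
final class DeadlockProbe {

    /** Returns one description per deadlocked thread; the list is empty if there are none. */
    static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Finds cycles on object monitors (synchronized) and on ownable
        // synchronizers (for example java.util.concurrent locks).
        long[] ids = mx.findDeadlockedThreads();
        List<String> out = new ArrayList<>();
        if (ids == null) {
            return out; // null means no deadlock was detected
        }
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // the thread exited between the two calls
            }
            out.add(info.getThreadName()
                    + " is waiting on " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> deadlocks = describeDeadlocks();
        if (deadlocks.isEmpty()) {
            System.out.println("No deadlocked threads found.");
        } else {
            deadlocks.forEach(System.out::println);
        }
    }
}
```

Each line names a waiting thread, the lock it is blocked on, and the thread that currently owns that lock, which is the same cycle information to look for in the attached thread dumps.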



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

