[
https://issues.apache.org/jira/browse/SOLR-10914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shalin Shekhar Mangar resolved SOLR-10914.
------------------------------------------
Resolution: Fixed
Fix Version/s: 6.7
> RecoveryStrategy's sendPrepRecoveryCmd can get stuck for 5 minutes if leader
> is unloaded
> ----------------------------------------------------------------------------------------
>
> Key: SOLR-10914
> URL: https://issues.apache.org/jira/browse/SOLR-10914
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 6.4, 6.5, 6.6
> Reporter: Shalin Shekhar Mangar
> Assignee: Shalin Shekhar Mangar
> Fix For: master (7.0), 6.7
>
> Attachments: SOLR-10914.patch, SOLR-10914.patch, SOLR-10914.patch
>
>
> tl;dr: a recovering replica is stuck for 5 minutes in the prep recovery
> request if the leader core is unloaded before the prep recovery request is
> made.
> SOLR-9716 changed sendPrepRecoveryCmd to retry on read timeouts (earlier
> it had no connection/read timeout at all), but the fix has caused another
> problem. Consider this sequence:
> # A replica starts up (or is newly created) and goes into recovery,
> # Replica finds that leader=X
> # The core X is unloaded but the node that used to host X is still running
> and taking requests
> # Replica calls sendPrepRecoveryCmd to X
> At this point, the node that used to host X receives the prep recovery
> command, finds that core X does not exist, and keeps checking again in a
> sleep-loop until a timeout happens. I am not sure why the prep recovery core
> admin command needs to continue waiting if the local core does not exist.
> The default timeout here is usually longer than 10 seconds.
> On the recovering replica's side, the prep recovery request has a
> connection/read timeout of only 10s, so the request always times out and is
> retried for up to 5 minutes. Only then does the recovery attempt fail, after
> which it may be restarted with the right leader URL.
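
The timing interaction described above can be sketched as a small simulation
in virtual time (nothing actually sleeps). This is only an illustration, not
Solr code: the class and method names are hypothetical, the 30s leader-side
wait is an assumed value (the description only says it is usually longer than
10 seconds), and the "fail fast" case is one possible behaviour shown for
comparison, not necessarily what the attached patch does.

```java
public class PrepRecoverySketch {
    // Values taken from the issue description.
    public static final long READ_TIMEOUT_MS = 10_000L;           // replica-side read timeout
    public static final long RETRY_BUDGET_MS = 5L * 60L * 1000L;  // replica retries for up to 5 minutes

    // Assumption: stands in for "usually longer than 10 seconds".
    public static final long LEADER_WAIT_MS = 30_000L;

    /** Result of the simulated retry loop on the recovering replica. */
    public static final class Outcome {
        public final int attempts;
        public final long elapsedMs;

        Outcome(int attempts, long elapsedMs) {
            this.attempts = attempts;
            this.elapsedMs = elapsedMs;
        }
    }

    /**
     * Simulates the replica retrying the prep recovery request.
     *
     * @param nodeResponseMs how long the node that used to host the leader
     *                       core takes to answer when the core is missing
     */
    public static Outcome simulate(long nodeResponseMs) {
        long elapsed = 0;
        int attempts = 0;
        while (elapsed < RETRY_BUDGET_MS) {
            attempts++;
            if (nodeResponseMs < READ_TIMEOUT_MS) {
                // The node answered (with an error) before the read timeout,
                // so the replica can give up and look up the leader again.
                elapsed += nodeResponseMs;
                return new Outcome(attempts, elapsed);
            }
            // The read timeout fires first; the replica retries the same URL.
            elapsed += READ_TIMEOUT_MS;
        }
        return new Outcome(attempts, elapsed);
    }

    public static void main(String[] args) {
        Outcome stuck = simulate(LEADER_WAIT_MS); // node waits in its sleep-loop
        Outcome failFast = simulate(50L);         // hypothetical immediate error
        System.out.println("waiting node:   " + stuck.attempts + " attempts, "
                + stuck.elapsedMs / 1000 + "s");
        System.out.println("fail-fast node: " + failFast.attempts + " attempt, "
                + failFast.elapsedMs + "ms");
    }
}
```

With these numbers the waiting case burns the whole 5 minute budget in 30
timed-out attempts, while a node that reports the missing core right away
lets the replica fail on the first attempt and retry against the current
leader.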