[ https://issues.apache.org/jira/browse/SOLR-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Houston Putman resolved SOLR-17049.
-----------------------------------
Fix Version/s: 9.7, 9.6.1
Assignee: Houston Putman
Resolution: Fixed
> Marking replicas down at startup and waiting does not wait
> ----------------------------------------------------------
>
> Key: SOLR-17049
> URL: https://issues.apache.org/jira/browse/SOLR-17049
> Project: Solr
> Issue Type: Bug
> Affects Versions: 8.6
> Reporter: Vincent Primault
> Assignee: Houston Putman
> Priority: Major
> Fix For: 9.7, 9.6.1
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> We observed an unexpected behaviour where a node was taking traffic for a
> replica that was not ready to take it. It seems to happen when the node is
> marked as live and the replica is marked as active, while the corresponding
> core is not loaded yet on the node.
>
> I looked at the code and in theory it should not happen, since the following
> happens in {{ZkController#init}}: mark node as down, wait for replicas to
> be marked as down, and then register the node as live. However, after looking
> at the code of {{publishAndWaitForDownStates}}, I observed that we wait
> for down states for replicas associated with cores as returned by
> {{CoreContainer#getCoreDescriptors}}... which is empty at this point
> since {{ZkController#init}} is called before cores are discovered (which
> happens later in {{CoreContainer#load}}).
>
> It therefore seems to me that we basically never wait for any replicas to be
> marked as down. Instead, we continue the startup sequence by marking the node
> as live, and so _might_ take traffic for a short period of time for a replica
> that is not ready (e.g., if the node previously crashed and the replica
> stayed active).
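
The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions and not Solr's actual code: the `ClusterState` class, the string-valued states, and the simplified `publishAndWaitForDownStates(state, localCores)` signature are hypothetical stand-ins, and the sketch does not show how the issue was eventually fixed. It only shows that iterating over an empty list of core descriptors means nothing is marked down and nothing is waited for, so a replica left "active" by a crash stays active when the node goes live.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the startup ordering in the report; NOT Solr's real classes. */
class StartupOrderSketch {

    /** Hypothetical stand-in for replica states stored in the cluster state. */
    static final class ClusterState {
        final Map<String, String> replicaStates = new HashMap<>();
    }

    /**
     * Simplified, hypothetical version of the "publish down and wait" step:
     * marks every known local core as down and returns how many replicas
     * were actually processed (and would therefore be waited on).
     */
    static int publishAndWaitForDownStates(ClusterState state, List<String> localCores) {
        int waitedFor = 0;
        for (String core : localCores) {
            state.replicaStates.put(core, "down");
            waitedFor++;
        }
        return waitedFor;
    }

    /** Returns the replicas that are still marked active in the cluster state. */
    static List<String> staleActiveReplicas(ClusterState state) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, String> e : state.replicaStates.entrySet()) {
            if ("active".equals(e.getValue())) {
                stale.add(e.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        ClusterState state = new ClusterState();
        // Replica left "active" because the node crashed before marking it down.
        state.replicaStates.put("core1", "active");

        // Reported ordering: cores are not discovered yet, so the list is empty.
        int waited = publishAndWaitForDownStates(state, List.of());
        System.out.println("before discovery: waited for " + waited
                + " replica(s), still active: " + staleActiveReplicas(state));

        // Same step once the local cores are known.
        waited = publishAndWaitForDownStates(state, List.of("core1"));
        System.out.println("after discovery: waited for " + waited
                + " replica(s), still active: " + staleActiveReplicas(state));
    }
}
```

In this model the first call is a no-op, which mirrors the report's claim that the wait never covers any replica when the descriptor list is empty at that point in startup.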