Megan Carey created SOLR-15386:
----------------------------------
Summary: Internal DOWNNODE request will mark replicas down even if
their host node is now live
Key: SOLR-15386
URL: https://issues.apache.org/jira/browse/SOLR-15386
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: SolrCloud
Affects Versions: 8.6
Reporter: Megan Carey
When a node is shutting down, it calls into:
# [CoreContainer.shutdown()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L1026]
# [ZkController.preClose()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L612]
# [ZkController.publishNodeAsDown|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L2753]
This sends a request to the Overseer to mark all replicas hosted on the
soon-to-be-down node as DOWN. The Overseer handles the request in:
# [Overseer.processMessage()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/Overseer.java#L459]
# [NodeMutator.downNode()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/overseer/NodeMutator.java#L48]
The issue we encountered was as follows:
# The Solr node shuts down.
# A DOWNNODE message is enqueued for the Overseer.
# The Solr node comes back up (we run on K8s, so a new node is auto-started as
soon as the old node is detected as down).
# The DOWNNODE message is dequeued for processing and marks all replicas DOWN
for the node that is now live.
The only place where these replicas would later be marked ACTIVE again is after
ShardLeaderElection, but we did not reach that case. An easy fix is to add a
check for node liveness prior to marking replicas down, but a lot of tests fail
with this change. Is marking replicas DOWN regardless of node liveness the
intended behavior?
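
To make the sequence above and the proposed liveness check concrete, here is a minimal, self-contained sketch. It is a toy model, not Solr code: the class DownNodeRaceSketch, its fields, the DownNodeMessage type, and the node and replica names are all hypothetical and only mimic the roles of the Overseer queue, the live-nodes set, and NodeMutator.downNode().

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Toy model of the DOWNNODE race. None of these names are Solr APIs. */
public class DownNodeRaceSketch {

    enum State { ACTIVE, DOWN }

    /** Hypothetical stand-in for a queued DOWNNODE message. */
    static final class DownNodeMessage {
        final String nodeName;
        DownNodeMessage(String nodeName) { this.nodeName = nodeName; }
    }

    final Map<String, String> replicaToNode = new HashMap<>(); // replica -> hosting node
    final Map<String, State> replicaState = new HashMap<>();   // replica -> published state
    final Set<String> liveNodes = new HashSet<>();             // assumed view of live nodes
    final Queue<DownNodeMessage> queue = new ArrayDeque<>();   // assumed Overseer work queue

    void addReplica(String replica, String node) {
        replicaToNode.put(replica, node);
        replicaState.put(replica, State.ACTIVE);
    }

    /**
     * Processes the next queued message and returns how many replicas were
     * marked DOWN. With checkLiveness set, a message for a node that is
     * currently live is treated as stale and skipped.
     */
    int processNext(boolean checkLiveness) {
        DownNodeMessage msg = queue.poll();
        if (msg == null) {
            return 0;
        }
        if (checkLiveness && liveNodes.contains(msg.nodeName)) {
            return 0; // node is live again, so leave its replicas untouched
        }
        int changed = 0;
        for (Map.Entry<String, String> e : replicaToNode.entrySet()) {
            if (e.getValue().equals(msg.nodeName)) {
                replicaState.put(e.getKey(), State.DOWN);
                changed++;
            }
        }
        return changed;
    }

    /** Replays the four steps from the report and returns the replica's final state. */
    static State runScenario(boolean checkLiveness) {
        DownNodeRaceSketch s = new DownNodeRaceSketch();
        s.liveNodes.add("node1");
        s.addReplica("core_r1", "node1");

        s.liveNodes.remove("node1");               // 1. node shuts down
        s.queue.add(new DownNodeMessage("node1")); // 2. DOWNNODE is enqueued
        s.liveNodes.add("node1");                  // 3. node comes back up
        s.processNext(checkLiveness);              // 4. DOWNNODE is dequeued late

        return s.replicaState.get("core_r1");
    }

    public static void main(String[] args) {
        System.out.println("without check: " + runScenario(false)); // prints "without check: DOWN"
        System.out.println("with check: " + runScenario(true));     // prints "with check: ACTIVE"
    }
}
```

In this model the guard only helps when the node is already listed as live at the moment the message is dequeued. It would also skip the update in any flow where the departing node is still listed as live when its own DOWNNODE message is processed, so a real fix may need a stronger signal than set membership.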