[
https://issues.apache.org/jira/browse/SOLR-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188961#comment-16188961
]
Erick Erickson commented on SOLR-11427:
---------------------------------------
bq: So shouldn't the user know that he is deleting an active replica?
Maybe, maybe not. The parent JIRA outlines all the things that are screwed up
with how state is reported. The graph view of the cluster state shows the node
as down. The state.json znode shows replicas on a missing node as "active" if
the node was killed via, say, "kill -9". CLUSTERSTATUS reports it as "down".
Then there's "gone"...
IIRC, at one point DELETEREPLICA failed if it couldn't connect to the Solr node
that had the replica that was missing. So if you forcibly killed a Solr
instance (or pulled the plug) about the only way to clean up ZK was to
hand-edit clusterstate.json (yes, a long time ago).
onlyIfDown was put in as a safety valve when users wanted to be cautious
(perhaps when scripting) and did _not_ want to delete active replicas (through
perhaps a typo, bad scripting, whatever) but did want a way to clean up ZK.
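
For anyone scripting this, here is a minimal sketch of building a cautious
DELETEREPLICA request against the Collections API. The host, collection, shard
and replica names are made-up placeholders and the helper name is hypothetical;
only the action and the collection/shard/replica/onlyIfDown parameters are the
real API bits.

```python
from urllib.parse import urlencode


def delete_replica_url(base_url, collection, shard, replica, only_if_down=True):
    """Build a Collections API DELETEREPLICA URL (hypothetical helper)."""
    params = {
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,
    }
    if only_if_down:
        # Safety valve: ask Solr to refuse the delete if the replica is active.
        params["onlyIfDown"] = "true"
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder values, for illustration only.
url = delete_replica_url("http://localhost:8983/solr", "test", "shard1", "core_node2")
print(url)
```

The script only builds the URL; actually sending it (and handling the
"replica is active" failure) is left to the caller.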
Then there was the whole bit about how to delete a replica on a node that was
shut down when the DELETEREPLICA command was issued and then came back up
(legacyCloud mode, where the replica would recreate itself).
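
To make the question in the issue description concrete, here is a rough sketch
of the check being proposed. This is not Solr's actual code: the function name
and arguments are hypothetical, and it assumes the replica state comes from
state.json and the live node names from the live_nodes znode.

```python
def may_delete_replica(replica_state, node_name, live_nodes, only_if_down):
    """Decide whether DELETEREPLICA should proceed (sketch of the proposal)."""
    if not only_if_down:
        # Without onlyIfDown the caller has not asked for the safety check.
        return True
    if replica_state == "down":
        # Current behavior: onlyIfDown allows deleting a "down" replica.
        return True
    # Proposed addition: state.json may still say "active" after a kill -9,
    # so also allow the delete when the hosting node is not in live_nodes.
    return replica_state == "active" and node_name not in live_nodes


# Placeholder node names, for illustration only.
live = {"host1:8983_solr"}
print(may_delete_replica("active", "host2:8983_solr", live, only_if_down=True))
print(may_delete_replica("active", "host1:8983_solr", live, only_if_down=True))
```

The first call covers the killed node (stale "active" state, node missing from
live_nodes); the second is the case onlyIfDown is supposed to protect.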
> DELETEREPLICA with onlyIfDown specified should succeed if the host node is
> not present in the live_nodes Znode
> --------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-11427
> URL: https://issues.apache.org/jira/browse/SOLR-11427
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
>
> The title says it pretty much, so opening up for discussion:
> Here's the problem. Let's say a node is killed via {{kill -9}}. The
> state.json file still says it's "active", but the node is gone from
> live_nodes. If the node in question never comes back, the replica's state
> doesn't necessarily get switched to "down", so specifying onlyIfDown fails
> with a "node is active" message. This is all documented more thoroughly in
> SOLR-9361.
> The question is whether it's sufficient and/or safe to succeed in deleting
> the replica from state.json if the state is "active" _and_ the node is NOT
> present in live_nodes.
> I'm assigning to myself, but others should feel free to take it.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)