Jaydeepkumar Chovatia created CASSANDRA-18555:
-------------------------------------------------
Summary: A new nodetool/JMX command that tells whether node's
decommission failed or not
Key: CASSANDRA-18555
URL: https://issues.apache.org/jira/browse/CASSANDRA-18555
Project: Cassandra
Issue Type: Task
Components: Observability/JMX
Reporter: Jaydeepkumar Chovatia
Currently, if a failure occurs while a node is being decommissioned, an
exception is thrown back to the caller.
However, a Cassandra decommission can take considerable time, anywhere from
minutes to hours or even days. There are various scenarios in which the caller
may need to probe the status again:
* The caller times out
* The caller cannot be kept waiting for such a long time
If the caller does not know what happened internally, it cannot decide whether
to retry, which leads to other issues.
So, in this ticket, I am going to add a new nodetool/JMX command that the
caller can invoke at any time and that returns the correct decommission status.
It might look like a small change, but when operating Cassandra at scale across
a large fleet, this gap becomes a bottleneck and requires constant operator
intervention.
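
To illustrate the idea, here is a minimal sketch of how the node could remember
the outcome of a decommission and how a caller could act on a later probe. It
is not Cassandra code: the status values, the DecommissionTracker class, and
the next_action helper are all hypothetical names chosen for this example, and
the actual command may expose the status differently.

```python
from enum import Enum
from typing import Callable, Optional


class DecommissionStatus(Enum):
    # Hypothetical status values; the ticket does not define them.
    NOT_STARTED = "NOT_STARTED"
    IN_PROGRESS = "IN_PROGRESS"
    FAILED = "FAILED"
    COMPLETED = "COMPLETED"


class DecommissionTracker:
    """Node-side sketch: remembers the outcome of the last decommission attempt."""

    def __init__(self) -> None:
        self.status = DecommissionStatus.NOT_STARTED
        self.failure_reason: Optional[str] = None

    def run(self, decommission: Callable[[], None]) -> None:
        """Run the decommission callable and record how it ended."""
        self.status = DecommissionStatus.IN_PROGRESS
        self.failure_reason = None
        try:
            decommission()
        except Exception as exc:
            # Record the failure so it can still be queried after the
            # original caller has timed out, then re-raise as before.
            self.status = DecommissionStatus.FAILED
            self.failure_reason = str(exc)
            raise
        self.status = DecommissionStatus.COMPLETED


def next_action(status: DecommissionStatus) -> str:
    """Caller-side sketch: map a probed status to what automation does next."""
    if status is DecommissionStatus.FAILED:
        return "retry"
    if status is DecommissionStatus.IN_PROGRESS:
        return "wait"
    if status is DecommissionStatus.COMPLETED:
        return "done"
    return "start"
```

With something like this in place, a caller that timed out can probe the
status later and choose between waiting and retrying without operator
intervention.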