Stephen O'Donnell created HDDS-5274:
---------------------------------------
Summary: Revert HDDS-5153
Key: HDDS-5274
URL: https://issues.apache.org/jira/browse/HDDS-5274
Project: Apache Ozone
Issue Type: Bug
Components: SCM
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell
After some discussion with [~pifta] and [~swagle] we believe that the change in
HDDS-5153 should be reverted.
If a DN starts decommissioning or entering maintenance but goes dead before it
completes the process, the decommission monitor moves the node back to a state
of IN_SERVICE and DEAD when it notices the node has become dead. This is
because decommission should gracefully remove the node, but if it goes dead
first, we may not be able to replicate its containers. In this case
decommission effectively fails.
In HDDS-5153, we decided that if a node is already dead and you decommission
it, it should immediately move to DECOMMISSIONED. However that is not really
consistent with the above behaviour.
Also, there is no real value in decommissioning a dead node - it does not do
anything except adjust its state in SCM.
To keep things consistent, I propose we revert HDDS-5153 so that starting
decommission on a dead node works the same way as when a node goes dead part
way through decommission. In both cases the node will end up as IN_SERVICE +
DEAD.
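
The intended behaviour after the revert can be sketched as a small state model.
This is only an illustration of the transitions described above, not the SCM
code: the names OpState, Health, Node, monitor_check and start_decommission are
hypothetical, and the real decommission monitor does considerably more work
(for example, tracking container replication).

```python
from dataclasses import dataclass
from enum import Enum


class OpState(Enum):
    # Simplified, hypothetical subset of operational states.
    IN_SERVICE = "IN_SERVICE"
    DECOMMISSIONING = "DECOMMISSIONING"
    DECOMMISSIONED = "DECOMMISSIONED"


class Health(Enum):
    # Simplified, hypothetical subset of health states.
    HEALTHY = "HEALTHY"
    DEAD = "DEAD"


@dataclass
class Node:
    op_state: OpState = OpState.IN_SERVICE
    health: Health = Health.HEALTHY


def monitor_check(node: Node) -> None:
    """One pass of a stand-in decommission monitor.

    A node that is DEAD while still DECOMMISSIONING cannot be drained
    gracefully, so the attempt is abandoned and the node is put back
    to IN_SERVICE (it stays DEAD).
    """
    if node.op_state is OpState.DECOMMISSIONING and node.health is Health.DEAD:
        node.op_state = OpState.IN_SERVICE


def start_decommission(node: Node) -> None:
    """Begin decommission with no special case for an already dead node.

    The node enters DECOMMISSIONING like any other; the monitor then
    handles a dead node the same way in both scenarios.
    """
    node.op_state = OpState.DECOMMISSIONING
    monitor_check(node)


# Case 1: the node goes dead part way through decommission.
mid_way = Node()
start_decommission(mid_way)
mid_way.health = Health.DEAD
monitor_check(mid_way)

# Case 2: the node is already dead when decommission starts.
already_dead = Node(health=Health.DEAD)
start_decommission(already_dead)
```

In this sketch both nodes finish as IN_SERVICE + DEAD, and neither reaches
DECOMMISSIONED, which is the consistency the revert is meant to restore.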