Stephen O'Donnell created HDDS-5274:
---------------------------------------

             Summary: Revert HDDS-5153
                 Key: HDDS-5274
                 URL: https://issues.apache.org/jira/browse/HDDS-5274
             Project: Apache Ozone
          Issue Type: Bug
          Components: SCM
            Reporter: Stephen O'Donnell
            Assignee: Stephen O'Donnell


After some discussion with [~pifta] and [~swagle] we believe that the change in 
HDDS-5153 should be reverted.

If a DN starts decommissioning or entering maintenance, but goes dead before it 
completes the process, the decommission monitor moves the node back to a state 
of IN_SERVICE and DEAD when it notices the node has become dead. This is because 
decommission should gracefully remove the node, but if it goes dead first, we 
may not be able to replicate its containers. In this case decommission 
effectively fails.

In HDDS-5153, we decided that if a node is already dead and you decommission 
it, it should immediately move to DECOMMISSIONED. However, that is not really 
consistent with the above behaviour.

Also, there is no real value in decommissioning a dead node - it does not do 
anything except adjust its state in SCM.

To keep things consistent, I propose we revert HDDS-5153 so that starting 
decommission on a dead node works the same way as when a node goes dead partway 
through decommission. In both cases the node will end up as IN_SERVICE + 
DEAD.
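
The intended behaviour after the revert can be sketched roughly as below. This 
is an illustration only: the class, enums and method names are hypothetical 
stand-ins and not the actual SCM code, and it assumes the node is IN_SERVICE 
when decommission is requested.

```java
// Illustrative sketch only. DecommissionSketch, OpState, Health and both
// methods are hypothetical names, not the real SCM classes or APIs.
final class DecommissionSketch {

  enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

  enum Health { HEALTHY, DEAD }

  // Operational state after an admin requests decommission of a node that is
  // currently IN_SERVICE (behaviour proposed in this issue).
  static OpState onStartDecommission(Health health) {
    if (health == Health.DEAD) {
      // A dead node cannot be drained gracefully, so it is not moved to
      // DECOMMISSIONED; it stays IN_SERVICE (and DEAD).
      return OpState.IN_SERVICE;
    }
    return OpState.DECOMMISSIONING;
  }

  // Operational state after the decommission monitor looks at the node again.
  static OpState onMonitorCheck(OpState current, Health health) {
    if (current == OpState.DECOMMISSIONING && health == Health.DEAD) {
      // The node died part way through: decommission effectively fails and
      // the node goes back to IN_SERVICE (and DEAD).
      return OpState.IN_SERVICE;
    }
    return current;
  }

  public static void main(String[] args) {
    // Path 1: decommission is started on a node that is already dead.
    System.out.println(onStartDecommission(Health.DEAD));

    // Path 2: a healthy node starts decommission, then goes dead.
    OpState started = onStartDecommission(Health.HEALTHY);
    System.out.println(onMonitorCheck(started, Health.DEAD));
  }
}
```

Both paths in `main` end in the same operational state, which is the 
consistency this issue is asking for.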



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
