[
https://issues.apache.org/jira/browse/HDDS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephen O'Donnell resolved HDDS-5274.
-------------------------------------
Fix Version/s: 1.2.0
Resolution: Fixed
> Revert HDDS-5153
> ----------------
>
> Key: HDDS-5274
> URL: https://issues.apache.org/jira/browse/HDDS-5274
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.2.0
>
>
> After some discussion with [~pifta] and [~swagle] we believe that the change
> in HDDS-5153 should be reverted.
> If a DN starts decommissioning or maintenance, but goes dead before it
> completes the process, the decommission monitor moves the node back to a
> state of IN_SERVICE and DEAD when it notices the node has become dead. This
> is because decommission should gracefully remove the node, but if the node
> goes dead first, we may not be able to replicate its containers. In this case
> decommission effectively fails.
> In HDDS-5153, we decided that if a node is already dead when you decommission
> it, it should immediately move to DECOMMISSIONED. However, that is not really
> consistent with the above behaviour.
> Also, there is no real value in decommissioning a dead node - it does not do
> anything except adjust its state in SCM.
> To keep things consistent, I propose we revert HDDS-5153 so that starting
> decommission on a dead node works the same way as when a node goes dead
> partway through decommission. In both cases the node will end up as
> IN_SERVICE + DEAD.
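
For illustration only, the behaviour described above can be sketched as a
small state model. This is not the SCM code: the class, enum, and method names
below (DecommissionSketch, OpState, Health, startDecommission, monitorPass) are
hypothetical, and the DECOMMISSIONING and HEALTHY states are assumed for the
sake of the example. It only shows that, after the revert, both paths end up
as IN_SERVICE + DEAD.

```java
// Hypothetical, simplified model of the behaviour described in this issue.
// None of these names are taken from the Ozone code base.
final class DecommissionSketch {

    // Assumed operational states; only IN_SERVICE and DECOMMISSIONED
    // are named in the issue text.
    enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

    // Assumed health states; only DEAD is named in the issue text.
    enum Health { HEALTHY, DEAD }

    static final class NodeStatus {
        final OpState op;
        final Health health;

        NodeStatus(OpState op, Health health) {
            this.op = op;
            this.health = health;
        }

        NodeStatus withHealth(Health h) {
            return new NodeStatus(op, h);
        }

        @Override
        public String toString() {
            return op + " + " + health;
        }
    }

    // An admin starts decommission. In this sketch the request is accepted
    // regardless of health; a dead node is not moved to DECOMMISSIONED.
    static NodeStatus startDecommission(NodeStatus n) {
        return new NodeStatus(OpState.DECOMMISSIONING, n.health);
    }

    // One pass of a (hypothetical) decommission monitor: a node that is
    // dead while decommissioning is moved back to IN_SERVICE.
    static NodeStatus monitorPass(NodeStatus n) {
        if (n.op == OpState.DECOMMISSIONING && n.health == Health.DEAD) {
            return new NodeStatus(OpState.IN_SERVICE, Health.DEAD);
        }
        return n;
    }

    public static void main(String[] args) {
        // Case 1: node goes dead partway through decommission.
        NodeStatus a = new NodeStatus(OpState.IN_SERVICE, Health.HEALTHY);
        a = startDecommission(a).withHealth(Health.DEAD);
        a = monitorPass(a);

        // Case 2: node is already dead when decommission is started.
        NodeStatus b = new NodeStatus(OpState.IN_SERVICE, Health.DEAD);
        b = monitorPass(startDecommission(b));

        System.out.println("died during decommission: " + a);
        System.out.println("dead before decommission: " + b);
    }
}
```

In this sketch, starting decommission never moves a dead node straight to
DECOMMISSIONED; the monitor applies the same rule in both cases, which is the
consistency the revert is aiming for.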
--
This message was sent by Atlassian Jira
(v8.3.4#803005)