[ 
https://issues.apache.org/jira/browse/HDDS-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng resolved HDDS-5090.
-----------------------------
    Fix Version/s: 1.2.0
       Resolution: Fixed

> make Decommission work under SCM HA.
> ------------------------------------
>
>                 Key: HDDS-5090
>                 URL: https://issues.apache.org/jira/browse/HDDS-5090
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Glen Geng
>            Assignee: Glen Geng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.2.0
>
>
> *The problem*
> The decommission/maintenance info is held only in the memory of SCM. If SCM 
> is restarted, it relearns this info when the Datanodes re-register.
> Only the leader SCM handles decommissionNodes(), recommissionNodes() and 
> startMaintenanceNodes() requests, and it does not replicate this info to the 
> follower SCMs. Thus, when a failover happens, the new leader SCM loses this 
> info, since it was held only in the memory of the previous leader SCM.
> *Current status*
>  If an SCM is restarted, then upon re-registration the datanode will already 
> be in DECOMMISSIONING or ENTERING_MAINTENANCE or IN_MAINTENANCE state. In 
> that case, it needs to be added back into the monitor to track its progress.
> For a registered node, the information stored in SCM is the source of truth. 
> If SCM finds that the opState or opStateExpiryEpoch reported by the Datanode 
> differs from what it holds in memory, it sends a 
> SetNodeOperationalStateCommand to update the Datanode.
> *The solution*
> leader SCM --hb--> DN --hb--> follower SCM
> 1. The leader SCM updates the PersistedOpState of the Datanode via heartbeat. 
> The Datanode then updates the OpState in the follower SCM via heartbeat.
> 2. When a follower SCM becomes leader, it calls continueAdminForNode for all 
> datanodes, so that any datanode in DECOMMISSIONING, ENTERING_MAINTENANCE or 
> IN_MAINTENANCE state is added back to the monitor.
> *Disadvantage*
> As is the case today, if the leader SCM records the info and notifies the 
> Datanode via heartbeat, but steps down before the Datanode notifies the 
> follower SCM via heartbeat, that info will be lost in the new leader SCM.
> As discussed with [~sodonnell], we can live with the rare event of a 
> decommission starting and SCM failing over before the state has made it to 
> the DNs.
>  
> For details: 
> https://docs.google.com/document/d/1N5PsUuLBGgvkYFQgDumvRZujc-9RcDwoE0SubZcLUzY/edit?usp=sharing
>  
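The heartbeat relay and the failover step described in the solution above can be modelled with a minimal Java sketch. This is an illustration under stated assumptions, not Ozone's implementation: the classes Datanode and Scm, their fields, and the method signatures are all hypothetical. Only the name continueAdminForNode is borrowed from the description, and its signature here is assumed. Writing the state directly onto the datanode stands in for sending SetNodeOperationalStateCommand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the proposal; not Ozone's real classes.
class OpStateFailoverSketch {

  enum OpState { IN_SERVICE, DECOMMISSIONING, ENTERING_MAINTENANCE, IN_MAINTENANCE }

  /** A datanode that persists the operational state it was last told to hold. */
  static class Datanode {
    final String id;
    OpState persistedOpState = OpState.IN_SERVICE;

    Datanode(String id) {
      this.id = id;
    }
  }

  /** An SCM that keeps per-node operational state in memory only. */
  static class Scm {
    final Map<String, OpState> opStates = new HashMap<>();
    final List<String> monitored = new ArrayList<>();
    boolean leader;

    /** Admin request: only the leader accepts it, and nothing is replicated. */
    void decommission(String nodeId) {
      if (!leader) {
        throw new IllegalStateException("only the leader handles admin requests");
      }
      opStates.put(nodeId, OpState.DECOMMISSIONING);
      continueAdminForNode(nodeId, OpState.DECOMMISSIONING);
    }

    /** Heartbeat handling differs by role (leader SCM --hb--> DN --hb--> follower SCM). */
    void onHeartbeat(Datanode dn) {
      if (leader) {
        // The leader's in-memory state is the source of truth; push it to the
        // datanode (stand-in for SetNodeOperationalStateCommand).
        dn.persistedOpState = opStates.getOrDefault(dn.id, OpState.IN_SERVICE);
      } else {
        // A follower learns the state from what the datanode reports.
        opStates.put(dn.id, dn.persistedOpState);
      }
    }

    /** On promotion, re-evaluate every known node. */
    void becomeLeader() {
      leader = true;
      for (Map.Entry<String, OpState> e : opStates.entrySet()) {
        continueAdminForNode(e.getKey(), e.getValue());
      }
    }

    /** Adds a node back to the monitor if it is in an admin state (assumed signature). */
    void continueAdminForNode(String nodeId, OpState state) {
      if (state != OpState.IN_SERVICE && !monitored.contains(nodeId)) {
        monitored.add(nodeId);
      }
    }
  }
}
```

In this model, a decommission that has travelled leader to datanode to follower before the failover is picked up by becomeLeader() on the new leader. A decommission that the old leader recorded but never pushed to the datanode is not, which is the rare loss window accepted under *Disadvantage*.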



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
