avijayanhwx opened a new pull request #2286:
URL: https://github.com/apache/ozone/pull/2286


   ## What changes were proposed in this pull request?
   Recon relies on DN heartbeats to learn a node's operational state. If the 
node goes down before it reports itself as DECOMMISSIONED, that information is 
lost on the Recon side. It is the SCM that first moves a node from 
DECOMMISSIONING to DECOMMISSIONED state; the Datanode then persists the 
state change locally and subsequently heartbeats the new state to SCM and 
Recon. If the DN is shut down before it can heartbeat the state change to 
Recon, then Recon is left with stale information.
   
   This patch adds a DeadNodeHandler for Recon that updates the node 
operational status information from SCM if needed. It also adds a step in the 
Recon SCM Sync Task that updates the node operational status from SCM for every 
dead node as seen from Recon. 
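
   The flow described above can be illustrated with a minimal, self-contained 
sketch. It is not the code in this patch: `ReconDeadNodeSketch`, `ScmClient`, 
`ReconNodeView`, `onDeadNode` and `syncDeadNodes` are hypothetical names chosen 
for illustration, and the SCM lookup is stubbed with an in-memory map rather 
than a real SCM call.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical, not Ozone APIs.
class ReconDeadNodeSketch {
  enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

  // Stand-in for whatever Recon uses to ask SCM about a node (assumption).
  interface ScmClient {
    OpState getOperationalState(String datanodeId);
  }

  // Recon's local view of node operational states.
  static class ReconNodeView {
    private final Map<String, OpState> opStates = new HashMap<>();
    private final ScmClient scm;

    ReconNodeView(ScmClient scm) {
      this.scm = scm;
    }

    // Normal path: the datanode reports its own state via heartbeat.
    void onHeartbeat(String datanodeId, OpState reported) {
      opStates.put(datanodeId, reported);
    }

    // Dead-node path: the node can no longer heartbeat, so take SCM's state.
    // Returns true if the local view was changed.
    boolean onDeadNode(String datanodeId) {
      OpState fromScm = scm.getOperationalState(datanodeId);
      if (fromScm == null || fromScm == opStates.get(datanodeId)) {
        return false; // unknown to SCM, or already up to date
      }
      opStates.put(datanodeId, fromScm);
      return true;
    }

    // Periodic sync step: refresh every node Recon currently sees as dead.
    // Returns the number of nodes whose state was updated.
    int syncDeadNodes(Collection<String> deadNodeIds) {
      int updated = 0;
      for (String id : deadNodeIds) {
        if (onDeadNode(id)) {
          updated++;
        }
      }
      return updated;
    }

    OpState get(String datanodeId) {
      return opStates.get(datanodeId);
    }
  }

  public static void main(String[] args) {
    Map<String, OpState> scmStates = new HashMap<>();
    scmStates.put("dn-1", OpState.DECOMMISSIONED); // SCM already moved the node
    ReconNodeView recon = new ReconNodeView(scmStates::get);

    recon.onHeartbeat("dn-1", OpState.DECOMMISSIONING); // last heartbeat seen
    recon.onDeadNode("dn-1"); // node died before reporting DECOMMISSIONED
    System.out.println(recon.get("dn-1")); // prints DECOMMISSIONED
  }
}
```

   The dead-node handler covers the moment a node is declared dead, while the 
periodic sync acts as a safety net for nodes that were already dead when Recon 
started or whose event was missed.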
   
   ## What is the link to the Apache JIRA?
   https://issues.apache.org/jira/browse/HDDS-5277
   
   ## How was this patch tested?
   Manually tested using the Ozone decommissioning CLI.




