avijayanhwx opened a new pull request #2286: URL: https://github.com/apache/ozone/pull/2286
## What changes were proposed in this pull request?

Recon relies on Datanode (DN) heartbeats to learn each node's operational state. If a node goes down before it reports itself as DECOMMISSIONED, Recon loses that information. SCM is the first to move a node from DECOMMISSIONING to DECOMMISSIONED; the Datanode then persists the state change locally and subsequently heartbeats the new state to SCM and Recon. If the DN is shut down before it can heartbeat the state change to Recon, Recon is left with stale information.

This patch adds a DeadNodeHandler for Recon that updates the node's operational status from SCM if needed. It also adds a step in the Recon SCM Sync Task that updates the operational status from SCM for every node that Recon sees as dead.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5277

## How was this patch tested?

Manually tested using the Ozone decommissioning CLI.
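
To illustrate the behaviour described in this pull request, here is a minimal, self-contained sketch of the idea: when Recon marks a node dead, it re-reads that node's operational state from SCM, and a periodic sync repeats the check for every dead node. This is not the code from the patch. `OpState`, `ReconNodeView`, and the `scmLookup` function are hypothetical stand-ins for the real Ozone classes and the SCM RPC, which are not shown in this message.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the operational states named in the PR text.
enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

// Hypothetical model of Recon's local view of datanodes (not an Ozone class).
class ReconNodeView {
  private final Map<String, OpState> opStates = new HashMap<>();
  // Assumed lookup of SCM's authoritative state; in Ozone this would be an RPC.
  private final Function<String, OpState> scmLookup;

  ReconNodeView(Function<String, OpState> scmLookup) {
    this.scmLookup = scmLookup;
  }

  // A heartbeat from the datanode overwrites Recon's recorded state.
  void onHeartbeat(String nodeId, OpState reported) {
    opStates.put(nodeId, reported);
  }

  // Dead-node handling: refresh the state from SCM only if it differs.
  // Returns true when Recon's view was changed.
  boolean onDeadNode(String nodeId) {
    OpState fromScm = scmLookup.apply(nodeId);
    if (fromScm == null || fromScm == opStates.get(nodeId)) {
      return false;
    }
    opStates.put(nodeId, fromScm);
    return true;
  }

  // Sync-task step: apply the same refresh to every node Recon sees as dead.
  // Returns the number of nodes whose state was updated.
  int syncDeadNodes(List<String> deadNodeIds) {
    int updated = 0;
    for (String nodeId : deadNodeIds) {
      if (onDeadNode(nodeId)) {
        updated++;
      }
    }
    return updated;
  }

  OpState getOpState(String nodeId) {
    return opStates.get(nodeId);
  }
}
```

In this sketch, a node whose last heartbeat to Recon said DECOMMISSIONING, but which SCM has already moved to DECOMMISSIONED, is corrected as soon as Recon handles it as dead; the sync step covers nodes for which that event was missed.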
