sodonnel opened a new pull request #2057:
URL: https://github.com/apache/ozone/pull/2057


   ## What changes were proposed in this pull request?
   
   With the current decommission / recommission / maintenance mode commands, 
you can pass a list of hosts to perform the operation on. If any of these hosts 
fail to enter the decommission / maintenance workflow, the command gives no 
feedback about the error. Some of the hosts can silently fail and the only way 
to know is to inspect the SCM log.
   
   The most common cause of failure is a disallowed transition: a node that is 
undergoing maintenance is instructed to decommission, or vice versa. Neither 
transition is allowed.
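
The disallowed transition described above could be checked along these lines. This is only a minimal sketch: the `TransitionSketch` class, its `State` enum, and both method names are hypothetical and are not taken from the Ozone code base. It encodes only the rule stated in this description (maintenance to decommission, and vice versa, is rejected) and assumes every other request is acceptable.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical, simplified model of node admin states (not Ozone's own classes). */
final class TransitionSketch {

  /** Illustrative state names; assumed for this sketch only. */
  enum State {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED, ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  // States that belong to the maintenance workflow.
  private static final Set<State> MAINTENANCE =
      EnumSet.of(State.ENTERING_MAINTENANCE, State.IN_MAINTENANCE);

  // States that belong to the decommission workflow.
  private static final Set<State> DECOMMISSION =
      EnumSet.of(State.DECOMMISSIONING, State.DECOMMISSIONED);

  private TransitionSketch() {
  }

  /** A decommission request is rejected only for nodes in the maintenance workflow. */
  static boolean canStartDecommission(State current) {
    return !MAINTENANCE.contains(current);
  }

  /** A maintenance request is rejected only for nodes in the decommission workflow. */
  static boolean canStartMaintenance(State current) {
    return !DECOMMISSION.contains(current);
  }
}
```

With a check of this shape, a rejected host can be recorded together with a reason instead of only being logged on the server.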
   
   This change reports any failed nodes back to the client. If the client 
detects that any of the nodes have failed, it writes the details to stderr and 
exits with a non-zero exit code.
   
   Note that even though the exit code is non-zero, the command may have 
partially worked.
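
The client-side behaviour described above (report failures on stderr, exit non-zero, leave successful hosts untouched) might look roughly like the following sketch. All names here (`AdminCommandSketch`, `HostError`, `run`, and the `submit` callback that returns `null` on success or a reason on failure) are assumptions made for illustration, as is the exit code value of 1; the description only states that the code is non-zero.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of per-host error reporting; not the actual patch. */
final class AdminCommandSketch {

  /** Hypothetical record of one host that could not enter the workflow. */
  static final class HostError {
    final String host;
    final String reason;

    HostError(String host, String reason) {
      this.host = host;
      this.reason = reason;
    }
  }

  private AdminCommandSketch() {
  }

  /**
   * Submits every host and collects the failures instead of stopping at the first one.
   * The submit callback is assumed to return null on success or a failure reason.
   * Returns 0 if all hosts were accepted and 1 (an assumed value) otherwise.
   */
  static int run(List<String> hosts, Function<String, String> submit, PrintStream err) {
    List<HostError> errors = new ArrayList<>();
    for (String host : hosts) {
      String reason = submit.apply(host);
      if (reason != null) {
        errors.add(new HostError(host, reason));
      }
    }
    // Hosts that were accepted are not rolled back, so the command may have partially worked.
    for (HostError e : errors) {
      err.println("Error: " + e.host + ": " + e.reason);
    }
    return errors.isEmpty() ? 0 : 1;
  }
}
```

Collecting the failures rather than aborting on the first one matches the note above: hosts that were accepted stay in the workflow even when the exit code is non-zero.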
   
   Also note that the errors fed back cover only the transition of the node 
into the admin workflow. The node can still fail later for other reasons, and 
those failures are not fed back to the client. This is because the client does 
not wait for the process to complete; it exits once SCM confirms that the 
command has been processed.
   
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-4989
   
   ## How was this patch tested?
   
   New unit tests, plus manual validation using Docker.
   




