Aloha Developers,

I was recently upgrading a 100-node cluster using Ambari's rolling upgrade. When almost half of the nodes had been upgraded, an upgrade task failed on one host because of a hard disk failure. The Ambari web UI then prompted me to either downgrade or retry. Retrying does not seem to be a good option here, since the task would presumably keep failing on the broken disk, and neither does downgrading, because almost half of the nodes were already upgraded.

A better workaround in this case would be for Ambari to offer an "ignore and proceed" option, as it already does for service checks, and then stop the service that failed. We can tolerate the loss of one host in such a large cluster and add that host back later with the latest bits once the rest of the cluster is upgraded.
Has anyone faced a similar issue in the past? If so, please tell me about any workaround that is available today. Also, is there a JIRA issue for developing this feature (ignoring and proceeding) or something similar to it? I was unable to find one. Please let me know your thoughts: if many organizations are facing this problem, we can create a JIRA issue and start development on it.

Mahalo,
Abhey Rana
CSE Undergraduate | NIT Allahabad
site: abhey.github.io
email: [email protected]
address: Tilak Hostel, MNNIT Allahabad
<https://github.com/Abhey>
<https://www.facebook.com/Abhey.Rana.Useless>

07/06/18, 12:29:31
