Hi,

I am wondering if there is any way for Aurora to kill the failed
instances when a job update is not successful (e.g. apps on some
backends fail to start up)?

Right now we have turned off the "rollback" feature for job updates,
because one or two backends (out of tens to hundreds) failing is
acceptable for us, and we don't want to roll back the whole fleet
because of that. However, it seems that with "rollback" off, those
failed backends are just left in place, and they keep trying to
restart indefinitely.

Just curious what would be a recommended approach for this situation?
Should we try to identify those instances and stop them in our own
deployment scripts?

Thanks,
Kaiwen
