When a worker process dies, the Supervisor is responsible for restarting it. If the worker keeps failing, Nimbus will eventually try to reschedule its tasks to a different worker node. This is all explained here: https://storm.apache.org/releases/1.2.1/Daemon-Fault-Tolerance.html
Is there a limit on how many times a worker will be restarted? If a bug on startup causes repeated crashes, at some point it no longer makes sense to keep restarting the worker. I saw that there is a configuration property called STORM_NIMBUS_RETRY_TIMES, but it doesn't seem to apply to restarting worker processes.
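To make the question concrete, here is a minimal sketch of the kind of limit I have in mind: allow at most N restarts within a sliding time window, then give up. This is not Storm code. `RestartLimiter`, `tryRestart`, `maxRestarts`, and `windowMillis` are hypothetical names made up for illustration, and I don't know whether Storm has an equivalent setting.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical restart limiter, for illustration only.
 * It is NOT part of Storm's API; all names here are assumptions.
 */
final class RestartLimiter {
    private final int maxRestarts;     // restarts allowed per window
    private final long windowMillis;   // length of the sliding window
    private final Deque<Long> restartTimes = new ArrayDeque<>();

    RestartLimiter(int maxRestarts, long windowMillis) {
        this.maxRestarts = maxRestarts;
        this.windowMillis = windowMillis;
    }

    /**
     * Returns true and records the restart if one is still allowed at
     * nowMillis; returns false once the limit for the window is reached.
     */
    boolean tryRestart(long nowMillis) {
        // Forget restarts that have fallen out of the sliding window.
        while (!restartTimes.isEmpty()
                && nowMillis - restartTimes.peekFirst() >= windowMillis) {
            restartTimes.pollFirst();
        }
        if (restartTimes.size() >= maxRestarts) {
            return false; // too many recent crashes: stop restarting
        }
        restartTimes.addLast(nowMillis);
        return true;
    }
}
```

With `new RestartLimiter(3, 60_000L)`, a worker that crashes immediately on startup would be restarted three times within a minute and then left down until older restarts age out of the window.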
