Hi Storm fellows, I have a simple question and would appreciate a quick answer.
Let's say a Storm topology is running on a cluster without any supervision. At the beginning it behaves properly and has a balanced distribution. Sometimes, though, errors occur and bring down the supervisor daemon or even the whole machine. What action does Storm take in such situations to guarantee fault resilience?

For example, when I press Ctrl+C to terminate the supervisor, or manually shut down the machine, to simulate those two kinds of crash, I found that Storm automatically reassigns the lost slots to another machine and keeps running. Is this an implicit invocation of the rebalance command?

This is a transparent way to deal with a supervisor failure, but what if the lost machine comes back after several minutes of downtime? Is there any out-of-the-box method that automatically rebalances the topology again and puts the revived supervisor or machine back to work?

Any answer will be greatly appreciated. Thanks. :-)
