GitHub user revans2 commented on the issue: https://github.com/apache/storm/pull/1574

@HeartSaVioR I am OK with this change, just as I am OK with the change for STORM-1976. I just don't think that this is the final solution for the local blobstore + nimbus, nor do I think that either is a blocker.

The reality is that if I use HDFS as the backing store for the blobstore, set it to keep only a single replica, and then lose a datanode, nimbus will still crash in almost exactly the same way. Is this a bug in nimbus? Is it a bug in the blobstore or HDFS? I would say no. It is user error, because the user is trying to make something HA without configuring it properly. Then, when an error happens, we cannot recover.

So the question is: if it does happen, is it better for nimbus to crash or to hang? All this change does is switch from one to the other. I can see advantages to hanging over crashing, so I am OK with this fix.

Does the blobstore code have bugs in it, and things that we can change to make it work better? I would expect so; it is software, after all. I just want us to spend some time thinking about how we really want it to behave, and then fix it properly. If that proper fix comes after a few patches that improve things, that is fine.