[GitHub] storm issue #1574: STORM-1977 Restore logic: give up leadership when elected...

revans2 Mon, 18 Jul 2016 09:03:07 -0700

Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1574
  
    @HeartSaVioR 
    I am OK with this change, just as I am OK with the change for STORM-1976.  I 
just don't think that this is the final solution for the local 
blobstore+nimbus, nor do I think that either is a blocker.  The reality is that 
if I use HDFS as the backing for the blobstore, set it to have only a single 
replica, and then lose a datanode, nimbus will still crash in almost exactly the 
same way.  Is this a bug in nimbus?  Is it a bug in the blobstore or HDFS?  I 
would say no.  It is user error, because the user is trying to make something HA 
without configuring it properly, so when an error happens we cannot recover.  
So the question is: if it does happen, is it better for nimbus to crash or is it 
better for nimbus to hang?  All this change does is switch from one to 
the other.  I can see advantages for hanging over crashing, so I am OK with 
this fix.
    
    Does the blobstore code have bugs in it and things that we can change to 
make it work better?  I would expect it to; it is software after all.  I just 
want us to spend some time thinking about how we really want it to behave and 
fix it properly.  If that proper fix comes after making a few patches to 
improve things, that is fine.



