Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2433
  
    @HeartSaVioR I think I must have mixed this patch up with another one, so 
yes, I was wrong: if this goes in, backwards compatibility is in trouble.
    
    @danny0405 
I have been looking through the code and this is a rather large change; 
could you explain to me at a high level how the architecture is changing?  I 
see that there is a SupervisorClient now.
    
    It looks like the worker talks to the supervisor through this client, 
sending heartbeats and asking for its current assignment, and it can also fall 
back to talking to nimbus for the heartbeats.
    
    I also see that nimbus will use it to try to push new assignments out to 
the supervisors as well.
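
    Just to check that I am reading the flow right, here is roughly what I 
think the worker side is doing (a minimal sketch; every class and method name 
in it is my guess from skimming the patch, not necessarily what is actually 
there):

    ```java
    // Sketch of my understanding of the worker-side flow.  All names below are
    // placeholders based on my reading of the patch, not a confirmed API.
    import java.util.Map;

    class WorkerHeartbeatSketch {
        void heartbeatAndSync(Map<String, Object> conf, String supervisorHost, int supervisorPort,
                              SupervisorWorkerHeartbeat hb, String topologyId) throws Exception {
            try (SupervisorClient supervisor = new SupervisorClient(conf, supervisorHost, supervisorPort)) {
                // 1. push the heartbeat to the local supervisor over Thrift
                supervisor.getClient().sendSupervisorWorkerHeartbeat(hb);
                // 2. ask the supervisor for this worker's current assignment
                Assignment assignment = supervisor.getClient().getLocalAssignmentForStorm(topologyId);
                // (the worker would then react to any assignment change)
            } catch (Exception e) {
                // 3. if the supervisor is unreachable, fall back to heartbeating straight to nimbus
                try (NimbusClient nimbus = NimbusClient.getConfiguredClient(conf)) {
                    nimbus.getClient().sendSupervisorWorkerHeartbeat(hb);
                }
            }
        }
    }
    ```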
    
    Why do we heartbeat to the supervisor in two different ways?  We still go 
through the local file system, but also over Thrift with a fallback to nimbus 
if that fails.  The implementation of this Thrift heartbeat handler in the 
Supervisor is a no-op.  Is the plan to eventually stop going through the file 
system?
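
    For reference, the supervisor-side handler I am talking about is 
essentially this (paraphrased from my reading of the patch, not a literal 
quote):

    ```java
    // Paraphrased, not a literal quote: the Thrift heartbeat handler on the
    // supervisor side currently discards the heartbeat.
    @Override
    public void sendSupervisorWorkerHeartbeat(SupervisorWorkerHeartbeat heartbeat) {
        // no-op: the real heartbeat still has to come in through the local file system
    }
    ```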
    
    My biggest concern is around security.  In the current security model the 
workers have no way to communicate with Nimbus, or for that matter with the new 
Supervisor Thrift server.  Nimbus/Supervisor only supports Kerberos 
authentication, and topologies are not guaranteed to have Kerberos credentials 
available to them.  For ZooKeeper we use a username/password that is randomly 
generated per topology to protect access.  When going through the file system 
to the supervisor we explicitly set up the file system permissions so that only 
the owner of the topology plus the supervisor user can read and write the 
heartbeat directory.  This patch does none of that.  The new Thrift APIs do 
nothing to restrict who has access to them, so anyone can download assignments 
and anyone can heartbeat in for anyone else.  I can think of a few malicious 
things that an attacker could do with this power.  Even if we did have proper 
authorization, turning on Kerberos authentication shuts those communication 
channels down for the worker, because it cannot authenticate through Kerberos 
and hence cannot get past SASL to do anything.
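
    To make that concrete, at a minimum I would expect each of these handlers 
to do the kind of per-topology owner check we do elsewhere, something along 
these lines (illustrative only; the helper and accessor names are made up, not 
an API in the patch):

    ```java
    // Illustrative only: lookupTopologyOwner() and the accessor names are made up
    // to show the kind of check I mean, not code that exists in the patch.
    @Override
    public void sendSupervisorWorkerHeartbeat(SupervisorWorkerHeartbeat heartbeat)
            throws AuthorizationException {
        String requestUser = ReqContext.context().principal() == null
                ? null : ReqContext.context().principal().getName();
        String topoOwner = lookupTopologyOwner(heartbeat.get_storm_id()); // hypothetical helper
        if (requestUser == null || !requestUser.equals(topoOwner)) {
            // without something like this anyone can heartbeat in for anyone else
            throw new AuthorizationException(requestUser + " is not allowed to heartbeat for "
                    + heartbeat.get_storm_id());
        }
        // ... accept the heartbeat ...
    }
    ```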
    
    I am also concerned about Storm on Mesos with this patch.  We have broken 
them a lot of times because we don't think about them enough, and this is going 
to be another one of those cases.  In this setup each supervisor must listen on 
a fixed, known port for the cluster to work.  On Mesos, and probably on YARN 
too, there is the possibility of multiple supervisors running on a single 
physical host.  Storm on Mesos already jumps through all kinds of hoops to make 
this work by resetting the directories where the supervisors store things to 
avoid any collisions.  With a port that cannot change, because nimbus does not 
support talking to a supervisor on an ephemeral port, I don't see a way to make 
two supervisors happy on the same box.
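
    A trivial illustration of the clash (plain Java, nothing Storm-specific): 
if both supervisors on a box read the same fixed port out of their config, the 
second one simply cannot start its Thrift server:

    ```java
    import java.net.ServerSocket;

    // Plain Java illustration of the port clash; 6627 is just a stand-in for
    // whatever fixed supervisor RPC port the config would use.
    public class PortClash {
        public static void main(String[] args) throws Exception {
            int fixedPort = 6627;
            ServerSocket first = new ServerSocket(fixedPort);  // first supervisor: binds fine
            ServerSocket second = new ServerSocket(fixedPort); // second supervisor on the same host:
                                                               // java.net.BindException: Address already in use
        }
    }
    ```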
    
    Now please don't think I am beating on this patch because I would much 
rather have pacemaker.  I think the overall direction this patch is going in is 
much, much cleaner than what we are doing with pacemaker/ZooKeeper.  There is 
just potentially a lot of work to really get this working in a secure and 
compatible way.  We would have to support something like Hadoop delegation 
tokens in Storm so that the workers could talk to nimbus and we could 
guarantee, in all cases, that the tokens would be there and up to date.  We 
would also need a way for the supervisors to support something similar.
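
    To sketch what I mean (purely hypothetical, none of this exists in Storm 
today): nimbus would mint a per-topology secret at submission time, push it to 
the workers through the existing credentials path, and the workers would 
present it over SASL instead of a Kerberos ticket:

    ```java
    // Purely hypothetical sketch of the idea, not an existing Storm API.
    class WorkerToken {
        final String topologyId;
        final String owner;       // the topology owner the token is bound to
        final byte[] secret;      // random secret minted by nimbus per topology
        final long expirationMs;  // so tokens can be renewed/rotated like delegation tokens

        WorkerToken(String topologyId, String owner, byte[] secret, long expirationMs) {
            this.topologyId = topologyId;
            this.owner = owner;
            this.secret = secret;
            this.expirationMs = expirationMs;
        }
    }
    // The token would ride along with the credentials the supervisor already hands
    // to the worker, and the worker would use it for SASL (e.g. DIGEST) against
    // nimbus and the supervisor instead of needing a Kerberos TGT.
    ```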
    
    But I would encourage you to take another look at pacemaker in the 
short-term, at least until we have all of these issues worked out.  We 
currently have clusters running using pacemaker with 900 supervisors and over 
120 topologies in a secure and stable way.  We did run into stability issues 
with pacemaker at the beginning, but we believe that we have fixed all of them. 
If you are having crashes, please let us know and we will do our best to fix 
them.  We are also currently seeing issues with the time it takes to download 
heartbeats; we have a plan to parallelize it, but just have not put it in place 
yet.
    
    Another thing to think about is that we are likely going to be able to 
remove the metrics from the ZK heartbeats once we have an alternative path for 
them.  This should let us reduce the load on ZK/pacemaker massively and let us 
scale even better.

