Github user danny0405 commented on the issue:

    https://github.com/apache/storm/pull/2433
  
    @HeartSaVioR  @revans2
    1. Yes, we did use Pacemaker for a while on a cluster of about 200 
topologies, but the workers restarted frequently just because Pacemaker 
heartbeat packets were dropped.
    2. Pacemaker is also a single point of failure for the cluster, and there 
is no HA for it. When Pacemaker restarts, it takes a long time to recover 
heartbeats (even if it had HA), so most of the workers time out and are 
reassigned by the master. I have doubts about keeping heartbeats in a single 
point, and it is hard to scale horizontally. @revans2 said that your 
cluster has 900 supervisors but only 120 topologies pushing heartbeats to 
Pacemaker. So do we have an index/metric relating Pacemaker to 
workers/topologies rather than to supervisors?
    3. This patch really can support a large cluster, and it is very stable in 
our production. We have about 8000 topologies, and the new patch can support at 
least 2000 topologies for now. The patch also has no single-point 
problem compared to Pacemaker, and the heartbeat is very lightweight.
    4. If the only remaining concern is security, I can fix it.


