Robert Joseph Evans created STORM-376:
-----------------------------------------

             Summary: Add compression to data stored in ZK
                 Key: STORM-376
                 URL: https://issues.apache.org/jira/browse/STORM-376
             Project: Apache Storm (Incubating)
          Issue Type: Improvement
            Reporter: Robert Joseph Evans
            Assignee: Robert Joseph Evans


If you run ZooKeeper with -Dzookeeper.forceSync=no, the ZooKeeper disk is no 
longer the bottleneck for scaling Storm.  On our scale test cluster (Gigabit 
Ethernet), the bottleneck becomes the aggregate reads from all of the 
supervisors and workers downloading the compiled topology assignments.

To reduce this load we took two approaches.  First, we compressed the data 
stored in ZooKeeper (this JIRA), which has the added benefit of increasing the 
size of the topology you can store in ZK.  Second, we used the ZK version 
number to check whether the data had changed, so that unchanged data is not 
downloaded again needlessly (STORM-375).
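
The two approaches can be sketched roughly as follows.  This is an 
illustration only, not the actual Storm patch: gzip is an assumed codec (the 
issue does not name one), and FakeStore, CachingReader and the 
/assignments/topo-1 path are hypothetical names invented for the example.  An 
in-memory dictionary stands in for ZooKeeper so the sketch runs on its own.

```python
import gzip


def compress(payload: bytes) -> bytes:
    """Compress a serialized payload before it is written to the store."""
    return gzip.compress(payload)


def decompress(blob: bytes) -> bytes:
    """Restore the original payload after it is read from the store."""
    return gzip.decompress(blob)


class FakeStore:
    """Hypothetical in-memory stand-in for ZooKeeper: path -> (version, blob)."""

    def __init__(self):
        self._nodes = {}

    def set(self, path: str, payload: bytes) -> None:
        # Every write bumps the node version, as ZooKeeper does for znodes.
        previous_version = self._nodes.get(path, (-1, b""))[0]
        self._nodes[path] = (previous_version + 1, compress(payload))

    def version(self, path: str) -> int:
        # Cheap metadata lookup; no payload is transferred.
        return self._nodes[path][0]

    def get(self, path: str):
        # Full read: returns (version, compressed blob).
        return self._nodes[path]


class CachingReader:
    """Downloads a node only when its version differs from the cached one."""

    def __init__(self, store: FakeStore):
        self._store = store
        self._cache = {}  # path -> (version, decompressed payload)
        self.downloads = 0

    def read(self, path: str) -> bytes:
        current = self._store.version(path)
        cached = self._cache.get(path)
        if cached is not None and cached[0] == current:
            return cached[1]  # unchanged, so skip the download
        version, blob = self._store.get(path)
        self.downloads += 1
        payload = decompress(blob)
        self._cache[path] = (version, payload)
        return payload
```

A reader that polls the same path repeatedly pays for the full download only 
when the version changes, and each download moves the compressed bytes rather 
than the raw payload.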

With these changes we were able to scale to a simulated 1965 nodes (5 
supervisors running on each of 393 real nodes, with each supervisor configured 
to have 10 slots).  We also filled the cluster with 131 topologies of 100 
workers each.  (We are going to 200 topologies, and may try to scale the 
cluster even larger, but launching topologies is very slow once the cluster is 
under load.  We may try to address that shortly too.)
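
For reference, the capacity arithmetic behind these figures works out as in 
the short sketch below.  The input numbers come from the description above; 
the final projection assumes the planned 200 topologies would also use 100 
workers each, which the description does not state.

```python
# Figures taken from the description above.
SUPERVISORS_PER_NODE = 5
REAL_NODES = 393
SLOTS_PER_SUPERVISOR = 10
TOPOLOGIES = 131
WORKERS_PER_TOPOLOGY = 100

# Each supervisor counts as one simulated node.
simulated_nodes = SUPERVISORS_PER_NODE * REAL_NODES

# Total worker slots available across the simulated cluster.
total_slots = simulated_nodes * SLOTS_PER_SUPERVISOR

# Workers requested by the topologies launched so far.
total_workers = TOPOLOGIES * WORKERS_PER_TOPOLOGY

# Fraction of slots in use.
slot_utilization = total_workers / total_slots

# Assumption: the planned 200 topologies keep 100 workers each.
projected_workers = 200 * WORKERS_PER_TOPOLOGY
projected_shortfall = projected_workers - total_slots

print(simulated_nodes, total_slots, total_workers)
print(round(slot_utilization, 3), projected_shortfall)
```

Under that assumption, 200 topologies would need slightly more slots than the 
current simulated cluster provides, which is consistent with the plan to 
scale the cluster further.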



--
This message was sent by Atlassian JIRA
(v6.2#6252)
