[ https://issues.apache.org/jira/browse/STORM-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074927#comment-14074927 ]

ASF GitHub Bot commented on STORM-376:
--------------------------------------

Github user revans2 commented on the pull request:

    https://github.com/apache/incubator-storm/pull/168#issuecomment-50205339
  
    I updated the code to make compression configurable, but I am still a bit 
hesitant about this.  I looked at every place where Utils/serialize and 
Utils/deserialize are used.  Most of them seem fine because the data is fairly 
transient, but they are also used for storing Trident data in ZooKeeper.  That 
makes me a bit nervous, because Trident state is meant to live beyond a single 
topology, which makes changing the format a lot more difficult.
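
As background, a minimal sketch of what a configurable compress-on-write /
decompress-on-read wrapper around the serialized bytes could look like, using
plain java.util.zip GZIP.  The class name, the boolean flag, and how it would
be wired into Utils/serialize are illustrative assumptions here, not the
actual patch:

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.GZIPOutputStream;

    public class ZkBlobCodec {
        // Hypothetical flag standing in for whatever config key the patch uses.
        private final boolean compressionEnabled;

        public ZkBlobCodec(boolean compressionEnabled) {
            this.compressionEnabled = compressionEnabled;
        }

        // Compress already-serialized bytes before they are written to ZooKeeper.
        public byte[] encode(byte[] serialized) throws IOException {
            if (!compressionEnabled) {
                return serialized;
            }
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
                gzip.write(serialized);
            }
            return bos.toByteArray();
        }

        // Decompress bytes read back from ZooKeeper before deserialization.
        public byte[] decode(byte[] stored) throws IOException {
            if (!compressionEnabled) {
                return stored;
            }
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPInputStream gzip =
                     new GZIPInputStream(new ByteArrayInputStream(stored))) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = gzip.read(buf)) != -1) {
                    bos.write(buf, 0, n);
                }
            }
            return bos.toByteArray();
        }
    }

Note that decode() assumes the stored bytes match the current setting, so
flipping the flag after data has been written breaks reads of anything
long-lived, which is exactly the concern raised above about Trident state.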


> Add compression to data stored in ZK
> ------------------------------------
>
>                 Key: STORM-376
>                 URL: https://issues.apache.org/jira/browse/STORM-376
>             Project: Apache Storm (Incubating)
>          Issue Type: Improvement
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>         Attachments: storm-2000.png
>
>
> If you run ZooKeeper with -Dzookeeper.forceSync=no, the ZooKeeper disk is no 
> longer the bottleneck for scaling Storm.  For us, on a Gigabit Ethernet 
> scale-test cluster, the bottleneck becomes the aggregate reads from all of 
> the supervisors and workers downloading the compiled topology assignments.
> To reduce this load we took two approaches.  First, we compressed the data 
> stored in ZooKeeper (this JIRA), which has the added benefit of increasing 
> the size of the topology you can store in ZK.  Second, we used the ZK 
> version number to see whether the data had changed and avoid downloading it 
> again needlessly (STORM-375).
> With these changes we were able to scale to a simulated 1965 nodes (5 
> supervisors running on each of 393 real nodes, with each supervisor 
> configured to have 10 slots).  We also filled the cluster with 131 topologies 
> of 100 workers each.  (We are going to 200 topologies, and may try to scale 
> the cluster even larger, but it takes forever to launch topologies once the 
> cluster is under load.  We may try to address that shortly too.)
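
For reference, a minimal sketch of the version-check idea mentioned for
STORM-375, using the plain ZooKeeper client: cache the znode version from the
last download and only re-read the data when the version has moved.  The class
and method names are illustrative, not the actual Storm code:

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class CachedZkReader {
        private final ZooKeeper zk;
        private int cachedVersion = -1;   // version of the last download
        private byte[] cachedData = null; // locally cached payload

        public CachedZkReader(ZooKeeper zk) {
            this.zk = zk;
        }

        // Re-download the znode only when its version differs from the cached one.
        public byte[] fetchIfChanged(String path)
                throws KeeperException, InterruptedException {
            Stat stat = zk.exists(path, false);
            if (stat == null) {
                cachedVersion = -1;
                cachedData = null;
                return null;
            }
            if (cachedData == null || stat.getVersion() != cachedVersion) {
                Stat readStat = new Stat();
                cachedData = zk.getData(path, false, readStat);
                cachedVersion = readStat.getVersion();
            }
            return cachedData;
        }
    }

With many supervisors and workers polling the same assignments, skipping the
getData call when the version is unchanged is what removes the aggregate-read
load described above.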



--
This message was sent by Atlassian JIRA
(v6.2#6252)
