[
https://issues.apache.org/jira/browse/STORM-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046262#comment-14046262
]
ASF GitHub Bot commented on STORM-376:
--------------------------------------
GitHub user revans2 opened a pull request:
https://github.com/apache/incubator-storm/pull/168
STORM-376: Add compression to serialization.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/revans2/incubator-storm compression
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-storm/pull/168.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #168
----
commit 041b11c1966e3bb0b29ba31d366315b8aced19d7
Author: Robert (Bobby) Evans <[email protected]>
Date: 2014-06-27T18:48:30Z
Add compression to serialization.
----
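For reference, a minimal sketch of the general idea behind the change, not the
actual contents of pull request #168: gzip-compress the already-serialized bytes
before writing them to ZooKeeper and decompress them on read. The class and
method names below are illustrative only.
----
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public final class ZkPayloadCompression {
    // Compress an already-serialized payload before writing it to ZooKeeper.
    public static byte[] compress(byte[] serialized) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(serialized);
        }
        return bos.toByteArray();
    }

    // Decompress a payload read back from ZooKeeper before deserializing it.
    public static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gzip.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        }
        return bos.toByteArray();
    }
}
----
The issue description below explains the motivation: the aggregate reads by
supervisors and workers were saturating the network, and smaller payloads also
let larger topologies fit in ZK.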
> Add compression to data stored in ZK
> ------------------------------------
>
> Key: STORM-376
> URL: https://issues.apache.org/jira/browse/STORM-376
> Project: Apache Storm (Incubating)
> Issue Type: Improvement
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
>
> If you run ZooKeeper with -Dzookeeper.forceSync=no, the ZooKeeper disk is no
> longer the bottleneck for scaling Storm. For us, on a Gigabit Ethernet
> scale-test cluster, the bottleneck becomes the aggregate reads by all of the
> supervisors and workers trying to download the compiled topology assignments.
> To reduce this load we took two approaches. First, we compressed the data
> stored in ZooKeeper (this JIRA), which has the added benefit of increasing
> the size of the topology you can store in ZK. Second, we used the ZK version
> number to check whether the data had changed, so we can avoid downloading it
> again needlessly (STORM-375; see the sketch below this description).
> With these changes we were able to scale to a simulated 1965 nodes (5
> supervisors running on each of 393 real nodes, with each supervisor
> configured to have 10 slots). We also filled the cluster with 131 topologies
> of 100 workers each. (We are going to 200 topologies, and may try to scale
> the cluster even larger, but it takes forever to launch topologies once the
> cluster is under load; we may try to address that shortly as well.)
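A minimal sketch of the STORM-375 idea referenced above, using only the plain
ZooKeeper client API: cache the data version returned in the node's Stat and
skip the download when it has not changed. Again, the class and field names are
illustrative, not taken from the actual patch.
----
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public final class VersionedZkReader {
    private final ZooKeeper zk;
    private int lastVersion = -1;  // data version of the last successful download
    private byte[] cached;         // payload from the last successful download

    public VersionedZkReader(ZooKeeper zk) {
        this.zk = zk;
    }

    // Re-download the node's data only when its version has changed.
    public byte[] readIfChanged(String path)
            throws KeeperException, InterruptedException {
        Stat stat = zk.exists(path, false);
        if (stat == null) {
            return null;                        // node is gone
        }
        if (cached != null && stat.getVersion() == lastVersion) {
            return cached;                      // unchanged: skip the download
        }
        Stat readStat = new Stat();
        cached = zk.getData(path, false, readStat);
        lastVersion = readStat.getVersion();
        return cached;
    }
}
----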
--
This message was sent by Atlassian JIRA
(v6.2#6252)