[
https://issues.apache.org/jira/browse/STORM-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032672#comment-14032672
]
Robert Joseph Evans commented on STORM-339:
-------------------------------------------
This is not specific to the Netty transport, although Netty does not help because
it buffers a lot more than the zeromq code did. If you don't have acking enabled
there is no flow control in Storm, and if you have not properly sized your
components the tuples will be buffered in memory until the worker either OOMs or
is shot by the supervisor because GC took too long and the heartbeats stopped
coming.
I'm not sure there is a really good way to fix this completely without acking.
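Restoring flow control along the lines described above means turning acking back on and capping how many tuples each spout task may have in flight. A minimal sketch of the relevant topology settings (the values here are illustrative starting points, not recommendations):

```yaml
# Enable acking: without acker executors, Storm has no flow control at all.
topology.acker.executors: 2
# With acking on, this caps the number of unacked tuples per spout task,
# which is Storm's built-in backpressure mechanism.
topology.max.spout.pending: 1000
```

With topology.max.spout.pending set, a spout stops emitting once that many tuples are pending, so a slow bolt throttles the spout instead of letting buffers grow without bound.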
> Severe memory leak to OOM when ackers disabled
> ----------------------------------------------
>
> Key: STORM-339
> URL: https://issues.apache.org/jira/browse/STORM-339
> Project: Apache Storm (Incubating)
> Issue Type: Bug
> Affects Versions: 0.9.2-incubating
> Reporter: Jiahong Li
> Priority: Critical
>
> Without any ackers enabled, a fast component will continuously leak memory and
> cause OOM problems when the target component is slow. The OOM problem can be
> reproduced by running this fast-slow topology:
> https://github.com/Gvain/storm-perf-test/tree/fast-slow-topology
> with command:
> {code}
> $ storm jar storm_perf_test-1.0.0-SNAPSHOT-jar-with-dependencies.jar \
>     com.yahoo.storm.perftest.Main --spout 1 --bolt 1 --workers 2 \
>     --testTime 600 --messageSize 6400
> {code}
> The worker childopts are set to {{-Xms2g -Xmx2g -Xmn512m ...}}.
> During the run, the executed count of the target component falls far behind
> the emitted count of the source component. I guess the Netty client is
> buffering too many messages in its message_queue because the target component
> sends back OK/Failure responses too slowly.
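The buffering behavior described in the report, a client-side queue growing without limit while the consumer is slow, can be illustrated with a generic bounded-queue sketch. This is plain Java using a hypothetical producer, not Storm's actual Netty client code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackpressureSketch {
    // With an unbounded queue, a fast producer and a slow consumer mean
    // unbounded memory growth. A bounded queue instead rejects (offer) or
    // blocks (put) once full, giving the producer a backpressure signal.
    public static int produce(BlockingQueue<byte[]> queue, int messages, int messageSize) {
        int accepted = 0;
        for (int i = 0; i < messages; i++) {
            // offer() returns false when the queue is full -- the backpressure signal.
            if (queue.offer(new byte[messageSize])) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Bounded queue of 100 slots; the producer tries to enqueue 1000
        // messages while nothing drains the queue (a very slow target component).
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(100);
        int accepted = produce(queue, 1000, 6400);
        System.out.println("accepted=" + accepted); // only 100 fit; the rest are rejected
    }
}
```

Swapping the unbounded message_queue for a bounded one is essentially what acking plus topology.max.spout.pending achieves at the topology level.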
--
This message was sent by Atlassian JIRA
(v6.2#6252)