[
https://issues.apache.org/jira/browse/STORM-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052518#comment-14052518
]
Radim Kolar commented on STORM-339:
-----------------------------------
There are 3 methods for protecting against OOM without the need to
acknowledge every message. Storm in ack mode has 10x lower throughput.
See the end of
http://docs.jboss.org/hornetq/2.2.5.Final/user-manual/en/html/queue-attributes.html#queue-attributes.address-settings
1) Use a ring buffer for receiving messages. If messages are processed too
slowly, a newly arriving message replaces the oldest unprocessed message. This
is not flow control, just protection against OOM. (type DROP)
2) Implement flow control messages; something as simple as the XON/XOFF protocol
(http://en.wikipedia.org/wiki/Software_flow_control) should suffice. (type BLOCK)
3) Save messages to disk instead of throwing them away. (type PAGE)
For inspiration, see
http://docs.jboss.org/hornetq/2.2.5.Final/user-manual/en/html/flow-control.html
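To make the DROP and BLOCK options concrete, here is a minimal sketch of a bounded receive buffer. It is not Storm or HornetQ code: the class name BoundedInbox, the Policy enum, and all method names are hypothetical and exist only for illustration. In BLOCK mode a rejected offer stands in for sending XOFF to the sender, and hasRoom() stands in for the XON condition. PAGE is left out because it needs a disk-backed store.

```java
import java.util.ArrayDeque;

/**
 * Hypothetical bounded receive buffer (not part of Storm's API).
 * DROP:  when full, evict the oldest unprocessed message to admit the new one.
 * BLOCK: when full, reject the new message so the sender can be paused.
 */
class BoundedInbox<T> {
    enum Policy { DROP, BLOCK }

    private final ArrayDeque<T> queue = new ArrayDeque<>();
    private final int capacity;
    private final Policy policy;
    private long dropped = 0;

    BoundedInbox(int capacity, Policy policy) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.policy = policy;
    }

    /**
     * Tries to enqueue a message.
     * Returns false only in BLOCK mode when the buffer is full; the caller
     * would then signal XOFF to the sender and retry later.
     */
    synchronized boolean offer(T message) {
        if (queue.size() >= capacity) {
            if (policy == Policy.BLOCK) {
                return false;
            }
            queue.pollFirst(); // DROP: discard the oldest unprocessed message
            dropped++;
        }
        queue.addLast(message);
        return true;
    }

    /** Removes and returns the oldest message, or null if the buffer is empty. */
    synchronized T poll() {
        return queue.pollFirst();
    }

    /** True when there is free space, i.e. the sender may be resumed (XON). */
    synchronized boolean hasRoom() {
        return queue.size() < capacity;
    }

    synchronized int size() {
        return queue.size();
    }

    /** Number of messages discarded so far under the DROP policy. */
    synchronized long droppedCount() {
        return dropped;
    }
}
```

Either way memory use is capped at the configured capacity. DROP trades message loss for a sender that never stalls, while BLOCK keeps every message but requires the sender to honour the pause signal.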
> Severe memory leak to OOM when ackers disabled
> ----------------------------------------------
>
> Key: STORM-339
> URL: https://issues.apache.org/jira/browse/STORM-339
> Project: Apache Storm (Incubating)
> Issue Type: Bug
> Affects Versions: 0.9.2-incubating
> Reporter: Jiahong Li
>
> Without any ackers enabled, a fast component will continuously leak memory and
> cause OOM problems when the target component is slow. The OOM problem can be
> reproduced by running this fast-slow-topology:
> https://github.com/Gvain/storm-perf-test/tree/fast-slow-topology
> with the command:
> {code}
> $ storm jar storm_perf_test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
> com.yahoo.storm.perftest.Main --spout 1 --bolt 1 --workers 2 --testTime 600
> --messageSize 6400
> {code}
> and with the worker childopts set to {{-Xms2g -Xmx2g -Xmn512m ...}}.
> At the same time, the executed count of the target component falls far behind
> the emitted count of the source component. I guess the netty client is
> buffering too many messages in its message_queue because the target component
> sends back OK/Failure responses too slowly.