[ https://issues.apache.org/jira/browse/STORM-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396242#comment-14396242 ]
ASF GitHub Bot commented on STORM-756: -------------------------------------- GitHub user HeartSaVioR opened a pull request: https://github.com/apache/storm/pull/505 [STORM-756] ShellBolt can delay sending tuples on demand from subprocess Please refer https://issues.apache.org/jira/browse/STORM-756 to see more details. * introduce new multilang protocol: delay [seconds] ** command: "delay", msg: "<seconds>" ** python: storm.delay(seconds) ** ruby: Storm::Protocol.delay(seconds) * expose pending queue size to let users check and request delay ** python: storm.getPendingQueueSize() ** ruby: Storm::Protocol.get_pending_queue_size() You can merge this pull request into a Git repository by running: $ git pull https://github.com/HeartSaVioR/storm STORM-756 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/505.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #505 ---- commit 7f0dda034bdfb260b2c2ef49fec440bb59d2360f Author: Jungtaek Lim <kabh...@gmail.com> Date: 2015-04-05T14:02:13Z ShellBolt can delay sending tuples on demand from subprocess * introduce new multilang protocol: delay [seconds] ** command: "delay", msg: "<seconds>" ** python: storm.delay(seconds) ** ruby: Storm::Protocol.delay(seconds) * expose pending queue size to let users check and request delay ** python: storm.getPendingQueueSize() ** ruby: Storm::Protocol.get_pending_queue_size() ---- > [multilang] Introduce overflow control mechanism > ------------------------------------------------ > > Key: STORM-756 > URL: https://issues.apache.org/jira/browse/STORM-756 > Project: Apache Storm > Issue Type: Improvement > Affects Versions: 0.10.0, 0.9.4, 0.11.0 > Reporter: Jungtaek Lim > Assignee: Jungtaek Lim > > It's from STORM-738, > https://issues.apache.org/jira/browse/STORM-738?focusedCommentId=14394106&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14394106 > A. ShellBolt side control > We can modify ShellBolt to have sent tuple ids list, and stop sending tuples > when list exceeds configured max value. In order to achieve this, subprocess > should notify "tuple id is complete" to ShellBolt. > * It introduces new commands for multi-lang, "proceed" (or better name) > * ShellBolt stores in-progress-of-processing tuples list. > * Its overhead could be big, subprocess should always notify to ShellBolt > when any tuples are processed. > B. subprocess side control > We can modify subprocess to check pending queue after reading tuple. > If it exceeds configured max value, subprocess can request "delay" to > ShellBolt for slowing down. > When ShellBolt receives "delay", BoltWriterRunnable should stop polling > pending queue and continue polling later. > How long ShellBolt wait for resending? Its unit would be "delay time" or > "tuple count". I don't know which is better yet. > * It introduces new commands for multi-lang, "delay" (or better name) > * I don't think it would be introduced soon, but subprocess can request delay > based on own statistics. (ex. pending tuple count * average tuple processed > time for time unit, average pending tuple count for count unit) > ** We can leave when and how much to request "delay" to user. User can make > his/her own algorithm to control flooding. > In my opinion B seems to more natural cause current issue is by subprocess > side so it would be better to let subprocess overcome it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)