[
https://issues.apache.org/jira/browse/STORM-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006818#comment-15006818
]
ASF GitHub Bot commented on STORM-756:
--------------------------------------
Github user revans2 commented on the pull request:
https://github.com/apache/storm/pull/532#issuecomment-157076377
@HeartSaVioR I would like to see ABP work correctly by default on the Shell
bolt, and have it work correctly when TOPOLOGY_SHELLBOLT_MAX_PENDING is off,
but I don't think a change to the protocol is going to work. The buffer that
fills up is in Java, not in the external process. Having the ability for the
external process to throttle only really makes sense if the external process is
buffering tuples somehow, and then it can just stop reading to let us know to
stop as well.
Perhaps we can just have a default value for
TOPOLOGY_SHELLBOLT_MAX_PENDING. The only reason we would want to turn it off
completely is that we have a deadlock if taskIds are requested. Here we write
a message back into _pendingWrites, which with a hard-coded limit could cause
the only thread that reads from _pendingWrites to block trying to write to it.
If we can fix this bug then I would say we just set a default value.
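To make the deadlock concrete, here is a minimal sketch of the hazard and one
possible way around it. It is a model only, not the ShellBolt code: the class
PendingWritesSketch, its methods, and the consumer-owned side buffer are all
hypothetical names and assumptions. The idea is that the thread draining the
bounded queue never calls a blocking put() on that same queue; it keeps its own
messages (such as task ids) in a private unbounded buffer instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of a bounded pending-writes queue; not the ShellBolt code.
final class PendingWritesSketch {
    // Shared, bounded queue: producers block in put() when it is full.
    private final LinkedBlockingQueue<String> pendingWrites;
    // Unbounded buffer touched only by the consumer thread, so it needs no locking.
    private final Deque<String> consumerOwned = new ArrayDeque<>();

    PendingWritesSketch(int maxPending) {
        this.pendingWrites = new LinkedBlockingQueue<>(maxPending);
    }

    // Producer side: blocking when the queue is full is the intended back pressure.
    void enqueueTuple(String msg) {
        try {
            pendingWrites.put(msg);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while enqueueing", e);
        }
    }

    // True when a put() would block. If the only consumer called put() in this
    // state, nothing would ever drain the queue again.
    boolean isFull() {
        return pendingWrites.remainingCapacity() == 0;
    }

    // Consumer side: messages the consumer generates itself never touch the
    // bounded queue, so the consumer cannot block on its own queue.
    void enqueueFromConsumer(String msg) {
        consumerOwned.addLast(msg);
    }

    // Consumer side: drain own messages first, then the shared queue.
    // Returns null when both are empty.
    String nextToWrite() {
        String own = consumerOwned.pollFirst();
        return own != null ? own : pendingWrites.poll();
    }

    public static void main(String[] args) {
        PendingWritesSketch s = new PendingWritesSketch(1);
        s.enqueueTuple("tuple-1");
        System.out.println("full: " + s.isFull());
        s.enqueueFromConsumer("task-ids");
        System.out.println(s.nextToWrite());
        System.out.println(s.nextToWrite());
    }
}
```

One trade-off of this sketch is ordering: consumer-owned messages jump ahead of
tuples already queued, which may or may not be acceptable for the real protocol.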
> [multilang] Introduce overflow control mechanism
> ------------------------------------------------
>
> Key: STORM-756
> URL: https://issues.apache.org/jira/browse/STORM-756
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-multilang
> Affects Versions: 0.10.0, 0.9.4, 0.11.0
> Reporter: Jungtaek Lim
> Assignee: Jungtaek Lim
>
> It's from STORM-738,
> https://issues.apache.org/jira/browse/STORM-738?focusedCommentId=14394106&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14394106
> A. ShellBolt side control
> We can modify ShellBolt to keep a list of sent tuple ids, and stop sending
> tuples when the list exceeds a configured max value. To achieve this, the
> subprocess should notify ShellBolt that a "tuple id is complete".
> * It introduces a new command for multi-lang, "proceed" (or a better name).
> * ShellBolt stores a list of tuples that are still being processed.
> * The overhead could be big: the subprocess must always notify ShellBolt
> whenever a tuple is processed.
> B. subprocess side control
> We can modify the subprocess to check its pending queue after reading a tuple.
> If the queue exceeds a configured max value, the subprocess can send "delay"
> to ShellBolt to ask it to slow down.
> When ShellBolt receives "delay", BoltWriterRunnable should stop polling the
> pending queue and resume polling later.
> How long should ShellBolt wait before resuming? The unit would be "delay time"
> or "tuple count". I don't know which is better yet.
> * It introduces a new command for multi-lang, "delay" (or a better name).
> * I don't think it would be introduced soon, but the subprocess can request a
> delay based on its own statistics (e.g. pending tuple count * average tuple
> processing time for the time unit, average pending tuple count for the count
> unit).
> ** We can leave when and how much "delay" to request to the user. Users can
> write their own algorithm to control flooding.
> In my opinion B seems more natural, because the current issue arises on the
> subprocess side, so it would be better to let the subprocess overcome it.
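For illustration, option B with a time-based unit could be sketched roughly as
follows. This is only a model under stated assumptions: the class
DelayAwareWriterSketch and its methods are hypothetical, the "delay" command is
the one proposed in the description and does not exist in the multi-lang
protocol, and the clock is passed in explicitly so the behavior is easy to
follow. It is also single-threaded; a real writer would need a thread-safe
queue and a shared resume time.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model of a writer that honors a proposed "delay" command.
final class DelayAwareWriterSketch {
    // Tuples waiting to be sent to the subprocess.
    private final Queue<String> pending = new ArrayDeque<>();
    // The writer sends nothing before this instant (milliseconds).
    private long resumeAtMillis = 0L;

    void enqueue(String tuple) {
        pending.add(tuple);
    }

    // The subprocess asked for a delay: pause sending until now + delayMillis.
    // A later, shorter request never shortens a pause that is already in effect.
    void onDelay(long nowMillis, long delayMillis) {
        resumeAtMillis = Math.max(resumeAtMillis, nowMillis + delayMillis);
    }

    // Returns the next tuple to send, or null if the writer is paused
    // or nothing is pending.
    String pollNext(long nowMillis) {
        if (nowMillis < resumeAtMillis) {
            return null;
        }
        return pending.poll();
    }

    public static void main(String[] args) {
        DelayAwareWriterSketch w = new DelayAwareWriterSketch();
        w.enqueue("a");
        w.enqueue("b");
        System.out.println(w.pollNext(0));   // sent before any delay request
        w.onDelay(0, 100);                   // subprocess requests a 100 ms pause
        System.out.println(w.pollNext(50));  // still paused
        System.out.println(w.pollNext(100)); // pause is over
    }
}
```

A count-based unit would replace the resume time with a counter of tuples to
skip; which unit is better is left open in the description above.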
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)