[ 
https://issues.apache.org/jira/browse/STORM-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942215#comment-16942215
 ] 

Evgheni Melman commented on STORM-3514:
---------------------------------------

I use acking mostly for throttling the spout, as whole-topology propagation 
varies from fractions of a millisecond to multiple minutes, yet the spout 
produces tuples at millions per second if left unthrottled. Most messaging is 
intra-worker, so network-related issues shouldn't be a problem. Tuples are 
failed manually on expected errors; besides that, the topology is 
intentionally fail-fast.

 

What I'm seeing is a complete stall of the spout within the first minute of 
running. Looking at emitted tuples, I see that initially everything works fine 
until the propagation time exceeds "topology.message.timeout.secs", at which 
point the __acker stops acking to the spout, even though bolts keep executing 
and acking to the __acker. What I'm left with is a tuple that has been 
processed and acked in the bolt, but not acked to the spout. As I have timeouts 
disabled, the spout does not re-emit (which is not even needed, as the tuple 
was already processed and acked), but as the spout didn't receive the ack from 
the __acker, it keeps waiting for the queue to free up (which already happened, 
as the tuple was processed and acked by the bolt). Separately, setting 
"topology.message.timeout.secs" to a non-default and much higher value lets the 
topology run as it should. This all tells me that there is a bug in the __acker 
code that prevents it from acking tuples that exceed 
"topology.message.timeout.secs" even when "topology.enable.message.timeouts" is 
set to false.
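
The stall described above can be illustrated with a toy model. This is a 
sketch and not Storm code: the function `simulate`, its parameters, and the 
tick-based clock are all hypothetical, and it simply assumes the suspected 
behaviour, namely that the acker stops tracking a tuple once it is older than 
the timeout, regardless of whether spout-side timeouts are enabled.

```python
def simulate(latencies, max_pending, acker_timeout, spout_timeouts_enabled, ticks):
    """Toy model of a spout throttled by a max-pending limit (not Storm code).

    latencies: processing latency, in ticks, of each tuple in emit order.
    Assumption: the acker forgets a tuple once it is older than acker_timeout,
    so a bolt ack arriving later is never forwarded to the spout.
    """
    pending = {}  # tuple id -> (emit tick, latency)
    emitted = acked = failed = 0

    for tick in range(ticks):
        for tid, (emit_tick, latency) in list(pending.items()):
            age = tick - emit_tick
            if latency <= acker_timeout and age >= latency:
                # Bolts finished while the acker still tracked the tuple.
                del pending[tid]
                acked += 1
            elif age > acker_timeout and spout_timeouts_enabled:
                # Spout-side timeout fails the tuple and frees its slot.
                del pending[tid]
                failed += 1
            # Otherwise the slot stays occupied: the tuple is neither
            # acked nor failed from the spout's point of view.

        # The spout emits only while it is below the max-pending limit.
        while len(pending) < max_pending and emitted < len(latencies):
            pending[emitted] = (tick, latencies[emitted])
            emitted += 1

    return {"emitted": emitted, "acked": acked,
            "failed": failed, "pending": len(pending)}


if __name__ == "__main__":
    slow_in_the_middle = [1, 1, 50, 50, 1, 1]  # hypothetical latencies
    # Timeouts disabled, short acker timeout: the spout stalls.
    print(simulate(slow_in_the_middle, 2, 30, False, 200))
    # Same run with a much higher timeout value: every tuple is acked.
    print(simulate(slow_in_the_middle, 2, 600, False, 200))
```

In this model, the first run ends with two tuples permanently pending and the 
last two never emitted, while raising the timeout lets all six tuples 
complete, which matches the workaround mentioned above.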

> "topology.enable.message.timeouts: false" has no effect on ackers
> -----------------------------------------------------------------
>
>                 Key: STORM-3514
>                 URL: https://issues.apache.org/jira/browse/STORM-3514
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 1.2.3
>            Reporter: Evgheni Melman
>            Priority: Major
>
> "topology.enable.message.timeouts: false" does prevent tuples from being 
> failed if not acked in "topology.message.timeout.secs" seconds, but it still 
> prevents __ackers from acking anchored tuples to the spout. When used with 
> "topology.max.spout.pending" this effectively stalls the spout completely as 
> the tuple is neither failed, nor acked.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
