[ https://issues.apache.org/jira/browse/STORM-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507304#comment-15507304 ]

Stig Rohde Døssing commented on STORM-2106:
-------------------------------------------

Are you sure this is an issue? I get that the spout will restart at the failed
offset if it crashes, or if partitions are reassigned and another spout has to
restart from the last committed offset, but as far as I can tell it should
progress past the failed tuple once that tuple is either re-emitted or delayed
by the retry service. doSeekRetriableTopicPartitions only seeks on partitions
that have failed tuples ready for retry (i.e. whose retry delays have expired).
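
For reference, the seek logic is roughly this (a paraphrased sketch, not the
exact source; earliestRetriableOffset is a hypothetical stand-in for the
spout's real offset bookkeeping):

{code:java}
// Paraphrased sketch of KafkaSpout.doSeekRetriableTopicPartitions, simplified.
// retryService.retriableTopicPartitions() only returns partitions with failed
// tuples whose backoff has expired, so other partitions are never seeked back.
private void doSeekRetriableTopicPartitions() {
    for (TopicPartition tp : retryService.retriableTopicPartitions()) {
        // earliestRetriableOffset is a hypothetical stand-in for the real
        // bookkeeping that finds the failed offset to replay (x below).
        kafkaConsumer.seek(tp, earliestRetriableOffset(tp));
    }
}
{code}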

If you have a sequence of offsets x...y where x has failed, x-1 was committed,
and everything up to y has been acked, doSeekRetriableTopicPartitions should
seek to x and retry it. The offsets between x and y will then be skipped by
emitTupleIfNotEmitted because they have already been acked. On the next call to
nextTuple after skipping ]x...y[ (the offsets strictly between x and y), the
spout should poll Kafka again at y, because waitingToEmit is empty (x was
emitted and the others were already acked). If x gets acked, there's no
problem. If x fails again and the retry service decides to delay it,
doSeekRetriableTopicPartitions won't seek back to x until the backoff expires;
in the meantime new offsets get processed. When x's retry delay expires, the
spout will simply seek back to x, emit it again, and skip past ]x...y...z[
(z being the last offset it managed to ack before seeking back to x).
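
To make that concrete, the two pieces of spout logic the walk-through relies
on look roughly like this (paraphrased and simplified; helper names such as
isAcked, isEmitted and hasWaitingToEmit are illustrative stand-ins for the
spout's internal bookkeeping, not the actual method names):

{code:java}
// 1. nextTuple() only polls Kafka when nothing is left waiting to emit, which
//    is why the spout polls again at y after skipping ]x...y[.
public void nextTuple() {
    if (commitTimerExpired()) {
        commitOffsetsForAckedTuples();
    }
    if (!hasWaitingToEmit()) {
        setWaitingToEmit(pollKafkaBroker());  // seeks retriable partitions first
    }
    emitNextWaitingTuple();  // calls emitTupleIfNotEmitted per record
}

// 2. Records that were already acked (or are still outstanding) are skipped
//    instead of being re-emitted after the seek back to x.
private void emitTupleIfNotEmitted(ConsumerRecord<String, String> record) {
    TopicPartition tp = new TopicPartition(record.topic(), record.partition());
    if (isAcked(tp, record.offset()) || isEmitted(tp, record.offset())) {
        return;  // acked on the first pass through ]x...y[: skip it
    }
    collector.emit(new Values(record.key(), record.value()),
            new KafkaSpoutMessageId(record));  // tracked for ack/fail
}
{code}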

There are a few other factors in play here like maxUncommittedOffsets, but I 
don't really see the issue you're describing.
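
For completeness, those knobs are set on the spout config; a minimal sketch,
assuming the storm-kafka-client 1.x builder API (the broker address and topic
name are placeholders, and exact method names vary by version):

{code:java}
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff;
import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff.TimeInterval;
import org.apache.storm.kafka.spout.KafkaSpoutRetryService;

// Exponential backoff: the retry delay grows per attempt, capped at max delay.
KafkaSpoutRetryService retry = new KafkaSpoutRetryExponentialBackoff(
        TimeInterval.milliSeconds(500),  // initial delay
        TimeInterval.milliSeconds(200),  // delay period (growth step)
        Integer.MAX_VALUE,               // max retries
        TimeInterval.seconds(10));       // max delay

KafkaSpoutConfig<String, String> conf =
        KafkaSpoutConfig.builder("broker:9092", "my-topic")
                .setMaxUncommittedOffsets(250)  // cap on offsets past the commit point
                .setRetry(retry)
                .build();
{code}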

> Storm Kafka Client is paused while failed tuples are replayed
> -------------------------------------------------------------
>
>                 Key: STORM-2106
>                 URL: https://issues.apache.org/jira/browse/STORM-2106
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Jeff Fenchel
>
> With the changes in STORM-2087, the Kafka 0.10 spout will be limited to 
> emitting tuples that are within the poll() size for Kafka. This means that 
> if the first tuple in a batch from Kafka fails, the spout will not emit 
> more than the size of the batch from Kafka until the tuple is either 
> processed successfully or given up on. This behavior is exacerbated by the 
> exponential backoff retry policy.
> There probably needs to be bookkeeping for the next emittable offset.


