Stig Rohde Døssing commented on STORM-2106:

[~jfenc91] I don't think it works that way. doSeekRetriableTopicPartitions only 
seeks to the committed offset if retryService.retriableTopicPartitions returns 
x's TopicPartition (otherwise the for-loop doesn't cover x's TopicPartition), 
and it only does so when there are failed tuples ready for retry on that 
TopicPartition. Once x is ready for retry, 
retryService.retriableTopicPartitions returns x's TopicPartition, the consumer 
seeks back to x and is polled, and x should then be emitted. When x is 
emitted, emitTupleIfNotEmitted removes its messageId from retryService, which 
should prevent retryService.retriableTopicPartitions from returning x's 
TopicPartition again until x (or some other message on that TopicPartition) 
fails.
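To make the flow above concrete, here is a minimal sketch of the retry bookkeeping as I understand it. RetryService, TopicPartition, and the method names are simplified stand-ins, not the real storm-kafka-client classes:

```java
import java.util.*;

// Sketch of the retry flow: a partition is only returned (and therefore only
// seeked back) while it has a failed offset pending retry; emitting the tuple
// removes it again. Illustrative only, not the actual spout implementation.
class RetrySketch {
    // A TopicPartition is just topic + partition id here.
    record TopicPartition(String topic, int partition) {}

    // Tracks offsets that failed and are ready for retry, per partition.
    static class RetryService {
        private final Map<TopicPartition, Set<Long>> failed = new HashMap<>();

        // A tuple failed; mark its offset as ready for retry.
        void schedule(TopicPartition tp, long offset) {
            failed.computeIfAbsent(tp, k -> new HashSet<>()).add(offset);
        }

        // Only partitions with at least one offset ready for retry are
        // returned, so doSeekRetriableTopicPartitions skips everything else.
        Set<TopicPartition> retriableTopicPartitions() {
            Set<TopicPartition> ready = new HashSet<>();
            for (var e : failed.entrySet())
                if (!e.getValue().isEmpty()) ready.add(e.getKey());
            return ready;
        }

        // Called when emitTupleIfNotEmitted re-emits the tuple.
        void remove(TopicPartition tp, long offset) {
            Set<Long> s = failed.get(tp);
            if (s != null) s.remove(offset);
        }
    }

    public static void main(String[] args) {
        TopicPartition tp = new TopicPartition("events", 0);
        RetryService retry = new RetryService();

        // No failures: the partition is not seeked back.
        assert retry.retriableTopicPartitions().isEmpty();

        // Offset 42 fails and becomes ready for retry: the partition is
        // returned, so the consumer would seek and re-poll it.
        retry.schedule(tp, 42L);
        assert retry.retriableTopicPartitions().contains(tp);

        // The tuple is re-emitted and removed from the retry service, so the
        // partition stops being seeked until another failure occurs.
        retry.remove(tp, 42L);
        assert retry.retriableTopicPartitions().isEmpty();

        System.out.println("ok");
    }
}
```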

While x is still processing, the spout shouldn't seek back to x (and doesn't, 
as far as I can tell). Is it possible that you were limited by something else, 
such as maxSpoutPending, retrying a large number of tuples, or partition 
reassignments/worker restarts? 


> Storm Kafka Client is paused while failed tuples are replayed
> -------------------------------------------------------------
>                 Key: STORM-2106
>                 URL: https://issues.apache.org/jira/browse/STORM-2106
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Jeff Fenchel
> With the changes in STORM-2087, the kafka 10 spout will be limited to 
> emitting tuples that are within the poll() size for kafka. This means that 
> if the first tuple in a batch from kafka fails, the spout will not emit 
> more than the size of that batch until the tuple is either processed 
> successfully or given up on. This behavior is exacerbated by the exponential 
> backoff retry policy.  
> There probably needs to be bookkeeping for the next emittable offset.
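The stall the issue describes can be sketched as follows: if the consumer always seeks back to the first failed offset before polling, nothing past that offset plus the poll size can ever be emitted while the tuple is pending. The poll size and offsets below are illustrative, not taken from the real spout:

```java
// Sketch of the stall: while a failed tuple is pending retry, every poll
// restarts from its offset, so emission is capped at failedOffset + pollSize.
// pollSize stands in for max.poll.records; the numbers are illustrative.
class PollWindowSketch {
    public static void main(String[] args) {
        final int pollSize = 5;      // stand-in for max.poll.records
        final long failedOffset = 0; // first tuple in the batch has failed

        long highestEmitted = -1;
        for (int round = 0; round < 100; round++) {
            // Each poll seeks back to the failed offset while it is pending.
            long start = failedOffset;
            for (long offset = start; offset < start + pollSize; offset++) {
                highestEmitted = Math.max(highestEmitted, offset);
            }
        }
        // Even after 100 polls, no offset beyond the poll window is emitted.
        assert highestEmitted == failedOffset + pollSize - 1;
        System.out.println("highest emitted offset: " + highestEmitted);
    }
}
```

With an exponential backoff policy the partition also sits idle between these polls, which is why the issue calls out bookkeeping for the next emittable offset.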

This message was sent by Atlassian JIRA
