[
https://issues.apache.org/jira/browse/STORM-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270352#comment-14270352
]
Adrian Seungjin Lee commented on STORM-618:
-------------------------------------------
What I intended was the case when a tuple times out, as [~kabhwan] said. Our
team also uses an always-ack strategy to avoid tuple failures, but when tuples
time out repeatedly due to external failures, the kafka spout just halts, even
after the corresponding external failures recover.
I don't think this is a bug either, but it would be better if the kafka spout
had a way to avoid this. Sorry for having marked it with the wrong type, 'bug'.
> Add spoutconfig option to make kafka spout process messages at most once.
> --------------------------------------------------------------------------
>
> Key: STORM-618
> URL: https://issues.apache.org/jira/browse/STORM-618
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-kafka
> Affects Versions: 0.9.3
> Reporter: Adrian Seungjin Lee
>
> While it's nice for the kafka spout to push a failed tuple back into a sorted
> set and try to process it again, this way of guaranteeing message processing
> sometimes makes the situation pretty bad when a failed tuple repeatedly fails
> in downstream bolts, since the PartitionManager#fill method tries to fetch
> from that offset repeatedly.
> This is the corresponding code snippet:
> private void fill() {
>     ...
>     if (had_failed) {
>         offset = failed.first();
>     } else {
>         offset = _emittedToOffset;
>     }
>     ...
>     msgs = KafkaUtils.fetchMessages(_spoutConfig, _consumer, _partition, offset);
>     ...
> }
> So there should be an option for a developer to decide whether to process a
> failed tuple again or just skip it. One of the best things about Storm is
> that a spout, together with Trident, can be implemented to guarantee
> at-least-once, exactly-once, and at-most-once message processing.
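The offset-selection logic quoted above can be sketched in isolation to show how the proposed spoutconfig option might work. This is a minimal illustration, not Storm's actual implementation: the class name, the `retryFailedTuples` flag, and the method names are all hypothetical, chosen only to mirror the `had_failed` / `failed.first()` / `_emittedToOffset` behavior in the snippet.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch of PartitionManager's offset choice with an
// at-most-once option. All names here are illustrative, not Storm API.
public class OffsetSelectionSketch {
    private final SortedSet<Long> failed = new TreeSet<>();
    private long emittedToOffset = 0L;
    private final boolean retryFailedTuples; // the proposed spoutconfig flag

    public OffsetSelectionSketch(boolean retryFailedTuples) {
        this.retryFailedTuples = retryFailedTuples;
    }

    // Called when a tuple fails (or times out).
    public void fail(long offset) {
        if (retryFailedTuples) {
            failed.add(offset); // current behavior: queue offset for refetch
        }
        // at-most-once mode: drop the offset so fill() never revisits it
    }

    // Called as tuples are emitted; advances the high-water mark.
    public void emitted(long offset) {
        emittedToOffset = Math.max(emittedToOffset, offset + 1);
    }

    // Mirrors the quoted fill() logic: refetch the earliest failed offset
    // if any are pending, otherwise continue from the emitted high-water mark.
    public long nextFetchOffset() {
        return failed.isEmpty() ? emittedToOffset : failed.first();
    }

    public static void main(String[] args) {
        OffsetSelectionSketch atLeastOnce = new OffsetSelectionSketch(true);
        atLeastOnce.emitted(9);
        atLeastOnce.fail(5);
        System.out.println(atLeastOnce.nextFetchOffset()); // 5: stuck refetching

        OffsetSelectionSketch atMostOnce = new OffsetSelectionSketch(false);
        atMostOnce.emitted(9);
        atMostOnce.fail(5);
        System.out.println(atMostOnce.nextFetchOffset()); // 10: skips the failure
    }
}
```

With the flag off, a persistently failing offset is simply dropped, so fill() never loops on it and the partition keeps advancing; the trade-off is that the skipped message is lost, which is exactly the at-most-once semantics the issue asks to make configurable.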
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)