[
https://issues.apache.org/jira/browse/STORM-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182540#comment-14182540
]
ASF GitHub Bot commented on STORM-495:
--------------------------------------
Github user rick-kilgore commented on the pull request:
https://github.com/apache/storm/pull/254#issuecomment-60354121
Hi @wurstmeister. I like the way yours encapsulates the collections into
the separate FailedMessageHandler class. That's definitely cleaner.
I have two comments:
1. Exponential backoff for retries seems to be pretty universal, and you can
achieve both the current behavior (instant retries) and constant-time retries
by choosing appropriate parameters for the exponential backoff (a short sketch
of this follows below). So I'm not sure that defining an interface and
creating multiple implementations is worth it for this problem.
2. Although I can imagine that you might want to set a maxRetries value, it
definitely should be possible to tell the spout to retry indefinitely. The
reason is that when you stop retrying, you generally want to log the error
that is happening and probably respond in some application-specific way. So
the act of terminating the retries is really best left to the
application-specific bolt in the topology where the error is occurring. In my
setup, each of my bolts keeps track of the retries it has caused, and after a
certain number of tries it saves some information to an error kafka queue and
acks the message (see the second sketch below).
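To make point 1 concrete, here is a minimal sketch of what I mean. This is
illustrative only, not the API proposed in this pull request; the class name
and parameters are made up:

```java
// Illustrative sketch: one exponential-backoff formula covers instant retries,
// constant-delay retries, and true exponential backoff, depending on parameters.
public final class RetryDelayCalculator {
    private final long initialDelayMs;   // delay before the first retry
    private final double multiplier;     // growth factor per attempt
    private final long maxDelayMs;       // cap on the delay

    public RetryDelayCalculator(long initialDelayMs, double multiplier, long maxDelayMs) {
        this.initialDelayMs = initialDelayMs;
        this.multiplier = multiplier;
        this.maxDelayMs = maxDelayMs;
    }

    /** Delay before retry number {@code attempt} (the first retry is attempt 0). */
    public long delayMs(int attempt) {
        double delay = initialDelayMs * Math.pow(multiplier, attempt);
        return (long) Math.min(delay, maxDelayMs);
    }

    public static void main(String[] args) {
        // (0, 1.0, 0)       -> instant retries, the current KafkaSpout behavior
        // (1000, 1.0, 1000) -> a constant 1 s delay between retries
        // (100, 2.0, 60000) -> 100 ms, 200 ms, 400 ms, ... capped at 60 s
        RetryDelayCalculator backoff = new RetryDelayCalculator(100, 2.0, 60_000);
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println("attempt " + attempt + " -> " + backoff.delayMs(attempt) + " ms");
        }
    }
}
```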
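And for point 2, a rough sketch of the kind of bolt I described: it tracks its
own retry count per message, fails the tuple while attempts remain, and once
the limit is hit emits to an error stream (which another bolt writes to our
error kafka topic) and acks. The process() call, stream name, and field layout
are assumptions for illustration, not code from this pull request:

```java
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

import java.util.HashMap;
import java.util.Map;

public class RetryLimitingBolt extends BaseRichBolt {
    private static final int MAX_ATTEMPTS = 5;

    private OutputCollector collector;
    private Map<String, Integer> attempts;   // message payload -> failures seen so far

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        this.attempts = new HashMap<String, Integer>();
    }

    @Override
    public void execute(Tuple input) {
        String msg = input.getString(0);   // assumes the spout emits the raw message first
        try {
            process(msg);                  // hypothetical call to the downstream service
            attempts.remove(msg);
            collector.ack(input);
        } catch (Exception recoverable) {
            int tried = attempts.containsKey(msg) ? attempts.get(msg) + 1 : 1;
            if (tried < MAX_ATTEMPTS) {
                attempts.put(msg, tried);
                collector.fail(input);     // let the spout schedule a retry
            } else {
                attempts.remove(msg);
                // give up: record the failure on the error stream, then ack
                collector.emit("error", input, new Values(msg, recoverable.toString()));
                collector.ack(input);
            }
        }
    }

    private void process(String msg) throws Exception {
        // placeholder for the application-specific downstream call that may throw
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declareStream("error", new Fields("msg", "reason"));
    }
}
```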
> Add delayed retries to KafkaSpout
> ---------------------------------
>
> Key: STORM-495
> URL: https://issues.apache.org/jira/browse/STORM-495
> Project: Apache Storm
> Issue Type: Improvement
> Affects Versions: 0.9.3
> Environment: all environments
> Reporter: Rick Kilgore
> Priority: Minor
> Labels: kafka, retry
>
> If a tuple in the topology originates from the KafkaSpout in the
> external/storm-kafka module, and a bolt in the topology indicates a
> failure by calling fail() on its OutputCollector, the KafkaSpout will
> immediately retry the message.
> We wish to use this failure and retry behavior in our ingestion system
> whenever we experience a recoverable error from a downstream system, such as
> a 500 or 503 error from a service we depend on. But with the current
> KafkaSpout behavior, doing so results in a tight loop where we retry several
> times over a few seconds and then give up. I want to be able to delay retries
> to give the downstream service some time to recover. Ideally, I would like
> to have configurable, exponential backoff retry.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)