Very nice discussion!

I have also been wanting to see a feature along the lines of Ravi's
comment above:

"*There is one thing i am looking forward from Storm is to inform Spout
about what kind of failure it was*. i.e. if it was ConnectionTimeout or
ReadTimeout etc, that means if i retry it may pass. But say it was null
pointer exception(java world) , i know the data which is being expected is
not there and my code is not handling that scenario, so either i will have
to change code or ask data provider to send that field, but retry wont help
me."


I think we need:

1) A new method *failWithoutRetry(Tuple, Exception)* in the collector.
2) The ability to *configure a dead-letter data-store in the spout*
for failed messages reported by #1 above.

The configurable data-store should support Kafka, Solr and Redis to
begin with (plus the option to implement one's own and drop a jar file
in the classpath).
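To make #2 concrete, here is a rough sketch of what the pluggable store could look like. All the names here (DeadLetterStore, record, InMemoryDeadLetterStore) are hypothetical; nothing like this exists in Storm today:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical SPI for a pluggable dead-letter store. A Kafka, Solr or
// Redis implementation would live in its own jar on the classpath.
interface DeadLetterStore {
    void record(List<Object> tupleValues, Exception cause);
}

// Trivial in-memory implementation, just to illustrate the contract.
class InMemoryDeadLetterStore implements DeadLetterStore {
    private final List<String> entries = new ArrayList<>();

    @Override
    public void record(List<Object> tupleValues, Exception cause) {
        // Keep the full tuple payload plus the failure class, so the
        // message can be replayed later and alerts can key on the cause.
        entries.add(tupleValues + " failed with " + cause.getClass().getSimpleName());
    }

    public List<String> entries() {
        return Collections.unmodifiableList(entries);
    }
}

public class DeadLetterSketch {
    public static void main(String[] args) {
        InMemoryDeadLetterStore store = new InMemoryDeadLetterStore();
        // This is roughly what collector.failWithoutRetry(tuple, e) would do
        // internally: hand the tuple's values and the exception to the
        // configured store instead of scheduling a replay.
        store.record(List.of("user-123", "click"), new NullPointerException());
        System.out.println(store.entries());
    }
}
```

Since the interface only sees the tuple values and the exception, each backing store is free to serialize them however it likes (JSON document in Solr, key-value entry in Redis, message on a Kafka dead-letter topic).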

Such a feature would benefit all the spouts.

Benefits:
1) Topologies will not get stuck replaying the same doomed-to-fail tuples.
2) Users can set alerts on dead-letters and easily find the actual problems
in their topologies, rather than analyzing all failed tuples only to discover
that they failed because of a temporary network glitch.
3) Since the entire Tuple is put into the dead-letter store, all the data is
available for retrying after the topology code is fixed.
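On the bolt side, Ravi's distinction between transient and permanent failures could then drive the choice between fail() and the proposed failWithoutRetry(). A minimal sketch; the set of exception classes treated as retryable is illustrative only:

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;

public class FailureClassifier {
    // Transient failures (connection refused, read timeouts) are worth
    // retrying; anything else (e.g. a NullPointerException from a missing
    // field) will fail again on every replay and should go straight to
    // the dead-letter store instead.
    static boolean isRetryable(Exception e) {
        return e instanceof ConnectException
            || e instanceof SocketTimeoutException;
    }

    public static void main(String[] args) {
        System.out.println(isRetryable(new SocketTimeoutException())); // transient
        System.out.println(isRetryable(new NullPointerException()));   // permanent
    }
}
```

A bolt's execute() could then call collector.fail(tuple) when isRetryable(e) is true and the proposed collector.failWithoutRetry(tuple, e) otherwise.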

Thx,
SG



On Wed, Sep 14, 2016 at 7:25 AM, Hart, James W. <[email protected]> wrote:

> In my testing when a tuple was replayed by a spout, every kafka message
> from the replayed one to the end was replayed.  That’s why all bolts need
> to be idempotent so that replays do not cause work to be done twice.  I
> think it has to do with kafka tracking the offset of the last acked message
> in a topic, not the actual ack of every message individually.  This is a
> simplistic view as it’s a lot more complicated than this.
>
>
>
> If anybody can confirm this, please respond, as it was a surprise to me and
> caused me a couple of days of testing when I encountered it.
>
>
>
>
>
> *From:* [email protected] [mailto:[email protected]]
> *Sent:* Tuesday, September 13, 2016 9:22 PM
>
> *To:* user
> *Subject:* Re: Re: How will storm replay the tuple tree?
>
>
>
> Yes, only the failed tuples are replayed, but the whole batch will be held.
>
>
>
> So, if a tuple fails forever, will the batch be held forever?
>
>
>
> I am just not clear whether the tuple itself or the batch which owns the
> tuple will be held in the spout.
>
>
>
>
> ------------------------------
>
> Josh
>
>
>
>
>
>
>
> *From:* Ambud Sharma <[email protected]>
>
> *Date:* 2016-09-14 09:10
>
> *To:* user <[email protected]>
>
> *Subject:* Re: Re: How will storm replay the tuple tree?
>
> No, as per the code only individual messages are replayed.
>
>
>
> On Sep 13, 2016 6:09 PM, "[email protected]" <[email protected]>
> wrote:
>
> Hi:
>
>
>
> I'd like to clarify something about the Kafka spout with regard to acking.
>
>
>
> For example, suppose the kafka-spout fetches offsets 5000-6000 from the
> Kafka server, but one tuple at offset 5101 is failed by a bolt. The whole
> batch of 5000-6000 will remain in the kafka-spout until the tuple at 5101
> is acked. If that tuple cannot be acked for a long time, the batch
> 5000-6000 will remain for a long time, and the kafka-spout will stop
> fetching data from Kafka during that time.
>
>
>
> Am I right?
>
>
>
>
> ------------------------------
>
> Josh
>
>
>
>
>
>
>
> *From:* Tech Id <[email protected]>
>
> *Date:* 2016-09-14 06:26
>
> *To:* user <[email protected]>
>
> *Subject:* Re: How will storm replay the tuple tree?
>
> I agree with this statement about code/architecture, but during some
> system outages, such as one of the end-points (Solr, Couchbase,
> Elastic-Search etc.) being down temporarily, a very large number of
> otherwise fully-functional and healthy systems will receive a large number
> of duplicate replays (especially in high-throughput topologies).
>
>
>
> If you can elaborate a little more on the performance cost of tracking
> tuples or point to a document reflecting the same, that will be of great
> help.
>
>
>
> Best,
>
> T.I.
>
>
>
> On Tue, Sep 13, 2016 at 12:26 PM, Hart, James W. <[email protected]> wrote:
>
> Failures should be very infrequent, if they are not then rethink the code
> and architecture.  The performance cost of tracking tuples in the way that
> would be required to replay at the failure is large, basically that method
> would slow everything way down for very infrequent failures.
>
>
>
> *From:* S G [mailto:[email protected]]
> *Sent:* Tuesday, September 13, 2016 3:17 PM
> *To:* [email protected]
> *Subject:* Re: How will storm replay the tuple tree?
>
>
>
> Hi,
>
>
>
> I am a little curious to know why we begin at the spout level for case 1.
>
> If we replay at the failing bolt's parent level (BoltA in this case), then
> it should be more performant due to a decrease in duplicate processing (as
> compared to whole tuple tree replays).
>
>
>
> If BoltA itself crashes for some reason while replaying, only then should
> the Spout receive this as a failure and the whole tuple tree be replayed.
>
>
>
> This saving in duplicate processing will be more visible with several
> layers of bolts.
>
>
>
> I am sure there is a good reason to replay the whole tuple-tree, and I
> would like to understand what it is.
>
>
>
> Thanks
>
> SG
>
>
>
> On Tue, Sep 13, 2016 at 10:22 AM, P. Taylor Goetz <[email protected]>
> wrote:
>
> Hi Cheney,
>
>
>
> Replays happen at the spout level. So if there is a failure at any point
> in the tuple tree (the tuple tree being the anchored emits, unanchored
> emits don’t count), the original spout tuple will be replayed. So the
> replayed tuple will traverse the topology again, including unanchored
> points.
>
>
>
> If an unanchored tuple fails downstream, it will not trigger a replay.
>
>
>
> Hope this helps.
>
>
>
> -Taylor
>
>
>
>
>
> On Sep 13, 2016, at 4:42 AM, Cheney Chen <[email protected]> wrote:
>
>
>
> Hi there,
>
>
>
> We're using storm 1.0.1, and I'm checking through
> http://storm.apache.org/releases/1.0.1/Guaranteeing-message-processing.html
>
>
>
> Got questions for below two scenarios.
>
> Assume topology: S (spout) --> BoltA --> BoltB
>
> 1. S: anchored emit, BoltA: anchored emit
>
> Suppose BoltB's processing fails (w/ acking enabled); what will the replay
> be: will it execute both BoltA and BoltB, or only the failed BoltB
> processing?
>
>
>
> 2. S: anchored emit, BoltA: unanchored emit
>
> Suppose BoltB's processing fails (w/ acking enabled); replay will not happen, correct?
>
>
>
> --
>
> Regards,
> Qili Chen (Cheney)
>
> E-mail: [email protected]
> MP: (+1) 4086217503
>
>
>
>
>
>
>
>
