I agree with this statement about code/architecture, but during some system outages, such as one of the endpoints (Solr, Couchbase, Elasticsearch, etc.) being temporarily down, a very large number of other fully functional, healthy systems will receive many duplicate replays (especially in high-throughput topologies).
If you can elaborate a little more on the performance cost of tracking tuples or point to a document reflecting the same, that will be of great help.

Best,
T.I.

On Tue, Sep 13, 2016 at 12:26 PM, Hart, James W. <[email protected]> wrote:

> Failures should be very infrequent, if they are not then rethink the code
> and architecture. The performance cost of tracking tuples in the way that
> would be required to replay at the failure is large, basically that method
> would slow everything way down for very infrequent failures.
>
> *From:* S G [mailto:[email protected]]
> *Sent:* Tuesday, September 13, 2016 3:17 PM
> *To:* [email protected]
> *Subject:* Re: How will storm replay the tuple tree?
>
> Hi,
>
> I am a little curious to know why we begin at the spout level for case 1.
>
> If we replay at the failing bolt's parent level (BoltA in this case), then
> it should be more performant due to a decrease in duplicate processing (as
> compared to whole tuple tree replays).
>
> If BoltA crashes due to some reason while replaying, only then the Spout
> should receive this as a failure and whole tuple tree should be replayed.
>
> This saving in duplicate processing will be more visible with several
> layers of bolts.
>
> I am sure there is a good reason to replay the whole tuple-tree, and want
> to know the same.
>
> Thanks
> SG
>
> On Tue, Sep 13, 2016 at 10:22 AM, P. Taylor Goetz <[email protected]>
> wrote:
>
> Hi Cheney,
>
> Replays happen at the spout level. So if there is a failure at any point
> in the tuple tree (the tuple tree being the anchored emits, unanchored
> emits don’t count), the original spout tuple will be replayed. So the
> replayed tuple will traverse the topology again, including unanchored
> points.
>
> If an unanchored tuple fails downstream, it will not trigger a replay.
>
> Hope this helps.
>
> -Taylor
>
> On Sep 13, 2016, at 4:42 AM, Cheney Chen <[email protected]> wrote:
>
> Hi there,
>
> We're using storm 1.0.1, and I'm checking through
> http://storm.apache.org/releases/1.0.1/Guaranteeing-message-processing.html
>
> Got questions for below two scenarios.
>
> Assume topology: S (spout) --> BoltA --> BoltB
>
> 1. S: anchored emit, BoltA: anchored emit
>
> Suppose BoltB processing failed w/ ack, what will the replay be, will it
> execute both BoltA and BoltB or only failed BoltB processing?
>
> 2. S: anchored emit, BoltA: unanchored emit
>
> Suppose BoltB processing failed w/ ack, replay will not happen, correct?
>
> --
> Regards,
> Qili Chen (Cheney)
>
> E-mail: [email protected]
> MP: (+1) 4086217503
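
On the tracking-cost question at the top of the thread: the Guaranteeing Message Processing page linked above describes the acker as keeping one 64-bit XOR checksum per spout tuple instead of the tuple tree itself. Since no parent/child structure is stored, a failure can only be reported against the spout tuple. What follows is a toy Python model of that idea under simplifying assumptions, not Storm's implementation; `Acker`, `new_id`, and `run_tree` are hypothetical names, and the topology is the S --> BoltA --> BoltB one from Cheney's question.

```python
import random


def new_id():
    """Random non-zero 64-bit id, standing in for a tuple (edge) id."""
    return random.getrandbits(64) or 1


class Acker:
    """Toy acker: one XOR checksum per spout tuple, no tree structure."""

    def __init__(self):
        self.pending = {}  # root id -> XOR of ids emitted but not yet acked

    def update(self, root_id, value):
        """XOR value into the checksum; return True once the tree is complete."""
        checksum = self.pending.get(root_id, 0) ^ value
        if checksum == 0:
            self.pending.pop(root_id, None)
            return True
        self.pending[root_id] = checksum
        return False

    def fail(self, root_id):
        """Drop the entry; the only thing left to replay is the spout tuple."""
        self.pending.pop(root_id, None)


def run_tree(bolt_a_anchors, bolt_b_fails):
    """Push one spout tuple through S --> BoltA --> BoltB and report the outcome."""
    acker = Acker()
    root = new_id()

    # Spout: anchored emit to BoltA registers the first edge id.
    s_edge = new_id()
    acker.update(root, s_edge)

    # BoltA: emits to BoltB, then acks its input. An anchored emit folds the
    # new edge id into the ack; an unanchored emit leaves it out of the tree.
    b_edge = new_id()
    ack_value = s_edge ^ b_edge if bolt_a_anchors else s_edge
    complete = acker.update(root, ack_value)

    if not bolt_a_anchors:
        # The tree was already complete after BoltA's ack, so whatever
        # happens in BoltB cannot reach the spout.
        return "acked" if complete else "pending"

    # BoltB: either fails the tuple or acks it.
    if bolt_b_fails:
        acker.fail(root)
        return "replay from spout"
    return "acked" if acker.update(root, b_edge) else "pending"
```

In this model, `run_tree(True, True)` (scenario 1) returns `"replay from spout"`, while `run_tree(False, True)` (scenario 2) returns `"acked"` because BoltB's tuple was never part of the tree. The acker holds a single integer per spout tuple however deep the topology is. Replaying from BoltA would mean also retaining every intermediate tuple and its parent link until the tree completes.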
