Thanks Ambud, I did read some very good things about the acking mechanism in Storm, but I am not sure the post explains why point-to-point tracking is expensive.
Consider the example: Spout --> BoltA --> BoltB.
If BoltB fails, it will report the failure to the acker. If the acker can ask the Spout to replay, then why can't the acker ask the parent of BoltB to replay at this point? I don't think keeping track of a bolt's parent would be expensive.

On a related note, I am a little confused about the statement "When a new tupletree is born, the spout sends the XORed edge-ids of each tuple recipient, which the acker records in its pending ledger" in Acking-framework-implementation.html <http://storm.apache.org/releases/current/Acking-framework-implementation.html>.
How does the spout know beforehand which bolts will receive the tuple? Bolts forward tuples to other bolts based on groupings and dynamically generated fields. How does the spout know what fields will be generated and which bolts will receive the tuples? If it does not know that, then how does it send the XOR of each tuple recipient in a tuple's path, given that each tuple's path will be different (I think, not sure though)?

Thx,
T.I.

On Tue, Sep 13, 2016 at 6:37 PM, Ambud Sharma <[email protected]> wrote:

> Here is a post on it https://bryantsai.com/fault-tolerant-message-processing-in-storm/.
>
> Point to point tracking is expensive unless you are using transactions.
> Flume does point to point transfers using transactions.
>
> On Sep 13, 2016 3:27 PM, "Tech Id" <[email protected]> wrote:
>
>> I agree with this statement about code/architecture but in case of some
>> system outages, like one of the end-points (Solr, Couchbase, Elastic-Search
>> etc.) being down temporarily, a very large number of other fully-functional
>> and healthy systems will receive a large number of duplicate replays
>> (especially in heavy throughput topologies).
>>
>> If you can elaborate a little more on the performance cost of tracking
>> tuples or point to a document reflecting the same, that will be of great
>> help.
>>
>> Best,
>> T.I.
>>
>> On Tue, Sep 13, 2016 at 12:26 PM, Hart, James W. <[email protected]> wrote:
>>
>>> Failures should be very infrequent, if they are not then rethink the
>>> code and architecture. The performance cost of tracking tuples in the way
>>> that would be required to replay at the failure is large, basically that
>>> method would slow everything way down for very infrequent failures.
>>>
>>> *From:* S G [mailto:[email protected]]
>>> *Sent:* Tuesday, September 13, 2016 3:17 PM
>>> *To:* [email protected]
>>> *Subject:* Re: How will storm replay the tuple tree?
>>>
>>> Hi,
>>>
>>> I am a little curious to know why we begin at the spout level for case 1.
>>>
>>> If we replay at the failing bolt's parent level (BoltA in this case),
>>> then it should be more performant due to a decrease in duplicate processing
>>> (as compared to whole tuple tree replays).
>>>
>>> If BoltA crashes due to some reason while replaying, only then the Spout
>>> should receive this as a failure and whole tuple tree should be replayed.
>>>
>>> This saving in duplicate processing will be more visible with several
>>> layers of bolts.
>>>
>>> I am sure there is a good reason to replay the whole tuple-tree, and
>>> want to know the same.
>>>
>>> Thanks
>>>
>>> SG
>>>
>>> On Tue, Sep 13, 2016 at 10:22 AM, P. Taylor Goetz <[email protected]>
>>> wrote:
>>>
>>> Hi Cheney,
>>>
>>> Replays happen at the spout level. So if there is a failure at any point
>>> in the tuple tree (the tuple tree being the anchored emits, unanchored
>>> emits don’t count), the original spout tuple will be replayed. So the
>>> replayed tuple will traverse the topology again, including unanchored
>>> points.
>>>
>>> If an unanchored tuple fails downstream, it will not trigger a replay.
>>>
>>> Hope this helps.
>>>
>>> -Taylor
>>>
>>> On Sep 13, 2016, at 4:42 AM, Cheney Chen <[email protected]> wrote:
>>>
>>> Hi there,
>>>
>>> We're using storm 1.0.1, and I'm checking through
>>> http://storm.apache.org/releases/1.0.1/Guaranteeing-message-processing.html
>>>
>>> Got questions for below two scenarios.
>>>
>>> Assume topology: S (spout) --> BoltA --> BoltB
>>>
>>> 1. S: anchored emit, BoltA: anchored emit
>>>
>>> Suppose BoltB processing failed w/ ack, what will the replay be, will it
>>> execute both BoltA and BoltB or only failed BoltB processing?
>>>
>>> 2. S: anchored emit, BoltA: unanchored emit
>>>
>>> Suppose BoltB processing failed w/ ack, replay will not happen, correct?
>>>
>>> --
>>>
>>> Regards,
>>> Qili Chen (Cheney)
>>>
>>> E-mail: [email protected]
>>> MP: (+1) 4086217503
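
The ledger described in the sentence quoted from Acking-framework-implementation.html can be sketched roughly as follows. This is a minimal illustration in Java, not Storm's actual code: the class AckerSketch and its methods init, ack, fail and pendingCount are hypothetical names, and the edge ids are fixed constants where Storm would use random 64-bit values. The sketch assumes the spout XORs in only the ids of the tuples it emits itself, and that each bolt, when it acks, reports its input tuple's id XORed with the ids of the tuples it anchored to that input.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of XOR-based ack tracking; not Storm's real acker. */
class AckerSketch {
    // One 64-bit checksum per spout tuple (keyed by root id), whatever the tree size.
    private final Map<Long, Long> pending = new HashMap<>();

    /** Spout side: register a tree with the XOR of the edge ids the spout emitted. */
    void init(long rootId, long xorOfSpoutEdges) {
        pending.merge(rootId, xorOfSpoutEdges, (a, b) -> a ^ b);
    }

    /**
     * Bolt side: XOR in the input edge id and the ids of tuples anchored to it.
     * Returns true once every edge has been XORed in twice (checksum is zero).
     */
    boolean ack(long rootId, long inputEdge, long... anchoredEdges) {
        long delta = inputEdge;
        for (long edge : anchoredEdges) {
            delta ^= edge;
        }
        long value = pending.merge(rootId, delta, (a, b) -> a ^ b);
        if (value == 0L) {
            pending.remove(rootId); // tree fully processed
            return true;
        }
        return false;
    }

    /** Failure: forget the tree; only the root id is known, so only the spout can replay. */
    boolean fail(long rootId) {
        return pending.remove(rootId) != null;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        long root = 1L, spoutToA = 0x1111L, aToB = 0x2222L;

        acker.init(root, spoutToA);                        // Spout emits to BoltA
        boolean afterA = acker.ack(root, spoutToA, aToB);  // BoltA acks, anchoring a tuple to BoltB
        boolean afterB = acker.ack(root, aToB);            // BoltB acks, emitting nothing
        System.out.println("complete after BoltA: " + afterA);
        System.out.println("complete after BoltB: " + afterB);
    }
}
```

In this sketch the acker holds a single 64-bit value per spout tuple and never records which bolt produced which edge, so on a failure it has nothing but the root id to hand back. Replaying from BoltA instead would presumably require keeping the intermediate tuples (or at least their parent links) somewhere until the whole tree completes.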
