Thank you, Nathan and Kishore.

On Fri, Aug 7, 2015 at 4:37 PM, Nathan Leung wrote:
> It's even worse, you have information for both bolts sent twice, instead
> of information for one bolt sent once, so assuming same message size and
> same frequency of messages for both bolts you are sending 4x data. Use
> option 2.
>
> On Aug 7, 2015 1:18 PM, "Kishore Senji" wrote:
>
>> I also think option 2 is better. There is another reason for choosing
>> it besides the smaller payload that goes across. Today it could be
>> that bolt A splits the stream 1:1 for B and C. But if it later becomes
>> 1:2, for example, having a separate stream for C allows you to scale
>> bolt C (more parallelism) to improve the throughput. If you had only
>> one stream, then you could give 1 message to B and 2 messages (as a
>> list) to C, but there would be no way to scale C (even if you added
>> more parallelism, the throughput wouldn't improve, as each C task
>> would have to process the 2 messages serially).
>>
>> I do not think there is a cost to having more streams, so choosing the
>> second option might be better.
>>
>> On Fri, Aug 7, 2015 at 12:01 PM, Javier Gonzalez wrote:
>>
>>> Hi all,
>>>
>>> Suppose I have a bolt A that has to send information to two bolts B and
>>> C. Each bolt must receive different information from the original A bolt.
>>> Which of these strategies is more efficient?
>>>
>>> Strategy 1:
>>> - Have A declare a single output stream, with fields "forB" and "forC".
>>> - Emit all the information in a single tuple, putting the information
>>>   for Bolt B in "forB" and the information for Bolt C in "forC".
>>> - Have Bolt B and Bolt C subscribe to Bolt A's single output channel.
>>> - In the execute methods of Bolt B and Bolt C, read only the relevant
>>>   part of the input tuple.
>>>
>>> Strategy 2:
>>> - Have A declare two output streams, "streamB" and "streamC".
>>> - Emit one tuple with the information for Bolt B in streamB, and one
>>>   with the information for Bolt C in streamC.
>>> - Have each bolt subscribe only to its relevant stream.
>>> - Each bolt works as usual with its payload in its execute method.
>>>
>>> A priori I would think Strategy 2 is better (as we would be emitting
>>> smaller tuples), but I'm not sure if there's a hidden cost/benefit in
>>> having multiple subscribers to a single stream.
>>>
>>> Thank you,
>>> Javier

--
Javier González Nicolini
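
A rough way to check the data-volume argument from the thread is a small back-of-the-envelope model. The sketch below is plain Python, not Storm code. The function names, the 100-byte payload and the 1000 emits are made up for illustration. It assumes equal payload sizes for B and C, that every tuple actually crosses the wire, and that per-tuple overhead (stream id, serialization framing) can be ignored.

```python
def bytes_single_stream(size_for_b, size_for_c, n_emits, subscribers=2):
    """Strategy 1 (hypothetical model): one stream, each tuple carries both
    fields, and every subscriber receives a full copy of the tuple."""
    tuple_size = size_for_b + size_for_c
    return tuple_size * subscribers * n_emits


def bytes_two_streams(size_for_b, size_for_c, n_emits):
    """Strategy 2 (hypothetical model): one stream per consumer, so each
    bolt receives only its own payload."""
    return (size_for_b + size_for_c) * n_emits


# Illustrative numbers only: 100-byte payloads, 1000 emits from bolt A.
payload = 100
n = 1000

single = bytes_single_stream(payload, payload, n)
split = bytes_two_streams(payload, payload, n)

print("strategy 1 total bytes:", single)
print("strategy 2 total bytes:", split)
print("payload units per emit, strategy 1:", single // (payload * n))
print("payload units per emit, strategy 2:", split // (payload * n))
```

Under these assumptions, Strategy 1 moves four payload units per emit (both payloads, delivered to both bolts), while Strategy 2 moves two (one payload to each bolt). Each bolt therefore receives twice what it needs under Strategy 1, and the "4x" figure is relative to a single payload sent once to a single bolt.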
