I was doing some research on this topic. I checked the code a while back and from what I understood this grouping balances the tuples according to the current queue of each executor.
For example Bolt A have 4 executors 2 with a queue at 50% capacity and 2 at 40% capacity. The distribution would be proportional to "left capacity on the queue" (1 - capacity) so for executor 1 and 2 it would be: 0.5/(0.5*2 +0.6*2) . I think it was built taking heterogeneous clusters in consideration. In cases where some machines are faster than others shuffle grouping perform very poorly since it does not take into consideration that executors have different latencies. You might expect that slower machine will built a queue faster on its executors so this balancing would address this limitation. I did not develop or tested this so I can't assure you it will work like this, but I think this the idea behind it. 2016-04-26 10:46 GMT-05:00 Tom Brown <[email protected]>: > Hi Yamini, > > Your question piqued my interest, so I decided to see what I could figure > out. After reading this JIRA > https://issues.apache.org/jira/browse/STORM-162, I am still confused. > Hopefully someone else on the list will be able to help us. > > Is LoadAwareShuffleGrouping a way of changing a basic shuffle operation > (e.g. randomized round robin) to include downstream load as an input? Or is > it an extension of a fields grouping that could allow the same tuple to be > sent to different downstream tasks (A or B) depending on which has the > higher load? > > --Tom > > On Tue, Apr 26, 2016 at 8:53 AM, Yamini Joshi <[email protected]> > wrote: > >> Hello everyone! >> >> I am new to Storm and I was looking into different stream grouping when I >> came across LoadAwareShuffleGrouping. Can someone tell me what it is >> exactly? Has it been included in the latest storm build? All my google >> searches on this point to JIRA tickets. >> >> Any help is appreciated. >> >> Thank you. >> >> >> >> > -- Rodrigo Valladares Cotta Master's Student, Computer Science University of Nebraska-Lincoln
