Excellent. Thank you for your quick responses!

On Fri Nov 14 2014 at 4:23:06 PM Nathan Leung <[email protected]> wrote:

> I agree that shuffleGrouping will probably fix this problem for you. How
> much care is required when using localOrShuffleGrouping depends on how many
> bolts / spouts / workers you have, but yes in general I would agree that
> care should be taken. In general if the upstream component has fewer
> executors than your number of workers, it's better to use shuffle,
> otherwise localOrShuffle will probably give you better performance.
>
> On Fri, Nov 14, 2014 at 4:16 PM, Luke Rohde <[email protected]> wrote:
>
>> Thanks, so if I read you correctly this immediate problem should be
>> alleviated by using shuffleGrouping on the terminal bolt.
>> In general, though, it sounds like care should be taken with
>> localOrShuffle to avoid this sort of scenario.
>>
>> On Fri Nov 14 2014 at 4:08:57 PM Nathan Leung <[email protected]> wrote:
>>
>>> if you have 2 workers, and 1 bolt with 1 executor that feeds into the
>>> terminal bolt, with the terminal bolt subscribing using
>>> localOrShuffleGrouping, then the upstream bolt will send all of its tuples
>>> to the terminal bolts in the same worker process (due to
>>> localOrShuffleGrouping) and the other half of the terminal bolts will be
>>> idle. Without knowing more details it's hard to say if this is what you're
>>> seeing.
>>>
>>> On Fri, Nov 14, 2014 at 4:02 PM, Luke Rohde <[email protected]>
>>> wrote:
>>>
>>>> Hi, I have a topology that’s bottlenecked right now by a terminal bolt
>>>> that’s writing small batches to an endpoint. I’ve increased the number of
>>>> executors several times so that it’s no longer bottlenecked there, but I
>>>> still notice when there’s a traffic spike that despite capacity hovering
>>>> around 1.0, probably half of the executors are idle.
>>>>
>>>> Can anyone give insight as to why this might be? I’ve read the docs on
>>>> storm parallelism and can’t understand why this is happening. FWIW, all of
>>>> the non-fieldsGrouping bolts are declared using localOrShuffleGrouping -
>>>> perhaps this has something to do with it? I have a feeling that this is the
>>>> core of the problem, but it’s not clear to me why exactly you wouldn’t use
>>>> localOrShuffle over Shuffle.
>>>>
>>>> Thanks, Luke
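
For anyone finding this thread later, here is a rough sketch of the scenario Nathan describes (2 workers, 1 upstream executor, several terminal tasks). It is a toy model in plain Java, not Storm code: the class and method names (GroupingSimulation, route, countIdle) are made up for illustration, terminal task i is assumed to live in worker (i % workers), and round-robin stands in for the random shuffle so the numbers are repeatable. Real task placement and ordering in Storm will differ.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this does not call the Storm API.
class GroupingSimulation {

    // Returns how many tuples each terminal task receives.
    // Assumption: terminal task i runs in worker (i % workers).
    static int[] route(boolean localOrShuffle, int workers, int terminalTasks,
                       int upstreamWorker, int tuples) {
        List<Integer> targets = new ArrayList<>();
        for (int task = 0; task < terminalTasks; task++) {
            boolean sameWorker = (task % workers) == upstreamWorker;
            // Shuffle targets every task; localOrShuffle prefers in-process tasks.
            if (!localOrShuffle || sameWorker) {
                targets.add(task);
            }
        }
        // With no local target task, localOrShuffle behaves like a plain shuffle.
        if (targets.isEmpty()) {
            for (int task = 0; task < terminalTasks; task++) {
                targets.add(task);
            }
        }
        int[] received = new int[terminalTasks];
        for (int i = 0; i < tuples; i++) {
            // Round-robin used in place of a random shuffle for determinism.
            received[targets.get(i % targets.size())]++;
        }
        return received;
    }

    // Counts terminal tasks that never received a tuple.
    static int countIdle(int[] received) {
        int idle = 0;
        for (int count : received) {
            if (count == 0) {
                idle++;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        int[] local = route(true, 2, 8, 0, 8000);
        int[] shuffle = route(false, 2, 8, 0, 8000);
        System.out.println("localOrShuffle idle tasks: " + countIdle(local));
        System.out.println("shuffle idle tasks: " + countIdle(shuffle));
    }
}
```

With 8 terminal tasks over 2 workers, the localOrShuffle case leaves the 4 tasks in the other worker with nothing to do, while the shuffle case spreads tuples over all 8. That matches the "half of the executors are idle" symptom, though as Nathan says, without more details it is hard to know whether this is what the real topology is doing.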
