Re: Use case of localOrShuffleGrouping?

Nathan Leung Wed, 01 Jul 2015 13:01:25 -0700

If your bolts are evenly spread and there is at least one on each worker,
using localOrShuffleGrouping can be a huge optimization.

Consider this example.  You have 10 bolt A tasks, 10 bolt B tasks, and 10
workers.  Each worker has 1 task of each bolt.  Bolt B subscribes to bolt A.
With shuffle grouping, any given tuple has a 1/10 chance of staying within
the worker, so 90% of your inter-bolt traffic must be serialized, sent over
the network, and deserialized.  With localOrShuffleGrouping, that number is
0%.
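
The arithmetic above can be checked with a small simulation.  This is only a
sketch and does not use Storm: the function remote_fraction and its
parameters are hypothetical names made up for illustration, and shuffle
grouping is modeled as a uniform random choice among bolt B tasks, which is a
simplification of how Storm actually distributes tuples.  It assumes exactly
one bolt A task and one bolt B task per worker, as in the example.

```python
import random


def remote_fraction(num_workers, local_or_shuffle, num_tuples=100_000, seed=0):
    """Estimate the fraction of tuples that leave the emitting worker.

    Assumption: each worker hosts exactly one bolt A task and one bolt B task.
    """
    rng = random.Random(seed)
    remote = 0
    for _ in range(num_tuples):
        # Worker hosting the bolt A task that emits this tuple.
        src = rng.randrange(num_workers)
        if local_or_shuffle:
            # A bolt B task exists in the same worker, so it is always chosen.
            dst = src
        else:
            # Plain shuffle (simplified): any bolt B task, chosen uniformly.
            dst = rng.randrange(num_workers)
        if dst != src:
            remote += 1
    return remote / num_tuples


shuffle = remote_fraction(10, local_or_shuffle=False)
local = remote_fraction(10, local_or_shuffle=True)
print(f"shuffleGrouping:        {shuffle:.1%} remote")
print(f"localOrShuffleGrouping: {local:.1%} remote")
```

The shuffle case should come out close to 90% remote and the local-or-shuffle
case at exactly 0%.  If some worker had no bolt B task, tuples emitted there
would fall back to an ordinary shuffle, which this sketch does not model.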

On Wed, Jul 1, 2015 at 3:15 PM, Abhishek Raj <[email protected]>
wrote:

> I understand the difference between "shuffle grouping" and "local or
> shuffle grouping". But I haven't yet come across any instances where I
> would prefer using localOrShuffleGrouping instead of shuffleGrouping. The
> way I see it, shuffleGrouping helps increase concurrency if the tasks
> of a given component are distributed across several workers.
>
> Could anyone shed some light on why the additional grouping method
> "localOrShuffleGrouping" was created, and perhaps share some use cases?
>
> Thanks!
> Abhishek.
>
