Each reduce task is responsible for a subset of shuffle blocks. Every map
task splits up its output, creating one shuffle block per reducer. During
the shuffle phase, each reducer fetches all the shuffle blocks intended for it.
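
To make that concrete, here is a rough Python sketch of the idea. It is a toy
model, not Spark's actual code: the names partition_for, run_map_task and
run_reduce_task are made up for illustration, and it assumes keys are assigned
to reducers by hash partitioning. Under that assumption a reducer that owns a
set of shuffle blocks also ends up owning a fixed subset of the keys.

```python
import zlib


def partition_for(key, num_reducers):
    # Hypothetical partitioner: a deterministic hash of the key, modulo the
    # number of reducers. Hash partitioning is an assumption of this sketch.
    return zlib.crc32(str(key).encode("utf-8")) % num_reducers


def run_map_task(records, num_reducers):
    # One map task splits its records into exactly one shuffle block
    # per reducer (a block may be empty).
    blocks = [[] for _ in range(num_reducers)]
    for key, value in records:
        blocks[partition_for(key, num_reducers)].append((key, value))
    return blocks


def run_reduce_task(reducer_id, all_map_outputs):
    # The reducer fetches the block addressed to it from every map task,
    # regardless of which machine that map task ran on.
    fetched = []
    for blocks in all_map_outputs:
        fetched.extend(blocks[reducer_id])
    return fetched


if __name__ == "__main__":
    num_reducers = 4
    # Two map tasks, standing in for the two machines in the question.
    map_inputs = [
        [("apple", 1), ("banana", 2), ("cherry", 3)],
        [("apple", 4), ("date", 5)],
    ]
    map_outputs = [run_map_task(recs, num_reducers) for recs in map_inputs]
    for reducer_id in range(num_reducers):
        print(reducer_id, run_reduce_task(reducer_id, map_outputs))
```

With 2 map tasks and 4 reducers there are 2 * 4 = 8 shuffle blocks in this
model, and each reducer fetches 2 of them, one from each map task.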


On Sun, Nov 10, 2013 at 9:38 PM, Umar Javed <umarj.ja...@gmail.com> wrote:

> I was wondering how does the scheduler assign the ShuffledRDD locations to
> the reduce tasks? Say that you have 4 reduce tasks, and a number of shuffle
> blocks across two machines. Is each reduce task responsible for a subset of
> individual keys or a subset of shuffle blocks?
>
> Umar
>
