On Mon, Mar 2, 2015 at 6:47 AM, 宋轶 <[email protected]> wrote:

> The problem of it is that each mapper will generate too much intermediate
> data, and the network will be the bottleneck in Shuffle phase


That approach would avoid multiple passes over the input data.  Is the
amount of shuffled data actually different from the total that would be
shuffled by multiple map-reduce steps?
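
To make the question concrete, here is a rough back-of-envelope sketch.
Everything in it is an assumption for illustration: the record counts and
sizes are made up, the function names are hypothetical, and it assumes no
combiner or compression changes the picture.  Under those assumptions, one
pass that emits K outputs per record shuffles the same total volume as K
passes that each emit one output per record; only the input reads differ.

```python
def shuffle_bytes(records, outputs_per_record, bytes_per_output, passes=1):
    """Total bytes sent through the shuffle across all passes (no combiner)."""
    return records * outputs_per_record * bytes_per_output * passes


def input_bytes_read(records, bytes_per_record, passes=1):
    """Total bytes read from the input across all passes."""
    return records * bytes_per_record * passes


# Hypothetical workload: all numbers are illustrative assumptions.
N = 1_000_000        # input records
K = 5                # intermediate outputs needed per record
OUT_BYTES = 100      # size of one intermediate output
IN_BYTES = 200       # size of one input record

# One job that emits all K outputs per record in a single pass.
single_shuffle = shuffle_bytes(N, K, OUT_BYTES, passes=1)
single_input = input_bytes_read(N, IN_BYTES, passes=1)

# K jobs that each emit one output per record.
multi_shuffle = shuffle_bytes(N, 1, OUT_BYTES, passes=K)
multi_input = input_bytes_read(N, IN_BYTES, passes=K)

print("shuffle bytes, single pass:", single_shuffle)
print("shuffle bytes, K passes:   ", multi_shuffle)
print("input bytes, single pass:  ", single_input)
print("input bytes, K passes:     ", multi_input)
```

If a real job behaves differently (for example, a combiner collapses far
more data in one of the two layouts), the totals would diverge, so the
numbers are worth measuring rather than assuming.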
