Hi Aaron,

Thanks for the explanation. I also find it very helpful.

On Mon, Dec 9, 2013 at 9:28 PM, Aaron Davidson <[email protected]> wrote:
> If you have N map partitions and R reducers, we create N*R files on disk
> across the cluster in order to do the group by.

Do you mind giving a link or explaining why N*R files are created?

Thanks!
Grega
