Yes. For any shuffle operation (groupByKey, etc.), Spark writes the map output to disk before the reduce tasks run on that data.
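
To make the idea concrete, here is a rough sketch in plain Python of what "write map output to disk, then reduce" looks like. This is not Spark code and not how Spark implements it internally; the function names (map_side_write, reduce_side_group, group_by_key), the file naming, and the use of pickle are all made up for illustration. It only assumes that each map task buckets its records by a hash of the key and that each reduce task reads its own bucket from every map task.

```python
import os
import pickle
import tempfile
from collections import defaultdict


def map_side_write(records, num_reducers, shuffle_dir, map_id):
    """Bucket (key, value) records by key hash and write one file per reducer."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[hash(key) % num_reducers].append((key, value))
    for reduce_id in range(num_reducers):
        # Hypothetical file layout: one file per (map task, reduce task) pair.
        path = os.path.join(shuffle_dir, f"shuffle_{map_id}_{reduce_id}.bin")
        with open(path, "wb") as f:
            pickle.dump(buckets[reduce_id], f)


def reduce_side_group(shuffle_dir, num_maps, reduce_id):
    """Read this reducer's bucket from every map task and group values by key."""
    grouped = defaultdict(list)
    for map_id in range(num_maps):
        path = os.path.join(shuffle_dir, f"shuffle_{map_id}_{reduce_id}.bin")
        with open(path, "rb") as f:
            for key, value in pickle.load(f):
                grouped[key].append(value)
    return dict(grouped)


def group_by_key(partitions, num_reducers=2):
    """Toy groupByKey: all map output hits disk before any reduce step reads it."""
    with tempfile.TemporaryDirectory() as shuffle_dir:
        # Map phase: every input partition writes its bucketed output to disk.
        for map_id, records in enumerate(partitions):
            map_side_write(records, num_reducers, shuffle_dir, map_id)
        # Reduce phase: each reducer pulls its files and groups the values.
        result = {}
        for reduce_id in range(num_reducers):
            result.update(reduce_side_group(shuffle_dir, len(partitions), reduce_id))
        return result


if __name__ == "__main__":
    partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
    print(group_by_key(partitions))
```

The point of the sketch is only the ordering: the reduce step never sees in-memory map output, it reads what the map step left on disk.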

On Thu, Jan 16, 2014 at 4:03 PM, suman bharadwaj <[email protected]> wrote:

> Hi,
>
> I'm new to spark. And wanted to understand more on how shuffle works in
> spark
>
> In Hadoop map reduce, while performing a reduce operation, the
> intermediate data from map gets written to disk. How does the same happen
> in Spark ?
>
> Does spark write the intermediate data to disk ?
>
> Thanks in advance.
>
> Regards,
> SB
>
