Thank you, Liyin.

On Fri, Aug 3, 2012 at 7:33 AM, 梁李印 <liyin.lian...@aliyun-inc.com> wrote:
> When a map task is done, its output is always flushed to disk and merged
> into one file.
> The benefit is that if a reducer fails, the map task does not need to be
> re-run.
>
> Liyin Liang
>
> -----Original Message-----
> From: Satheesh Kumar [mailto:nks...@gmail.com]
> Sent: August 3, 2012 21:23
> To: common-user@hadoop.apache.org
> Subject: MapReduce shuffle question
>
> Team, can someone please clarify the following question?
>
> In the map phase, the map output is written to the local disk. And in the
> shuffle phase, the map output partitions are transferred to reduce nodes
> using HTTP. So, my question is: assuming there are no spills (data set is
> small enough to accommodate this), will the map output be transferred
> directly from memory to the reduce nodes using HTTP without a disk access
> to write the map output? Or, is the map output always flushed to the disk
> before being transferred to reduce nodes?
>
> Appreciate the help.
>
> Thanks,
> Satheesh
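For anyone reading this thread later, here is a toy sketch of the behaviour described above: the map side writes all partitions into a single local file, and a reducer fetch is just a read of one byte range from it. This is not Hadoop's code or its on-disk format. The function names (partition_for, write_map_output, fetch_partition), the JSON encoding, the CRC32 partitioner, and the file name map_output.bin are all made up for illustration.

```python
import json
import os
import tempfile
import zlib


def partition_for(key, num_reducers):
    # Deterministic stand-in for a hash partitioner (an assumption,
    # not Hadoop's HashPartitioner).
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def write_map_output(records, num_reducers, path):
    """Write every partition into one file; return {partition: (offset, length)}."""
    buckets = {p: [] for p in range(num_reducers)}
    for key, value in records:
        buckets[partition_for(key, num_reducers)].append((key, value))

    index = {}
    with open(path, "wb") as out:
        for p in range(num_reducers):
            # Each partition is sorted by key before it is written.
            data = json.dumps(sorted(buckets[p])).encode("utf-8")
            index[p] = (out.tell(), len(data))
            out.write(data)
    return index


def fetch_partition(path, index, partition):
    """Model a reducer fetch as reading one byte range of the map output file."""
    offset, length = index[partition]
    with open(path, "rb") as f:
        f.seek(offset)
        return [tuple(kv) for kv in json.loads(f.read(length))]


if __name__ == "__main__":
    records = [("apple", 1), ("pear", 1), ("apple", 1), ("fig", 1)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "map_output.bin")  # hypothetical file name
        index = write_map_output(records, 2, path)
        first = fetch_partition(path, index, 0)
        # A restarted reducer fetches the same bytes again; the map is not re-run.
        retry = fetch_partition(path, index, 0)
        print(first == retry)
```

The point of the sketch is the second fetch: because the output sits on the mapper's local disk, a failed reducer can ask for its partition again without the map task being rescheduled.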