Thanks, Harsh! Your reply helps me a lot.

Kun Ling
On Fri, May 10, 2013 at 1:26 PM, Harsh J <[email protected]> wrote:

> The task itself moves it when it receives a commitTask message. See
> the OutputCommitter class:
>
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/OutputCommitter.html#commitTask(org.apache.hadoop.mapred.TaskAttemptContext)
>
> On Fri, May 10, 2013 at 8:49 AM, Ling Kun <[email protected]> wrote:
> > Dear all,
> >
> > I am looking into the MR workflow and want to know more details about
> > how the reduce output data is copied.
> >
> > Here is my question.
> >
> > For the DFSIO test or some other MR jobs, each reduce task runs on a
> > TT and writes files to a directory named like this:
> > "XXX//_temporary/_attempt_201305101045_0005_r_000000_0/". There will
> > also be a result file named part-00000.
> >
> > After the reducer has finished the task, the reducer output data
> > part-00000 should be moved from the local disk to HDFS.
> >
> > My question is: is part-00000 copied to HDFS at the moment the reducer
> > finishes the task? Who makes this file copy happen? The Reducer child?
> > The TaskTracker that runs the reduce task? Or the JobTracker?
> >
> > Thanks,
> >
> > yours,
> > Kun Ling
> >
> > --
> > http://www.lingcc.com
>
> --
> Harsh J

--
http://www.lingcc.com
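To make the commitTask step Harsh describes more concrete, here is a minimal sketch of the "write into a per-attempt directory, then move on commit" pattern. It is not Hadoop code: it uses only java.nio on a local temporary directory, and the class name CommitSketch and the helpers attemptDir and commitTask are hypothetical names chosen for illustration. As far as I understand the stock FileOutputCommitter, the _temporary attempt directory already lives on the job's output filesystem (HDFS for a typical job), so the commit is a move within that filesystem rather than a copy from local disk; treat that as an assumption to verify against your Hadoop version.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CommitSketch {

    // Hypothetical helper: the scratch directory a task attempt writes into,
    // laid out like <output>/_temporary/_<attemptId>/ as in the question.
    static Path attemptDir(Path outputDir, String attemptId) {
        return outputDir.resolve("_temporary").resolve("_" + attemptId);
    }

    // Illustrative stand-in for the commit step: the task itself promotes
    // every file from its attempt directory into the final output directory.
    static void commitTask(Path outputDir, String attemptId) throws IOException {
        Path work = attemptDir(outputDir, attemptId);

        // Snapshot the listing first so we do not move files while iterating.
        List<Path> files;
        try (Stream<Path> listing = Files.list(work)) {
            files = listing.collect(Collectors.toList());
        }

        for (Path file : files) {
            Path target = outputDir.resolve(file.getFileName());
            Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
        }

        // The attempt directory is empty now and can be cleaned up.
        Files.delete(work);
    }

    public static void main(String[] args) throws IOException {
        Path output = Files.createTempDirectory("job-output");
        String attempt = "attempt_201305101045_0005_r_000000_0";

        // The "reducer" writes its result into the attempt directory.
        Path work = Files.createDirectories(attemptDir(output, attempt));
        Files.write(work.resolve("part-00000"),
                "key\tvalue\n".getBytes(StandardCharsets.UTF_8));

        // On commit, the task moves the file into the final location.
        commitTask(output, attempt);
        System.out.println("committed: " + Files.exists(output.resolve("part-00000")));
    }
}
```

The point of the pattern is that a failed or speculative attempt never leaves partial files in the final output directory; only an attempt that is told to commit promotes its part file.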
