Hi,

When Hadoop is running in a cluster, the output of the reducers is saved in HDFS. Is MapReduce also location-aware about where that output data is saved?
For example, we have TT1 running on Machine1 and TT2 running on Machine2, and the HDFS replication factor is 3. The reduce task RT1 is running on TT1. So, when the reducer saves its output in HDFS, do 2 replicas of the output go to TT1 and the third one to TT2? Is this what happens?

Thanks,
-- Pedro