After updating to the Hadoop trunk today, I am having a problem in the reduce phase. Some of the reducers get stuck in the copy stage (at the very end of copying) and keep reporting the same status. Even when I kill the related tasktracker, the jobtracker still reports them as copying. Here is the log:
2007-02-27 22:08:26,388 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Got 24 known map output location(s); scheduling...
2007-02-27 22:08:26,388 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Scheduled 0 of 24 known outputs (24 slow hosts and 0 dup hosts)
2007-02-27 22:08:27,204 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:27,204 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:28,214 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:28,214 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:29,224 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:29,224 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:30,114 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Need 11 map output(s)
2007-02-27 22:08:30,114 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Need 234 map output location(s)
2007-02-27 22:08:30,116 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Got 0 new map outputs from jobtracker and 0 map outputs from previous failures
2007-02-27 22:08:30,116 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Got 11 known map output location(s); scheduling...
2007-02-27 22:08:30,116 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Scheduled 0 of 11 known outputs (11 slow hosts and 0 dup hosts)
2007-02-27 22:08:30,234 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:30,234 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:31,244 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:31,244 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:31,394 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Need 24 map output(s)
2007-02-27 22:08:31,394 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Need 133 map output location(s)
2007-02-27 22:08:31,395 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Got 0 new map outputs from jobtracker and 0 map outputs from previous failures
2007-02-27 22:08:31,395 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Got 24 known map output location(s); scheduling...
2007-02-27 22:08:31,395 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Scheduled 0 of 24 known outputs (24 slow hosts and 0 dup hosts)
2007-02-27 22:08:32,254 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:32,254 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:33,264 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:33,264 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:34,274 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:34,274 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:35,124 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Need 11 map output(s)
2007-02-27 22:08:35,124 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Need 234 map output location(s)
2007-02-27 22:08:35,219 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Got 0 new map outputs from jobtracker and 0 map outputs from previous failures
2007-02-27 22:08:35,219 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Got 11 known map output location(s); scheduling...
2007-02-27 22:08:35,219 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000111_0 Scheduled 0 of 11 known outputs (11 slow hosts and 0 dup hosts)
2007-02-27 22:08:35,284 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:35,284 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:36,294 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s) >
2007-02-27 22:08:36,294 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000111_0 0.3321875% reduce > copy (3189 of 3200 at 0.40 MB/s) >
2007-02-27 22:08:36,404 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Need 24 map output(s)
2007-02-27 22:08:36,404 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Need 133 map output location(s)
2007-02-27 22:08:36,422 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000224_0 Got 0 new map outputs from jobtracker and 0 map outputs from previous
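
In case it helps anyone check their own tasktracker logs for the same symptom, here is a rough sketch of how the stall can be picked out automatically. It is not part of Hadoop: the function name `stalled_reducers` and the `min_reports` threshold are made up for illustration, and the regular expression assumes the status lines look exactly like the ones in the log above.

```python
import re
from collections import defaultdict

# Matches TaskTracker status entries of the form seen in the log above, e.g.
# "TaskTracker: task_0001_r_000224_0 0.33083335% reduce > copy (3176 of 3200 at 1.94 MB/s)"
# Assumption: this format is taken only from the pasted log, not from any Hadoop spec.
STATUS = re.compile(
    r"TaskTracker: (?P<task>task_\S+) \S+% reduce > copy "
    r"\((?P<done>\d+) of (?P<total>\d+) at"
)


def stalled_reducers(log_text, min_reports=3):
    """Return {task_id: (copied, total)} for reducers whose copy count never changed.

    A task is flagged only if it reported at least `min_reports` times
    (hypothetical threshold), every report was identical, and the copy
    is still incomplete.
    """
    reports = defaultdict(list)
    # finditer scans the whole text, so it works whether or not
    # each log entry sits on its own line.
    for m in STATUS.finditer(log_text):
        reports[m.group("task")].append((int(m.group("done")), int(m.group("total"))))

    stalled = {}
    for task, seen in reports.items():
        copied, total = seen[-1]
        if len(seen) >= min_reports and len(set(seen)) == 1 and copied < total:
            stalled[task] = (copied, total)
    return stalled
```

Running this over the log text above (for example `stalled_reducers(open(path).read())`, with `path` pointing at a tasktracker log) should flag both `task_0001_r_000224_0` and `task_0001_r_000111_0`, since their copy counts stay at 3176 and 3189 of 3200 in every report.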
