I ran into the same problem:
- the copy rate is only 0.1MBps
- 10MBps bandwidth between machines using scp
Maybe the bandwidth affects the copy rate. Can anyone help us?

2007/8/4, Joydeep Sen Sarma <[EMAIL PROTECTED]>:
>
> I have a fairly simple job with a map, a local combiner and a reduce.
> The combiner and the reduce do the equivalent of a group_concat (mysql).
>
>
> I have horrible performance in the reduce stage:
> - the map jobs are done
> - all the reduce jobs claim they are copying data - but the copy rate is
> abysmal (0.5MBps)
> - checked the network topology - everything's on GigE and on same
> switch. (80 machine cluster)
> - seeing 50+ MBps bandwidth between any pair using scp
> - when I look at the machines where reduce is running - vmstat says 0%
> cpu util.
>
> A sample reducetask log is below. Job conf: 64 way reduce. I specified
> the map tasks to the same number - but hadoop is anyway creating 386 map
> tasks.
>
> Anyone has some quick hints on what could be going wrong?
>
> Thanks,
>
> Joydeep
>
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
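
In the quoted log, every round reports "Scheduled 0 of 2 known outputs" with both hosts counted as slow, so the reducer appears to be waiting rather than copying. To check whether that pattern holds across a whole log file, something like the rough Python sketch below might help. It is not part of Hadoop: the names `SCHEDULED`, `normalize` and `summarize` are made up for this example, and the regular expression assumes the message wording stays exactly as in the excerpt above.

```python
import re

# Pattern for the "Scheduled N of M known outputs" message, copied from the
# wording in the quoted log. Other Hadoop versions may phrase it differently.
SCHEDULED = re.compile(
    r"(?P<task>task_\w+) Scheduled (?P<scheduled>\d+) of (?P<known>\d+) "
    r"known outputs \((?P<slow>\d+) slow hosts and (?P<dup>\d+) dup hosts\)"
)


def normalize(text):
    """Drop mail quote markers ("> ") and rejoin wrapped log entries."""
    lines = (re.sub(r"^[>\s]+", "", line) for line in text.splitlines())
    return " ".join(" ".join(lines).split())


def summarize(log_text):
    """Count scheduling rounds per task and how many scheduled nothing."""
    stats = {}
    for m in SCHEDULED.finditer(normalize(log_text)):
        s = stats.setdefault(
            m["task"], {"rounds": 0, "idle_rounds": 0, "slow_hosts": 0}
        )
        s["rounds"] += 1
        s["slow_hosts"] += int(m["slow"])
        if int(m["scheduled"]) == 0:
            s["idle_rounds"] += 1
    return stats


# Two entries taken from the quoted log, still wrapped and quoted.
SAMPLE = """\
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
"""

if __name__ == "__main__":
    for task, s in summarize(SAMPLE).items():
        print(task, s)
```

If `idle_rounds` equals `rounds` for most reduce tasks, the low copy rate probably comes from fetches not being scheduled at all, not from the network itself, which would fit the scp numbers reported above.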
