[
https://issues.apache.org/jira/browse/HADOOP-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468396
]
Nigel Daley commented on HADOOP-949:
------------------------------------
Task tracker log:
... [reduce task launches]
2007-01-27 04:35:49,483 INFO org.apache.hadoop.mapred.TaskTracker:
LaunchTaskAction: task_0005_r_000000_0
2007-01-27 04:35:49,554 INFO org.apache.hadoop.dfs.DistributedFileSystem:
Initialized InMemoryFileSystem: ramfs://mapoutput10141941 of size (in bytes):
209715200
2007-01-27 04:35:49,555 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Created an InMemoryFileSystem, uri:
ramfs://mapoutput10141941
2007-01-27 04:35:49,575 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 200 map output(s)
2007-01-27 04:35:49,575 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 200 map output location(s)
2007-01-27 04:35:49,576 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Started thread: Map output copy reporter for task
task_0005_r_000000_0
2007-01-27 04:35:49,584 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 0 new map outputs from jobtracker and 0 map outputs
from previous failures
2007-01-27 04:35:49,585 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 1 known map output location(s); scheduling...
2007-01-27 04:35:49,585 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Scheduled 1 of 1 known outputs (0 slow hosts and 0 dup
hosts)
2007-01-27 04:35:49,585 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Copying task_0005_m_000002_0 output from tt1.com.
2007-01-27 04:35:49,736 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_m_000000_0 0.0% writing [EMAIL PROTECTED]/134217728 ::host = tt2.com
2007-01-27 04:35:49,809 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 done copying task_0005_m_000002_0 output from tt1.com.
2007-01-27 04:35:50,204 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 199 map output(s)
2007-01-27 04:35:50,204 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 199 map output location(s)
2007-01-27 04:35:50,578 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.0016666667% reduce > copy (1 of 200 at 0.00 MB/s) >
... [map outputs get copied]
2007-01-27 04:39:44,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 50 new map outputs from jobtracker and 0 map outputs
from previous failures
2007-01-27 04:39:44,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 0 known map output location(s); scheduling...
2007-01-27 04:39:44,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup
hosts)
2007-01-27 04:39:44,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 153 map output(s)
2007-01-27 04:39:44,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 150 map output location(s)
2007-01-27 04:39:44,703 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.07833333% reduce > copy (47 of 200 at 0.00 MB/s) >
2007-01-27 04:39:45,705 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.07833333% reduce > copy (47 of 200 at 0.00 MB/s) >
2007-01-27 04:39:46,707 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.07833333% reduce > copy (47 of 200 at 0.00 MB/s) >
2007-01-27 04:39:47,708 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.07833333% reduce > copy (47 of 200 at 0.00 MB/s) >
2007-01-27 04:39:48,710 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.07833333% reduce > copy (47 of 200 at 0.00 MB/s) >
2007-01-27 04:39:49,268 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 50 new map outputs from jobtracker and 0 map outputs
from previous failures
2007-01-27 04:39:49,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 0 known map output location(s); scheduling...
2007-01-27 04:39:49,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup
hosts)
2007-01-27 04:39:49,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 152 map output(s)
2007-01-27 04:39:49,269 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 150 map output location(s)
2007-01-27 04:39:49,712 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08% reduce > copy (48 of 200 at 0.00 MB/s) >
2007-01-27 04:39:50,715 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08% reduce > copy (48 of 200 at 0.00 MB/s) >
2007-01-27 04:39:51,718 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08% reduce > copy (48 of 200 at 0.00 MB/s) >
2007-01-27 04:39:52,721 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08% reduce > copy (48 of 200 at 0.00 MB/s) >
2007-01-27 04:39:53,723 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08% reduce > copy (48 of 200 at 0.00 MB/s) >
2007-01-27 04:39:54,270 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 50 new map outputs from jobtracker and 0 map outputs
from previous failures
2007-01-27 04:39:54,270 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 0 known map output location(s); scheduling...
2007-01-27 04:39:54,271 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup
hosts)
2007-01-27 04:39:54,271 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 151 map output(s)
2007-01-27 04:39:54,271 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 150 map output location(s)
2007-01-27 04:39:54,726 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08166666% reduce > copy (49 of 200 at 0.00 MB/s) >
2007-01-27 04:39:55,728 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08166666% reduce > copy (49 of 200 at 0.00 MB/s) >
2007-01-27 04:39:56,730 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08166666% reduce > copy (49 of 200 at 0.00 MB/s) >
2007-01-27 04:39:57,732 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08166666% reduce > copy (49 of 200 at 0.00 MB/s) >
2007-01-27 04:39:58,733 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.08166666% reduce > copy (49 of 200 at 0.00 MB/s) >
2007-01-27 04:39:59,272 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 50 new map outputs from jobtracker and 0 map outputs
from previous failures
2007-01-27 04:39:59,272 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 0 known map output location(s); scheduling...
2007-01-27 04:39:59,272 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup
hosts)
2007-01-27 04:39:59,273 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 150 map output(s)
2007-01-27 04:39:59,273 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 150 map output location(s)
2007-01-27 04:39:59,735 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:00,737 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:01,739 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:02,741 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:03,743 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:04,272 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 50 new map outputs from jobtracker and 0 map outputs
from previous failures
2007-01-27 04:40:04,272 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 0 known map output location(s); scheduling...
2007-01-27 04:40:04,272 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup
hosts)
2007-01-27 04:40:04,745 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:05,748 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:06,751 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:07,754 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-27 04:40:08,757 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
... [The next chunk is repeated forever]
2007-01-28 02:00:54,509 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 150 map output(s)
2007-01-28 02:00:54,509 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Need 150 map output location(s)
2007-01-28 02:00:54,513 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 50 new map outputs from jobtracker and 0 map outputs
from previous failures
2007-01-28 02:00:54,513 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Got 0 known map output location(s); scheduling...
2007-01-28 02:00:54,513 INFO org.apache.hadoop.mapred.TaskRunner:
task_0005_r_000000_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup
hosts)
2007-01-28 02:00:54,595 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-28 02:00:55,596 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-28 02:00:56,598 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-28 02:00:57,600 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
2007-01-28 02:00:58,602 INFO org.apache.hadoop.mapred.TaskTracker:
task_0005_r_000000_0 0.083333336% reduce > copy (50 of 200 at 0.00 MB/s) >
...[The webui shows the reduce as 8.33% complete]
> reduce hangs after applying HADOOP-248 on 20 node cluster
> ---------------------------------------------------------
>
> Key: HADOOP-949
> URL: https://issues.apache.org/jira/browse/HADOOP-949
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.10.1
> Reporter: Nigel Daley
> Assigned To: Owen O'Malley
>
> After HADOOP-248 was committed, my benchmarks started to hang. Running
> TestDFSIO on 20 nodes, the job's single reduce is hanging. Annodated logs
> from the task tracker running the reduce are included in the next comment. I
> have reverted the patch using:
> cd $trunk_dir
> svn merge -r 500410:500397 .
> and reran some benchmarks. They all pass.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.