[ https://issues.apache.org/jira/browse/HADOOP-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499270 ]
Konstantin Shvachko commented on HADOOP-1374:
---------------------------------------------
Debugging the TaskTracker stuck in the loop on Linux.
It falls into the loop once all 8 maps are done; the reduce never finishes.
Both nodes are sending heartbeats every 10 secs, and neither is dying.
This is what the WebUI shows for the bad task-tracker:
Running tasks
Task Attempts Status Progress Errors
task_0001_r_000000_1 RUNNING 16.66%
Non-Running Tasks
Task Attempts Status
task_0001_m_000004_0 SUCCEEDED
task_0001_m_000007_0 SUCCEEDED
task_0001_m_000003_0 SUCCEEDED
task_0001_m_000006_0 SUCCEEDED
I put a breakpoint in org.apache.hadoop.ipc.Server.Handler.run(), where the
calls are processed, at
value = call(call.param); // make the call
and I see it processing only the following 3 calls:
ping(task_0001_r_000000_1) from 66.22.15.15:58122
progress(task_0001_r_000000_1, 0.16666667, reduce > copy (4 of 8 at 0.00 MB/s) > , SHUFFLE, [EMAIL PROTECTED]) from 66.22.15.15:58122
getMapCompletionEvents(job_0001, 8, 50) from 66.22.15.15:58122
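To tell a stalled reduce from a merely slow one, it may help to count how many consecutive progress lines in the TaskTracker log are identical apart from the timestamp. The following is only a minimal sketch: it assumes each progress entry sits on a single line in the layout quoted in the issue description, and the class and method names (ProgressStallCheck, longestIdenticalRun) are hypothetical helpers, not part of Hadoop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProgressStallCheck {
    // Assumed layout: "<date> <time> INFO mapred.TaskTracker: <taskId> <status>"
    private static final Pattern PROGRESS =
        Pattern.compile("^\\S+ \\S+ INFO mapred\\.TaskTracker: (task_\\S+) (.*)$");

    /**
     * Returns the length of the longest run of consecutive progress lines
     * whose task id and status text are identical (timestamps are ignored).
     * Lines that do not match the assumed layout are skipped.
     */
    public static int longestIdenticalRun(List<String> logLines) {
        int longest = 0;
        int current = 0;
        String previous = null;
        for (String line : logLines) {
            Matcher m = PROGRESS.matcher(line.trim());
            if (!m.matches()) {
                continue; // not a progress line
            }
            String key = m.group(1) + " " + m.group(2).trim();
            current = key.equals(previous) ? current + 1 : 1;
            previous = key;
            longest = Math.max(longest, current);
        }
        return longest;
    }

    public static void main(String[] args) {
        // Sample lines copied from the issue description, joined onto one line each.
        List<String> sample = Arrays.asList(
            "07/05/15 19:35:41 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >",
            "07/05/15 19:35:42 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >",
            "07/05/15 19:35:43 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >");
        System.out.println(longestIdenticalRun(sample)); // prints 3
    }
}
```

A long run with the copy count pinned at "4 of 8" would suggest the shuffle is making no progress at all, which is what the three repeating calls above look like from the RPC side.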
Hope this helps.
> TaskTracker falls into an infinite loop.
> ----------------------------------------
>
> Key: HADOOP-1374
> URL: https://issues.apache.org/jira/browse/HADOOP-1374
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.12.3
> Reporter: Konstantin Shvachko
> Assigned To: Arun C Murthy
> Priority: Blocker
> Fix For: 0.13.0
>
> Attachments: DataNode1.log, DataNode2.log, JobTracker.log,
> NameNode.log, TaskTracker1.log, TaskTracker2.log, TestDFSIO.log
>
>
> All maps had been completed successfully. I had only one reduce task, during
> which the TaskTracker infinitely outputs:
> 07/05/15 19:35:41 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667%
> reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:42 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667%
> reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:43 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667%
> reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:44 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667%
> reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:45 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667%
> reduce > copy (4 of 8 at 0.00 MB/s) >
> JobTracker does not log anything about task task_0001_r_000000_0 except for
> 07/05/15 19:49:01 INFO mapred.JobTracker: Adding task 'task_0001_r_000000_0'
> to tip tip_0001_r_000000, for tracker 'tracker_my-host.com:50050'