[ 
https://issues.apache.org/jira/browse/HADOOP-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499270
 ] 

Konstantin Shvachko commented on HADOOP-1374:
---------------------------------------------

I debugged the TaskTracker while it was stuck in the loop on Linux.
It falls into the loop once all 8 maps are done; the reduce never finishes.
Both nodes are sending heartbeats every 10 secs, and neither is dying.
This is what the WebUI shows for the bad TaskTracker:

Running tasks
Task Attempts   Status  Progress        Errors
task_0001_r_000000_1    RUNNING 16.66%  

Non-Running Tasks
Task Attempts   Status
task_0001_m_000004_0    SUCCEEDED
task_0001_m_000007_0    SUCCEEDED
task_0001_m_000003_0    SUCCEEDED
task_0001_m_000006_0    SUCCEEDED

I put a breakpoint in org.apache.hadoop.ipc.Server.Handler.run(), where the 
calls are processed, at
value = call(call.param);             // make the call

It is processing only the following 3 calls:

ping(task_0001_r_000000_1) from 66.22.15.15:58122
progress(task_0001_r_000000_1, 0.16666667, reduce > copy (4 of 8 at 0.00 MB/s) > , SHUFFLE, [EMAIL PROTECTED]) from 66.22.15.15:58122
getMapCompletionEvents(job_0001, 8, 50) from 66.22.15.15:58122
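
The 16.66% in the WebUI and the 0.16666667 in the progress() call are consistent with "4 of 8" map outputs copied, if one assumes the reduce's progress is split evenly across three phases (copy, sort, reduce). The following is only a sketch of that arithmetic under that assumption; the class and method names are hypothetical and are not Hadoop API.

```java
public class ReduceProgressSketch {
    // Assumption: copy, sort and reduce each contribute one third of the total.
    static final int PHASES = 3;

    // Overall reduce progress while the task is still in the copy phase.
    static float reduceProgress(int copiedOutputs, int totalMaps) {
        if (totalMaps <= 0) {
            throw new IllegalArgumentException("totalMaps must be positive");
        }
        float copyPhase = (float) copiedOutputs / totalMaps; // 4 of 8 -> 0.5
        return copyPhase / PHASES;
    }

    public static void main(String[] args) {
        // 4 of 8 map outputs copied, as in the progress() call above.
        System.out.println(reduceProgress(4, 8)); // prints 0.16666667
    }
}
```

Under this reading the reported value cannot move until a fifth map output is copied, which matches the unchanging progress line.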

Hope this helps.

> TaskTracker falls into an infinite loop.
> ----------------------------------------
>
>                 Key: HADOOP-1374
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1374
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Konstantin Shvachko
>         Assigned To: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: DataNode1.log, DataNode2.log, JobTracker.log, 
> NameNode.log, TaskTracker1.log, TaskTracker2.log, TestDFSIO.log
>
>
> All maps had been completed successfully. I had only one reduce task, during 
> which the TaskTracker infinitely outputs:
> 07/05/15 19:35:41 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > 
> 07/05/15 19:35:42 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > 
> 07/05/15 19:35:43 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > 
> 07/05/15 19:35:44 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > 
> 07/05/15 19:35:45 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > 
> JobTracker does not log anything about task task_0001_r_000000_0 except for
> 07/05/15 19:49:01 INFO mapred.JobTracker: Adding task 'task_0001_r_000000_0' to tip tip_0001_r_000000, for tracker 'tracker_my-host.com:50050'

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
