[ 
https://issues.apache.org/jira/browse/HADOOP-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675920#action_12675920
 ] 

Hudson commented on HADOOP-5285:
--------------------------------

Integrated in Hadoop-trunk #763 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/763/])
Adding a file that I missed in my earlier commit.
Fixes the following issues: (1) obtainTaskCleanupTask checks whether the job 
is inited before trying to lock the JobInProgress. (2) Moves the CleanupQueue 
class outside the TaskTracker and makes it a generic class that the 
JobTracker also uses for deleting paths on the job's output fs. (3) Moves 
the references to completedJobStore outside the block where the JobTracker 
is locked. Contributed by Devaraj Das.
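
For readers unfamiliar with the pattern behind item (2), here is a minimal 
sketch of a generic cleanup queue: callers enqueue a path and return 
immediately, while a background thread performs the slow deletion. This is 
not the actual Hadoop CleanupQueue. The class name CleanupQueueSketch, the 
type parameter P, the deleter callback, and the addToQueue method are all 
assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch only; the real CleanupQueue in Hadoop may differ.
class CleanupQueueSketch<P> {
    // Paths waiting to be deleted; producers never block on this queue.
    private final BlockingQueue<P> queue = new LinkedBlockingQueue<>();

    // 'deleter' is an assumed callback that performs the (possibly slow)
    // filesystem deletion for one path.
    CleanupQueueSketch(Consumer<P> deleter) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    P path = queue.take(); // blocks until a path is queued
                    try {
                        deleter.accept(path);
                    } catch (RuntimeException e) {
                        // Keep the worker alive if one deletion fails.
                        System.err.println("Cleanup failed for " + path + ": " + e);
                    }
                }
            } catch (InterruptedException e) {
                // Interrupted while waiting: let the worker thread exit.
            }
        }, "cleanup-queue-sketch");
        worker.setDaemon(true);
        worker.start();
    }

    // Called by the tracker; returns without waiting for the deletion.
    void addToQueue(P path) {
        queue.add(path);
    }
}
```

The point of the design is that a thread holding a tracker lock only pays 
for an in-memory enqueue, not for the filesystem call itself.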


> JobTracker hangs for long periods of time
> -----------------------------------------
>
>                 Key: HADOOP-5285
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5285
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Vinod K V
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.20.0, 0.21.0
>
>         Attachments: 5285.1.patch, 5285.patch, trace.txt
>
>
> On one of the larger clusters of 2000 nodes, the JT hung quite often, 
> sometimes for periods on the order of 10-15 minutes and once for one and a 
> half hours (!). The stack trace shows that 
> JobInProgress.obtainTaskCleanupTask() is waiting for the lock on the 
> JobInProgress object, which JobInProgress.initTasks() holds for a long time 
> while waiting for DFS operations.
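
The contention described above, and the check-before-lock fix listed as 
item (1) in the commit message, can be sketched roughly as follows. This is 
an illustration under assumptions, not the actual JobInProgress code: the 
class JobSketch, the AtomicBoolean flag, the Runnable parameter, and the 
returned string are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for JobInProgress; not the actual Hadoop class.
class JobSketch {
    // Set once initialization completes; readable without the job lock.
    private final AtomicBoolean inited = new AtomicBoolean(false);

    // Holds the job lock for the whole initialization, including the
    // slow DFS work, as the report describes for initTasks().
    synchronized void initTasks(Runnable slowDfsWork) {
        slowDfsWork.run();
        inited.set(true);
    }

    // Checks the flag before taking the lock, so a caller gets null
    // right away instead of blocking while initTasks() is running.
    String obtainTaskCleanupTask() {
        if (!inited.get()) {
            return null; // job not inited yet: nothing to clean up
        }
        synchronized (this) {
            return "cleanup-task"; // placeholder for a real task object
        }
    }
}
```

Without the early check, every call to obtainTaskCleanupTask() would queue 
up behind the lock held by initTasks() for as long as the DFS work takes.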

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
