[jira] Commented: (HADOOP-2119) JobTracker becomes non-responsive if the task trackers finish task too fast

Vivek Ratan (JIRA) Wed, 06 Feb 2008 20:45:32 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566446#action_12566446 ]


Vivek Ratan commented on HADOOP-2119:
-------------------------------------


>> However, I do believe this is a good time to start thinking about a better 
>> overall approach - especially given that HADOOP-1985 (rack-aware Map-Reduce 
>> scheduling) is almost upon us ..

I don't think Approaches 1 and 2 are necessarily short-term. At least that's 
not the way we've been thinking about them (i.e., my recommendations are not 
based on short-term or long-term considerations). My view is that the first two 
approaches are simple enough, and probably solve most of the problem, so it's 
worth trying them out and then measuring performance. The other approaches 
require much more coding, and the performance gains may not be worth the added 
complexity, even in the longer term. We can determine that once we implement 
1 or 2 and see where the bottlenecks are and how often they occur. 

Also, we've gone through all these approaches with the rack-awareness 
implementation in mind. They handle rack-awareness just fine. 

You should definitely put down your design as well. 

> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-2119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2119
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Critical
>             Fix For: 0.17.0
>
>         Attachments: hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
>
>
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The job tracker lags behind in committing completed mapper tasks.
> The number of running mappers displayed on the web UI keeps getting bigger.
> The job tracker eventually stopped responding to the web UI.
> No progress is reported afterwards.
> The job tracker is running on a separate node.
> The job tracker process consumed 100% CPU, with a VM size of 1.01g (reaching 
> the heap space limit).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
