[ 
https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566309#action_12566309
 ] 

Arun C Murthy commented on HADOOP-2119:
---------------------------------------

bq. I think it makes most sense to go with Option 1 for now, as it's the 
easiest to implement and makes the most common case run much faster. Options 3 
and 4 need a fair bit of refactoring and may be overkill for now, since you 
can get the most bang for the buck by just making sure that you don't scan the 
array from the beginning for virgin tasks.

Vivek, it's a fair analysis and I agree it will help in the short run.
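
For readers following along, here is a minimal sketch of the idea behind 
Option 1 as described in the quote above: remember where the last scan stopped 
so that the search for the next virgin task does not restart at index 0. The 
class and method names (VirginTaskCursor, markAssigned, nextVirginTask) and 
the probes counter are hypothetical and exist only for illustration; this is 
not the JobTracker code or the attached patch.

```java
/** Hypothetical illustration only; not Hadoop's actual scheduling code. */
final class VirginTaskCursor {
    // assigned[i] becomes true once task i has been handed to a tracker.
    private final boolean[] assigned;
    // Invariant: every index below firstVirgin is already assigned.
    private int firstVirgin = 0;
    // Counts array slots examined, to show the total work stays linear.
    long probes = 0;

    VirginTaskCursor(int numTasks) {
        assigned = new boolean[numTasks];
    }

    /** Marks a task as assigned out of order (e.g. picked for locality). */
    void markAssigned(int taskIndex) {
        assigned[taskIndex] = true;
    }

    /** Returns the index of the next never-run task, or -1 if none remain. */
    int nextVirginTask() {
        // Resume from the saved cursor instead of rescanning from index 0.
        while (firstVirgin < assigned.length) {
            probes++;
            int candidate = firstVirgin++;
            if (!assigned[candidate]) {
                assigned[candidate] = true;
                return candidate;
            }
        }
        return -1;
    }
}
```

Because the cursor only moves forward, each slot is examined at most once over 
the life of the job, instead of once per scheduling request.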

However, I do believe this is a good time to start thinking about a better 
overall approach - especially given that HADOOP-1985 (rack-aware Map-Reduce 
scheduling) is almost upon us ...

I've had a brief chat with Owen about this and we seem to have different 
approaches - I'll try to put up my thoughts on a completely revamped design 
for the scheduling data-structures in the next few days for consideration.

> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-2119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2119
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Critical
>             Fix For: 0.17.0
>
>         Attachments: hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
>
>
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The jobtracker lagged behind on committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept getting bigger and bigger.
> The job tracker eventually stopped responding to the web UI.
> No progress was reported afterwards.
> The job tracker is running on a separate node.
> The job tracker process consumed 100% cpu, with vm size 1.01g (reaching the heap 
> space limit).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.