[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927024#action_12927024
 ] 

Owen O'Malley commented on MAPREDUCE-2168:
------------------------------------------

I misread your problem. A single reduce won't slam a single node, but the 
entire set of reduces will. The new shuffle does a better job of backing off, 
but the fundamental problem is that in order to know which map a given 
connection is looking for, the code needs to accept it. Once it has accepted 
the connection, it is better to service the request than to put it back on 
the queue.
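
To illustrate why any per-job limit can only be applied after the accept, here 
is a minimal sketch of a per-job cap on concurrent fetches. It is not Hadoop 
code: the class name PerJobShuffleLimiter, its methods, and the idea of keying 
on a job id string parsed from the request are all assumptions made for the 
example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical per-job cap on concurrent shuffle fetches; not Hadoop code. */
final class PerJobShuffleLimiter {
    private final int maxPerJob;
    private final ConcurrentHashMap<String, Semaphore> permits =
            new ConcurrentHashMap<>();

    PerJobShuffleLimiter(int maxPerJob) {
        this.maxPerJob = maxPerJob;
    }

    /**
     * Can only be called once the connection has been accepted and the
     * request parsed, because the job id is not known before that point.
     * Returns false when the job already holds maxPerJob permits.
     */
    boolean tryStart(String jobId) {
        return permits
                .computeIfAbsent(jobId, id -> new Semaphore(maxPerJob))
                .tryAcquire();
    }

    /** Must be called exactly once after each successful tryStart. */
    void finish(String jobId) {
        Semaphore s = permits.get(jobId);
        if (s != null) {
            s.release();
        }
    }
}
```

A handler built on this would call tryStart after reading the request and, on 
false, would have to reject or delay a connection that already occupies a 
worker thread, which is the cost described above.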

You might try upgrading Jetty. The version of Jetty that we are currently 
using has some overly aggressive locking that leads to under-utilization. 
See HADOOP-6882.

> We should  implement limits on shuffle connections to TaskTracker per job
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2168
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2168
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Liyin Liang
>
> As trailing map tasks will be attacked by all reduces simultaneously, all the 
> worker threads of a TaskTracker's HTTP server may be occupied by one job's 
> reduce tasks fetching map outputs. The TaskTracker's iowait and load then 
> become very high (100+ in our cluster, where we set 
> tasktracker.http.threads to 100). What's more, other jobs' reduces have to 
> wait some time (possibly several minutes) to connect to the TaskTracker to 
> fetch their maps' outputs.
> So I think we should implement limits on shuffle connections:
> 1. limit the number (perhaps as a percentage) of worker threads occupied by 
> the same job's reduces;
> 2. limit the number of worker threads serving the same map output 
> simultaneously.
> Thoughts? 
> PS: we are using Hadoop 0.19.
