[ 
https://issues.apache.org/jira/browse/HADOOP-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491774
 ] 

Doug Cutting commented on HADOOP-1270:
--------------------------------------

This looks reasonable to me.  Have you tested yet whether it improves 
performance?

> Randomize the fetch of map outputs
> ----------------------------------
>
>                 Key: HADOOP-1270
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1270
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.13.0
>
>         Attachments: HADOOP-1270_20070425_1.patch
>
>
> HADOOP-248 did away with random probing of maps for locating map outputs; 
> we now rely on TaskCompletionEvents for that instead. 
> However, we lost a side benefit of the randomized probing: it kept any one 
> map's jetty from being overloaded with requests for its outputs. We now have 
> a situation where a map completes, the JT is notified, *all* the reduces get 
> the TaskCompletionEvent and pretty much swamp the poor map's jetty, and this 
> repeats for each map.
> I propose a minor change: collect a set of TaskCompletionEvents and 
> randomize the list before firing the fetches. This should help fix the 
> mass hysteria at the map's jetty.
> Thoughts?
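
The proposal quoted above (collect a batch of completion events, then 
randomize the order before fetching) could be sketched roughly as follows. 
This is only an illustration and is not taken from the attached patch: the 
CompletedMap class and the randomizeFetchOrder method are hypothetical 
stand-ins, not Hadoop APIs. The only library call relied on is 
java.util.Collections.shuffle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffledFetchSketch {

    /** Hypothetical stand-in for one map-completion event (not a Hadoop class). */
    static final class CompletedMap {
        final String mapId;
        final String host;

        CompletedMap(String mapId, String host) {
            this.mapId = mapId;
            this.host = host;
        }
    }

    /**
     * Returns a shuffled copy of the collected batch.
     * The caller's list is left in its original order.
     */
    static List<CompletedMap> randomizeFetchOrder(List<CompletedMap> batch, Random rng) {
        List<CompletedMap> order = new ArrayList<>(batch);
        Collections.shuffle(order, rng);
        return order;
    }

    public static void main(String[] args) {
        // Pretend these events arrived in completion order.
        List<CompletedMap> batch = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            batch.add(new CompletedMap("map_" + i, "host" + i));
        }
        // Each reduce would use its own Random, so different reduces
        // tend to contact the maps in different orders.
        for (CompletedMap m : randomizeFetchOrder(batch, new Random())) {
            System.out.println("fetch " + m.mapId + " from " + m.host);
        }
    }
}
```

With every reduce shuffling independently, requests for a freshly completed 
map are spread over time instead of arriving all at once. How large a batch 
to collect before shuffling is left open here, as it is in the proposal.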

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
