Randomize the fetch of map outputs
----------------------------------

                 Key: HADOOP-1270
                 URL: https://issues.apache.org/jira/browse/HADOOP-1270
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
            Reporter: Arun C Murthy
             Fix For: 0.13.0


HADOOP-248 did away with random probing of maps for locating map outputs; 
we now rely on TaskCompletionEvents instead. 

However, we lost a side benefit of the random probing: it spread the requests 
out, so a map's jetty wasn't overloaded with requests for its outputs. We now 
have a situation where a map completes, the JT is notified, *all* the reduces 
get the TaskCompletionEvent and pretty much swamp the poor map's jetty, and 
this repeats for each map.

I propose we make a minor change where we collect a set of TaskCompletionEvents 
and randomize the list before firing the fetches. This should help fix the 
mass-hysteria at the map's jetty.
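
To make the proposal concrete, here is a rough sketch of the idea, not the 
actual reduce-side code. The class FetchOrderSketch, the method 
randomizedFetchOrder and the plain String map ids are hypothetical stand-ins 
for whatever the reduce really keeps per TaskCompletionEvent; only 
java.util.Collections.shuffle is a real API here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: each reduce batches the completed-map notifications it
// has received and shuffles the batch before scheduling fetches, so that
// different reduces hit the maps in different orders.
class FetchOrderSketch {

    // Returns a shuffled copy of the batch; the caller's list is left untouched.
    // The String ids stand in for TaskCompletionEvents (an assumption).
    static List<String> randomizedFetchOrder(List<String> completedMapIds, Random rng) {
        List<String> order = new ArrayList<>(completedMapIds);
        Collections.shuffle(order, rng);
        return order;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("map_0", "map_1", "map_2", "map_3");
        // Each reduce would use its own Random, so the orders differ per reduce.
        for (String mapId : randomizedFetchOrder(batch, new Random())) {
            System.out.println("fetch output of " + mapId);
        }
    }
}
```

The shuffle only helps if events are batched first; with a batch of one, every 
reduce still goes to the same map at the same time.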

Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.