[ 
https://issues.apache.org/jira/browse/HADOOP-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HADOOP-1105:
--------------------------------

    Attachment: 1105.patch

This patch makes some fields volatile (the fields that reportProgress uses, 
since reportProgress runs in a separate thread) instead of making a 
"synchronized" call in ReduceTask. The main change here (in the patches) is 
that reportProgress is no longer called on every invocation of reducer(key, 
value[]); instead, the progress field is just set, and the thread does the 
actual job of reporting it to the TaskTracker. This avoids the overhead of 
making RPC connections to an overloaded/slow TaskTracker inline with the 
reducer method invocations. The same applies to the Reporter.setStatus call 
(in Task.java). This should improve overall reducer performance.
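
For readers without the patch at hand, the idea can be sketched roughly as 
follows. This is not the code from 1105.patch: the names ProgressSketch, 
Umbilical, and intervalMillis are made up for illustration, and the RPC to the 
TaskTracker is replaced by a plain callback. It only shows the pattern of 
cheap volatile writes on the reducer thread and a separate thread that does 
the reporting.

```java
// Illustrative sketch only; all names here are hypothetical, not Hadoop's.
class ProgressSketch {
    // Stand-in for the (possibly slow) RPC to the TaskTracker.
    interface Umbilical {
        void report(float progress, String status);
    }

    // Written by the reducer thread, read by the reporter thread.
    // volatile gives visibility across threads without a synchronized block.
    private volatile float progress;
    private volatile String status = "";
    private volatile boolean dirty;          // true when there is something new to send
    private volatile boolean running = true;

    private final Umbilical umbilical;
    private final Thread reporter;

    ProgressSketch(Umbilical umbilical, long intervalMillis) {
        this.umbilical = umbilical;
        this.reporter = new Thread(() -> {
            while (running) {
                flush();
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    break;                   // stop() was called
                }
            }
            flush();                         // send the last update before exiting
        });
        this.reporter.setDaemon(true);
    }

    void start() {
        reporter.start();
    }

    // Called inline from the reduce loop: just field writes, no RPC.
    void setProgress(float p) {
        progress = p;
        dirty = true;
    }

    void setStatus(String s) {
        status = s;
        dirty = true;
    }

    // Only ever called from the reporter thread.
    private void flush() {
        if (dirty) {
            dirty = false;                   // clear first so a concurrent update is not lost
            umbilical.report(progress, status);
        }
    }

    void stop() {
        running = false;
        reporter.interrupt();
        try {
            reporter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ProgressSketch task = new ProgressSketch(
            (p, s) -> System.out.println("report: " + p + " " + s), 100);
        task.start();
        for (int i = 1; i <= 1000; i++) {
            task.setProgress(i / 1000f);     // one cheap write per value
        }
        task.setStatus("reduce done");
        task.stop();
    }
}
```

With this shape, the number of reports depends on the reporting interval 
rather than on the number of values the reducer iterates over.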


> Reducers don't make "progress" while iterating through values
> -------------------------------------------------------------
>
>                 Key: HADOOP-1105
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1105
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.0
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>             Fix For: 0.12.3
>
>         Attachments: 1105.patch, 1105.patch
>
>
> Reduces make progress when they go to a new key, but not when they read the 
> next value, which could cause reduces to time out when they have a lot of 
> values for the same key.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
