[ 
https://issues.apache.org/jira/browse/HADOOP-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665705#action_12665705
 ] 

Jothi Padmanabhan commented on HADOOP-3327:
-------------------------------------------

Looks good. A few points:
* Some comments on the changes in the code would be good.
* Should the percentages that we use to decide maxNotifications and 
fetchRetriesPerMap be configurable?
* Since fetchRetriesPerMap is recomputed on every iteration from the current 
copiedMapOutputs.size, we might delay a notification to the JT by one 
failure. For example, consider maxFetchRetriesPerMap = 5 and numRetries = 4. 
On the next failure numRetries = 5, and let us say we cross the threshold and 
reset fetchRetriesPerMap = 2 (5/2). With the existing logic, we would have 
sent a notification because numRetries = maxFetchRetriesPerMap. With the new 
logic we will wait, because 5 % 2 != 0. This is a corner case, though, and 
can probably be overlooked.
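
To make the corner case in the last point concrete, here is a minimal sketch 
of the arithmetic only. It is not the code from the patch: the class and 
method names are hypothetical, and modelling both checks as a modulo test is 
an assumption (for the existing logic it agrees with the equality check at 
numRetries = 5).

```java
// Hypothetical illustration of the corner case; not taken from the patch.
class FetchRetryCornerCase {

    // Assumed shape of the existing check: the limit stays fixed.
    static boolean notifiesWithFixedLimit(int numRetries, int maxFetchRetriesPerMap) {
        return numRetries % maxFetchRetriesPerMap == 0;
    }

    // Assumed shape of the new check: the limit is recomputed each iteration.
    static boolean notifiesWithRecomputedLimit(int numRetries, int fetchRetriesPerMap) {
        return numRetries % fetchRetriesPerMap == 0;
    }

    public static void main(String[] args) {
        int maxFetchRetriesPerMap = 5;
        // The threshold is crossed on the fifth failure, so the limit is halved (5/2 = 2).
        int fetchRetriesPerMap = maxFetchRetriesPerMap / 2;

        // Compare both checks on the fifth and sixth failure.
        for (int numRetries = 5; numRetries <= 6; numRetries++) {
            System.out.println("numRetries=" + numRetries
                + " fixedLimit=" + notifiesWithFixedLimit(numRetries, maxFetchRetriesPerMap)
                + " recomputedLimit=" + notifiesWithRecomputedLimit(numRetries, fetchRetriesPerMap));
        }
    }
}
```

Under these assumptions the fixed limit notifies on the fifth failure, while 
the recomputed limit skips it and notifies on the sixth, which is the 
one-failure delay described above.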

> Shuffling fetchers waited too long between map output fetch re-tries
> --------------------------------------------------------------------
>
>                 Key: HADOOP-3327
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3327
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: hadoop-3327-v1.patch, hadoop-3327-v2.patch, 
> hadoop-3327-v3.patch, hadoop-3327.patch, patch-3327.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
