[ https://issues.apache.org/jira/browse/HADOOP-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635684#action_12635684 ]

Jothi Padmanabhan commented on HADOOP-4246:
-------------------------------------------

A lower bound of 1 on map-fetch-retries may not be robust, as it opens the 
possibility of map re-execution on transient errors: three different reducers 
each reporting a single fetch failure while the serving task tracker has a 
transient problem could trigger re-execution of the map. We should probably 
retry at least twice.
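To illustrate the failure mode being discussed, here is a hedged sketch (not the actual Hadoop 0.19 code; `BACKOFF_INIT_MS`, `rawRetries`, and `clampedRetries` are hypothetical names): if the per-map retry count is derived from how many exponentially growing backoff intervals fit inside `maxMapRunTime`, then a map runtime below the initial backoff yields zero retries, so copy errors are never charged against the reduce task. Applying a floor of 2, as suggested above, avoids both the zero-retry case and re-execution on a single transient error.

```java
public class FetchRetryFloor {
    // Assumed initial fetch backoff of 4 seconds, matching the "less than
    // 4 seconds" threshold mentioned in the issue description.
    static final int BACKOFF_INIT_MS = 4000;

    /** Naive retry count: how many doubling backoff intervals fit in maxMapRunTime. */
    static int rawRetries(int maxMapRunTimeMs) {
        int retries = 0;
        long backoffSum = 0;
        long backoff = BACKOFF_INIT_MS;
        while (backoffSum + backoff <= maxMapRunTimeMs) {
            backoffSum += backoff;
            backoff *= 2;       // exponential backoff between fetch attempts
            retries++;
        }
        return retries;         // 0 whenever maxMapRunTimeMs < BACKOFF_INIT_MS
    }

    /** Clamped version: never fewer than 2 fetch retries per map. */
    static int clampedRetries(int maxMapRunTimeMs) {
        return Math.max(2, rawRetries(maxMapRunTimeMs));
    }

    public static void main(String[] args) {
        // A 3-second map run time drives the raw count to zero,
        // so fetch failures would never accumulate against the task.
        System.out.println(rawRetries(3000));     // 0
        System.out.println(clampedRetries(3000)); // 2
    }
}
```

With the floor in place, even very short maps require at least two failed fetch attempts per reducer before a failure is reported, which is the behavior the comment argues for.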

> Reduce task copy errors may not kill it eventually
> --------------------------------------------------
>
>                 Key: HADOOP-4246
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4246
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4246.txt, patch-4246.txt
>
>
> maxFetchRetriesPerMap in the reduce task can sometimes be zero (when 
> maxMapRunTime is less than 4 seconds or mapred.reduce.copy.backoff is less 
> than 4). In that case, reduce-task copy errors are never counted toward 
> killing the task eventually.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
