[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214751#comment-14214751
 ] 

Junping Du commented on MAPREDUCE-6164:
---------------------------------------

Delivered a quick patch to fix it.

> "mapreduce.reduce.shuffle.fetch.retry.timeout-ms" should be set to 3 minutes 
> instead of 30 seconds by default to be consistent with other retry timeout 
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6164
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6164
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: MAPREDUCE-6164.patch
>
>
> In MAPREDUCE-5891, we added retry logic to the MapReduce shuffle stage so 
> that fetchers can survive NM downtime (with the shuffle service down as 
> well). In many places, we set the default timeout to 3 minutes (connection 
> timeout, etc.) to tolerate a possibly longer NM outage, but we set 
> "mapreduce.reduce.shuffle.fetch.retry.timeout-ms" to 30 seconds, which is 
> inconsistent. We should change this to 180 seconds. 
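>
> For illustration only, a minimal sketch of the proposed value as an explicit 
> override, assuming it is placed in mapred-site.xml (the file is an assumption; 
> the attached patch changes the shipped default instead). The property takes 
> milliseconds, so 180 seconds is written as 180000:
>
>     <property>
>       <!-- assumed override; 180000 ms = 180 s = 3 minutes -->
>       <name>mapreduce.reduce.shuffle.fetch.retry.timeout-ms</name>
>       <value>180000</value>
>     </property>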



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
