[ https://issues.apache.org/jira/browse/MAPREDUCE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod Kumar Vavilapalli updated MAPREDUCE-6303:
-----------------------------------------------
Fix Version/s: 2.6.1
Pulled this into 2.6.1. Ran compilation and TestFetcher before the push. Patch
applied cleanly.
> Read timeout when retrying a fetch error can be fatal to a reducer
> ------------------------------------------------------------------
>
> Key: MAPREDUCE-6303
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Blocker
> Labels: 2.6.1-candidate
> Fix For: 2.7.0, 2.6.1
>
> Attachments: MAPREDUCE-6303.001.patch
>
>
> If a reducer encounters an error while fetching from a node and then hits a
> read timeout while trying to re-establish the connection, the reducer can
> fail. The read timeout exception can leak to the top of the Fetcher thread,
> which causes the reduce task to tear down. This type of error can repeat
> across reducer attempts, causing jobs to fail because of a single bad node.
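
To illustrate the failure mode described above, here is a minimal sketch of
keeping a read timeout on the reconnect path from escaping the fetch loop. It
is not the code from MAPREDUCE-6303.001.patch and does not use the real
Fetcher API: the class FetchRetrySketch, the HostConnection interface, the
Result enum, and the maxReconnects parameter are all hypothetical names
introduced for this example. The only assumption taken from the JDK is that
java.net.SocketTimeoutException is a subclass of IOException.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical sketch; none of these names come from Hadoop.
final class FetchRetrySketch {

    /** Stand-in for opening a shuffle connection and reading map output. */
    public interface HostConnection {
        void connect() throws IOException;
        void readOutput() throws IOException;
    }

    /** Outcome handed back to the caller instead of a thrown exception. */
    public enum Result { SUCCESS, HOST_FAILED }

    /**
     * Tries the fetch once, then reconnects up to maxReconnects times.
     * Every IOException, including a SocketTimeoutException raised while
     * re-establishing the connection, is caught inside the loop, so it
     * cannot propagate to the top of the calling thread.
     */
    public static Result fetchFromHost(HostConnection conn, int maxReconnects) {
        for (int attempt = 0; attempt <= maxReconnects; attempt++) {
            try {
                conn.connect();
                conn.readOutput();
                return Result.SUCCESS;
            } catch (IOException e) {
                // Log and fall through to the next attempt, if any remain.
                System.err.println("attempt " + attempt + " failed: " + e);
            }
        }
        // Report the host as failed so only this host is penalized.
        return Result.HOST_FAILED;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        HostConnection flaky = new HostConnection() {
            @Override
            public void connect() throws IOException {
                // First call: generic fetch error. Later calls: read timeout.
                if (calls[0]++ == 0) {
                    throw new IOException("fetch error");
                }
                throw new SocketTimeoutException("Read timed out");
            }

            @Override
            public void readOutput() {
                // Never reached in this demo because connect() always throws.
            }
        };
        System.out.println(fetchFromHost(flaky, 2));
    }
}
```

The design choice in this sketch is to turn the exception into a return value,
so the caller can mark a single host as bad without the thread exiting.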