Jason Lowe created MAPREDUCE-6303:
-------------------------------------
Summary: Read timeout when retrying a fetch error can be fatal to a reducer
Key: MAPREDUCE-6303
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Priority: Blocker
If a reducer encounters an error while fetching from a node and then hits a
read timeout while trying to re-establish the connection, the reducer can
fail. The read timeout exception can leak to the top of the Fetcher thread,
which causes the reduce task to tear down. This type of error can repeat
across reducer attempts, causing jobs to fail because of a single bad node.
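
The failure mode described above can be illustrated with a minimal sketch. This
is not the Hadoop Fetcher code and not the fix for this issue; the class
FetchRetrySketch, the Connector interface, the Result enum and the
fetchWithRetry method are all hypothetical names made up for the example. It
only shows the general idea, assuming the reconnect is a call that may throw:
a SocketTimeoutException raised during the retry is handled like any other
fetch failure instead of escaping the thread.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

public class FetchRetrySketch {
    /** Hypothetical stand-in for opening a connection to a map-output host. */
    interface Connector {
        void connect() throws IOException;
    }

    /** Hypothetical outcome of trying to fetch from one host. */
    enum Result { SUCCESS, HOST_FAILED }

    /**
     * Tries to connect up to maxAttempts times. Every IOException, including
     * a SocketTimeoutException thrown while reconnecting, is caught inside
     * the loop, so nothing propagates to the caller's thread.
     */
    static Result fetchWithRetry(Connector connector, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                connector.connect();
                return Result.SUCCESS;
            } catch (IOException e) {
                // SocketTimeoutException is a subclass of IOException, so a
                // read timeout on a retry lands here as well.
                System.err.println("attempt " + attempt + " failed: " + e);
            }
        }
        // Report the host as failed rather than throwing.
        return Result.HOST_FAILED;
    }

    public static void main(String[] args) {
        // A host that times out on every attempt.
        Connector badNode = () -> {
            throw new SocketTimeoutException("Read timed out");
        };
        System.out.println(fetchWithRetry(badNode, 3));

        // A host that fails once, then accepts the retry.
        int[] calls = {0};
        Connector flakyNode = () -> {
            if (calls[0]++ == 0) {
                throw new IOException("connection reset");
            }
        };
        System.out.println(fetchWithRetry(flakyNode, 3));
    }
}
```

In this sketch the caller receives HOST_FAILED for the bad node and can move
on to other hosts, so one unresponsive node does not take the whole task down.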