Spurious EOFExceptions reading SpillRecord index files
------------------------------------------------------
Key: MAPREDUCE-2389
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2389
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 0.22.0
Environment: Seen on RHEL 5.5, RHEL 6.0, local dirs on ext3, Java 6u20
and 6u24
Reporter: Todd Lipcon
Priority: Critical
In large jobs, I see roughly 1 in every million shuffle fetches fail with an
EOFException while reading the SpillRecord index file. After extensive
investigation, including tracing with systemtap, it looks like the read()
syscall is returning a premature "0" result (end of file) with no apparent
cause. This is therefore likely a kernel or filesystem bug that is
exacerbated by some workload the TaskTracker (TT) generates.
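
To make the symptom concrete, here is a minimal Java sketch of a read loop
that separates a premature EOF from a file that really is short. It is not
Hadoop code and not a proposed fix. The class PrematureEofCheck, the method
readExactly, and the expectedLen parameter are hypothetical names introduced
only for illustration. On end of stream it compares the bytes read so far
against the size the filesystem reports.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical diagnostic helper; not part of Hadoop. */
final class PrematureEofCheck {

    /**
     * Reads exactly expectedLen bytes from the start of the file.
     * If the stream reports end of file early, the exception message
     * says whether the file on disk is large enough, which is what
     * separates a premature EOF from a truncated file.
     */
    static byte[] readExactly(Path file, int expectedLen) throws IOException {
        byte[] buf = new byte[expectedLen];
        int off = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (off < expectedLen) {
                // A negative return means the stream reported end of file.
                int n = in.read(buf, off, expectedLen - off);
                if (n < 0) {
                    long sizeOnDisk = Files.size(file);
                    String kind = sizeOnDisk >= expectedLen
                            ? "premature EOF"
                            : "file is short";
                    throw new EOFException("EOF after " + off + " of "
                            + expectedLen + " bytes; size on disk is "
                            + sizeOnDisk + " (" + kind + ")");
                }
                off += n;
            }
        }
        return buf;
    }
}
```

A caller would pass the index file path and the length it expects to read.
A "premature EOF" message would match the behaviour described above, while
"file is short" would point to a writer-side problem instead.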