Craig Welch created MAPREDUCE-6251:
--------------------------------------
Summary: JobClient needs additional retries at a higher level to
address not-immediately-consistent dfs corner cases
Key: MAPREDUCE-6251
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6251
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver
Affects Versions: 2.2.0
Reporter: Craig Welch
Assignee: Craig Welch
The JobClient is used to get status information for running and completed
jobs. The final state and history of a job are communicated from the
application master to the job history server through a distributed file
system: the application master uploads the history to the DFS, and the job
history server then scans and loads it. While HDFS has strong consistency
guarantees, not all Hadoop-compatible distributed file systems do. When Hadoop
is used with a distributed file system that lacks this guarantee, there will be
cases where the history server does not yet see an uploaded file, resulting in
the dreaded "no such job" and a null value for the RunningJob in the client.
Additional retries at a higher level in the JobClient would cover these cases.
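
To illustrate the kind of higher-level retry meant here, the following is a
minimal sketch, not the actual patch. The class name JobLookupRetry, the method
retryUntilNonNull, and the parameters maxAttempts and sleepMillis are all
hypothetical names chosen for this example. It assumes that a null result means
"history not visible yet" and that waiting briefly and asking again is
acceptable to the caller.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of Hadoop.
class JobLookupRetry {

    /**
     * Calls lookup until it returns a non-null value or maxAttempts
     * calls have been made, sleeping sleepMillis between attempts.
     * Returns null if every attempt returned null or the thread was
     * interrupted while waiting.
     */
    static <T> T retryUntilNonNull(Supplier<T> lookup,
                                   int maxAttempts,
                                   long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = lookup.get();
            if (result != null) {
                return result; // the history became visible
            }
            if (attempt < maxAttempts) {
                try {
                    // Give the file system time to become consistent.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and stop retrying.
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null; // still "no such job" after all attempts
    }
}
```

In a client, the supplier would presumably wrap the existing lookup (for
example JobClient.getJob(JobID), with its IOException handled by the caller),
so that a null RunningJob triggers another attempt instead of being returned
immediately. The attempt count and delay would need to be configurable, since
the time a file system takes to become consistent is not known in advance.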