[jira] [Commented] (MAPREDUCE-6251) JobClient needs additional retries at a higher level to address not-immediately-consistent dfs corner cases

Vinod Kumar Vavilapalli (JIRA) Wed, 06 May 2015 10:22:33 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530973#comment-14530973
 ]


Vinod Kumar Vavilapalli commented on MAPREDUCE-6251:
----------------------------------------------------

bq. Document them in mapred-default.xml? Stating when they are needed, and how 
they should be used in contrast to the lower level retries.
Was this missed? It would help to note that these retries are really only 
needed when dealing with file-systems other than HDFS.

> JobClient needs additional retries at a higher level to address 
> not-immediately-consistent dfs corner cases
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6251
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6251
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, mrv2
>    Affects Versions: 2.6.0
>            Reporter: Craig Welch
>            Assignee: Craig Welch
>              Labels: BB2015-05-TBR
>         Attachments: MAPREDUCE-6251.0.patch, MAPREDUCE-6251.1.patch, 
> MAPREDUCE-6251.2.patch, MAPREDUCE-6251.3.patch, MAPREDUCE-6251.4.patch
>
>
> The JobClient is used to get job status information for running and completed 
> jobs.  Final state and history for a job are communicated from the application 
> master to the job history server via a distributed file system: the history 
> is uploaded by the application master to the dfs and then scanned/loaded by 
> the jobhistory server.  While HDFS has strong consistency guarantees, not all 
> Hadoop DFSs do.  When used in conjunction with a distributed file system 
> which does not have this guarantee, there will be cases where the history 
> server does not see an uploaded file, resulting in the dreaded "no such job" 
> and a null value for the RunningJob in the client.
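
To illustrate the kind of higher-level retry the description asks for, here is a minimal sketch. It is not the code from the attached patches: the class name RetryingLookup, the method retryUntilNonNull, and the parameters maxRetries and retryIntervalMs are all hypothetical, and the lookup is a plain Supplier standing in for whatever client call returns a null RunningJob when the history file is not yet visible.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Hadoop: retries a lookup that may return
// null while a not-immediately-consistent file system catches up.
public class RetryingLookup {

    /**
     * Calls lookup once, then up to maxRetries more times while it keeps
     * returning null, sleeping retryIntervalMs between attempts.
     * Returns the first non-null result, or null if every attempt failed.
     */
    public static <T> T retryUntilNonNull(Supplier<T> lookup,
                                          int maxRetries,
                                          long retryIntervalMs) {
        T result = lookup.get();
        for (int attempt = 0; result == null && attempt < maxRetries; attempt++) {
            try {
                Thread.sleep(retryIntervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and stop retrying.
                Thread.currentThread().interrupt();
                break;
            }
            result = lookup.get();
        }
        return result;
    }

    public static void main(String[] args) {
        // Simulated history server that only "sees" the job on the third lookup.
        int[] calls = {0};
        String job = retryUntilNonNull(
                () -> ++calls[0] < 3 ? null : "job_0001", 5, 10L);
        System.out.println(job + " after " + calls[0] + " lookups");
    }
}
```

With maxRetries set to 0 the helper performs a single lookup, which would suit HDFS, where the file is visible as soon as it is written; a non-zero value would only make sense for file systems without that guarantee.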



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
