[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506616#comment-13506616 ]

Jason Lowe commented on MAPREDUCE-4819:
---------------------------------------

bq. As far as YARN-244 is concerned the comments around the code seem to 
suggest that it was an explicit decision to cleanup before unregistering. 

Yes, it was explicitly done as a workaround to ensure the staging directory was 
cleaned up before the RM shot the AM.  Previously there was a race where the AM 
was trying to delete the staging directory while the RM was shooting the AM 
after it unregistered, and the AM often lost, leaving the staging directory 
behind.  Since then we've added a FINISHING state to allow the AM to clean up 
before the RM tries to shoot it.  Given that, we should move the staging 
directory cleanup back to after we unregister (but not in this JIRA).
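
In case it helps make the intended ordering concrete, here's a rough sketch of 
the shutdown sequence I'm describing.  The method names are made up for 
illustration and are not the actual MRAppMaster code:

{code:java}
// Rough sketch only -- hypothetical helper names, not the real AM code.
void shutdownAppMaster() throws IOException {
  // 1. Tell the RM we're done.  With the new FINISHING state the RM
  //    gives the AM time to wind down instead of shooting it right away.
  unregisterFromResourceManager(finalJobStatus);

  // 2. Only after unregistering, delete the staging directory.  If the
  //    AM dies before this point, a later attempt can still recover.
  cleanupStagingDir();
}
{code}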

bq. I am not quite clear why the commit would be repeated if the job does not 
execute any task at all?

The job will only avoid committing if it sees the job completion event written 
to the history file, but that event is written *after* committing.  Therefore, 
if we commit and then crash before we sync that completion event to disk, the 
second attempt will try to commit again.  And we're seeing a number of cases 
where the AM crashed after committing but before completing job history.  It 
should be relatively rare, but it can happen and is happening.
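
To make the window concrete, the sequence is roughly the following 
(hypothetical handler/file names, not the actual job history code):

{code:java}
// Illustrative sketch of the crash window being described.
committer.commitJob(jobContext);          // output is now committed
// <-- crash here: nothing durable records that the commit happened
historyHandler.handle(jobFinishedEvent);  // completion event queued
historyFile.sync();                       // only now is the event on disk
// A second AM attempt replaying history up to the crash point sees no
// completion event, so it will call commitJob() again.
{code}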

bq. the commit code seems to be user pluggable code. In that case, how can we 
ensure that every commit implementation can be made into a singleton operation? 
Can it be as simple as a committer refusing to commit if the output file 
already exists? Are committers allowed to delete an output file if it exists? 
In that case how does it differentiate between a checkpointed commit from a 
previous crashed run vs an old commit from a successful job?

The committer is user-pluggable code and therefore can do *arbitrary* things.  
It doesn't have to be files.  It can be a database commit, a web service 
transaction, a custom job-end notification mechanism, or whatever.  Therefore 
we cannot assume the commit is recoverable -- there's a reason the job fails 
when the committer says it failed: we can't safely retry the commit.  In the 
future maybe we can extend the committer API to allow the committer to say it 
can attempt to recover from a job commit failure, but for now we can't tell.  
That's why re-running a commit is Not Good.
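
Just to sketch what such an extension might look like -- this is purely 
hypothetical, nothing like it exists in the OutputCommitter API today:

{code:java}
public abstract class OutputCommitter {
  // ... existing setupJob/commitJob/abortJob methods ...

  // Hypothetical: a committer overrides this to declare that calling
  // commitJob() a second time after a crash is safe.
  public boolean isJobCommitRepeatable(JobContext context)
      throws IOException {
    return false;  // conservative default: commit is not repeatable
  }
}
{code}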

bq. On a side note, we should be encouraging projects that depend on output 
markers for job completion polling, to stop doing that and start using API's. 
Perhaps in the next version change. Continuing to support these kind of use 
cases could make solutions more complex and fragile than they need to be.

The file marker thing is just one committer's way of handling things.  
Committers can do *arbitrary* things.  The job doesn't even have to produce 
output as files, for example.  It's pluggable for a reason, and we can't know 
or assume what it's doing.  We can only give it interfaces and restrictions 
(hopefully as few as possible) to govern how it interoperates with the rest of 
the job framework.
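
As a purely illustrative example (not a committer that exists in the codebase), 
a committer whose "output" is a database transaction has no marker file anyone 
could poll for, and re-running its commit is not obviously safe:

{code:java}
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Illustrative only; "example.jdbc.url" is a made-up config key.
public class JdbcCommitter extends OutputCommitter {
  @Override
  public void setupJob(JobContext context) throws IOException {
    // e.g. create a staging table (omitted)
  }

  @Override
  public void setupTask(TaskAttemptContext context) throws IOException {
  }

  @Override
  public boolean needsTaskCommit(TaskAttemptContext context) {
    return false;  // tasks write directly to the staging table
  }

  @Override
  public void commitTask(TaskAttemptContext context) throws IOException {
  }

  @Override
  public void abortTask(TaskAttemptContext context) throws IOException {
  }

  @Override
  public void commitJob(JobContext context) throws IOException {
    // The job "commit" is a database transaction, not a file rename.
    try (Connection conn = DriverManager.getConnection(
        context.getConfiguration().get("example.jdbc.url"))) {
      conn.setAutoCommit(false);
      // ... move rows from the staging table into the final table ...
      conn.commit();
    } catch (SQLException e) {
      throw new IOException(e);
    }
  }
}
{code}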
                
> AM can rerun job after reporting final job status to the client
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-4819
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 0.23.3, 2.0.1-alpha
>            Reporter: Jason Lowe
>            Assignee: Bikas Saha
>            Priority: Critical
>         Attachments: MAPREDUCE-4819.1.patch, MAPREDUCE-4819.2.patch
>
>
> If the AM reports final job status to the client but then crashes before 
> unregistering with the RM then the RM can run another AM attempt.  Currently 
> AM re-attempts assume that the previous attempts did not reach a final job 
> state, and that causes the job to rerun (from scratch, if the output format 
> doesn't support recovery).
> Re-running the job when we've already told the client the final status of the 
> job is bad for a number of reasons.  If the job failed, it's confusing at 
> best since the client was already told the job failed but the subsequent 
> attempt could succeed.  If the job succeeded there could be data loss, as a 
> subsequent job launched by the client tries to consume the job's output as 
> input just as the re-attempt starts removing output files in preparation for 
> the output commit.
