[
https://issues.apache.org/jira/browse/MAPREDUCE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687856#comment-13687856
]
Ravi Prakash commented on MAPREDUCE-5317:
-----------------------------------------
Thanks a lot for your detailed review, Jason. Thanks also for summarizing all
the problems.
Daryn and Kihwal also pointed out to me that another problem with using
non-recursive create is that some existing OutputCommitters may depend on the
recursive create in setupTask. I agree with you that we should use
non-recursive create, but can we please pursue that in another JIRA, since it
changes pre-existing behavior?
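To illustrate the difference between the two create modes, here is a minimal
sketch using java.nio.file on the local filesystem as a stand-in. It does not
use Hadoop's FileSystem API, and the class name CreateModes and the directory
names are hypothetical. It only shows why a recursive create issued by a late
task can bring back a parent directory that was already cleaned up, while a
non-recursive create fails instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateModes {
    /** Non-recursive create: fails (returns false) if the parent is missing. */
    public static boolean nonRecursiveCreate(Path taskDir) {
        try {
            Files.createDirectory(taskDir);
            return true;
        } catch (IOException e) {
            // Parent directory does not exist (or the create failed otherwise).
            return false;
        }
    }

    /** Recursive create: silently creates every missing parent as well. */
    public static boolean recursiveCreate(Path taskDir) throws IOException {
        Files.createDirectories(taskDir);
        return Files.isDirectory(taskDir);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout: <output>/_temporary/attempt_0
        Path output = Files.createTempDirectory("job-output");
        Path tempDir = output.resolve("_temporary");
        Path taskDir = tempDir.resolve("attempt_0");

        // Assume cleanup already removed _temporary (here it was never created).
        System.out.println("non-recursive succeeded: " + nonRecursiveCreate(taskDir));
        System.out.println("_temporary exists: " + Files.exists(tempDir));

        // A late task using recursive create resurrects the stale directory.
        System.out.println("recursive succeeded: " + recursiveCreate(taskDir));
        System.out.println("_temporary exists: " + Files.exists(tempDir));
    }
}
```

A committer whose setupTask relies on the parents being created implicitly
would start failing under the non-recursive mode, which is the behavior change
mentioned above.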
This patch simply provides a best-effort guarantee that abortJob will be called
after all the tasks are finished/killed. I think that is a reasonable
expectation on the part of our users who choose to write their own
OutputCommitters. If the timeout expires, we may still be left with stale
files if some tasks are still alive and create the directory after abortJob
has run, but I feel that is a bullet we have to bite.
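The best-effort ordering described above can be sketched roughly as follows.
This is not the code in the patch: the class BestEffortAbort, the Aborter
interface, and the use of a CountDownLatch to track running tasks are all
assumptions made for illustration. The idea is simply to wait for the tasks up
to a timeout and then run the abort either way.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BestEffortAbort {
    /** Hypothetical stand-in for the cleanup an OutputCommitter performs. */
    public interface Aborter {
        void abortJob();
    }

    /**
     * Waits up to timeoutMs for all tasks to finish, then aborts regardless.
     * Returns true if every task finished before the abort ran, false if the
     * timeout expired first (stale files are then still possible).
     */
    public static boolean waitThenAbort(CountDownLatch runningTasks,
                                        long timeoutMs,
                                        Aborter aborter)
            throws InterruptedException {
        boolean allFinished = runningTasks.await(timeoutMs, TimeUnit.MILLISECONDS);
        aborter.abortJob(); // best effort: runs even if the wait timed out
        return allFinished;
    }

    public static void main(String[] args) throws InterruptedException {
        // Two hypothetical tasks; only one reports completion.
        CountDownLatch runningTasks = new CountDownLatch(2);
        runningTasks.countDown();

        boolean clean = waitThenAbort(runningTasks, 100,
                () -> System.out.println("abortJob called"));
        System.out.println("all tasks finished before abort: " + clean);
    }
}
```

When the method returns false, a task that is still alive may create its
directory after the abort has run, which is the residual case accepted above.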
I've incorporated all your comments. I took the liberty of renaming FAIL_ABORT
to FAIL_WAIT because it made a lot more sense to me. Thanks a lot again! :)
> Stale files left behind for failed jobs
> ---------------------------------------
>
> Key: MAPREDUCE-5317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5317
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.8
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Attachments: MAPREDUCE-5317.patch, MAPREDUCE-5317.patch
>
>
> Courtesy [~amar_kamat]!
> {quote}
> We are seeing _temporary files left behind in the output folder if the job
> fails.
> The job failed due to hitting a quota issue.
> I simply ran the randomwriter (from hadoop examples) with the default setting.
> That failed and left behind some stray files.
> {quote}