[
https://issues.apache.org/jira/browse/MAPREDUCE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531436#comment-14531436
]
Jason Lowe commented on MAPREDUCE-6336:
---------------------------------------
I'm +1 for trunk, assuming there haven't been any reports of bad breakage from
those who have enabled it so far.
My only concern with enabling it for branch-2 is that it changes the semantics
of jobs that fail catastrophically (e.g., the AM crashes and exhausts its
retries, or the job is killed via YARN and cannot clean up). In the old scheme
there would normally be no files at the top level of the output directory, only
a temporary directory holding the attempts. In the new scheme the output
directory is very likely to contain partial output (i.e., some files have been
committed to their final destination and others have not). Downstream jobs in
workflows would need to check via other means (e.g., the _SUCCESS file, or
querying the RM/JHS for job status) to verify that the output is complete.
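
To illustrate the kind of check a downstream job would need, here is a minimal
sketch in Python. It assumes the output directory is reachable as a local path;
on HDFS the same existence test would go through the filesystem client instead
(for example `hdfs dfs -test -e <dir>/_SUCCESS`). The function names are
hypothetical and not part of Hadoop.

```python
from pathlib import Path

# Marker file written to the output directory when the job commits successfully.
SUCCESS_MARKER = "_SUCCESS"


def output_is_complete(output_dir):
    """Return True only if the _SUCCESS marker exists in output_dir."""
    return (Path(output_dir) / SUCCESS_MARKER).is_file()


def list_committed_files(output_dir):
    """Return the data file names in output_dir, refusing partial output.

    With the v2 committer, part files can appear at the top level even when
    the job failed, so their presence alone says nothing about completeness.
    """
    out = Path(output_dir)
    if not output_is_complete(out):
        raise RuntimeError(f"{out} has no {SUCCESS_MARKER}; output may be partial")
    # Skip names starting with "_" or ".", which are conventionally hidden
    # (e.g. _SUCCESS, _temporary).
    return sorted(
        p.name
        for p in out.iterdir()
        if p.is_file() and not p.name.startswith(("_", "."))
    )
```

A workflow step would call `list_committed_files(output_dir)` before reading any
part files and treat the raised error as "upstream job did not finish".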
> Enable v2 FileOutputCommitter by default
> ----------------------------------------
>
> Key: MAPREDUCE-6336
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6336
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 2.7.0
> Reporter: Gera Shegalov
> Assignee: Siqi Li
> Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-6336.v1.patch
>
>
> This JIRA proposes enabling the new FileOutputCommitter behavior from
> MAPREDUCE-4815 by default in trunk, and potentially in branch-2.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)