[
https://issues.apache.org/jira/browse/MAPREDUCE-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366138#comment-14366138
]
Jason Lowe commented on MAPREDUCE-6275:
---------------------------------------
Thanks for working on the patch, Gera and Siqi!
I'm a little concerned about this portion of the patch:
{noformat}
+ if (from.isFile()) {
+ if (toStat != null) {
+ if (!fs.delete(to, true)) {
+ throw new IOException("Failed to delete " + to);
+ }
+ }
{noformat}
If we're trying to commit a file and there's a directory with the same name
there, do we really want to recursively delete that directory? Seems to me
something is terribly amiss with the output (buggy output committer code?) and
we should fail here rather than delete something we don't really understand.
Am I missing a valid use-case? There's going to be some issues if a reducer
wants to commit a file to that path while another reducer who wants to commit a
directory to that path is still committing.
Nit: Not introduced by this change, but the test should use
System.getProperty(java.io.tmpdir) instead of a hardcoded "/tmp". Also
"output" isn't a very unique pathname in light of unit tests running in
parallel, and I'd recommend using the qualified classname as a prefix or just
as-is to avoid collisions.
Otherwise patch looks good.
> Race condition in FileOutputCommitter v2 for user-specified task output
> subdirs
> -------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6275
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6275
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Siqi Li
> Assignee: Gera Shegalov
> Priority: Critical
> Attachments: MAPREDUCE-6275.002.patch, MAPREDUCE-6275.v1.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)