[
https://issues.apache.org/jira/browse/HADOOP-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595989#action_12595989
]
Amareshwari Sriramadasu commented on HADOOP-2759:
-------------------------------------------------
Here is a use case: creation of side files by a task.
Every running job has a temporary output directory,
${mapred.output.dir}/_temporary, which is deleted at the completion of the
job. A job can be declared SUCCESSFUL before its speculative tasks are killed.
If a speculative task creates a side file after _temporary has been deleted,
then, because the _create_ API creates any missing parent directories,
_temporary is recreated inside the output directory. As a result, a
_temporary directory may be present in an output directory that is read by a
chained job.
More discussion is available at
http://issues.apache.org/jira/browse/HADOOP-2391?focusedCommentId=12566183#action_12566183
and
http://issues.apache.org/jira/browse/HADOOP-2391?focusedCommentId=12566761#action_12566761
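The race above can be sketched with plain java.nio.file as a local stand-in for the HDFS API (the class name, paths, and directory layout are hypothetical, chosen only for illustration): Files.createDirectories plays the role of the recursive create() that resurrects _temporary, while Files.createFile behaves like the non-recursive create this issue argues for.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class SideFileRace {
    public static void main(String[] args) throws IOException {
        // Hypothetical local stand-in for ${mapred.output.dir}.
        Path outputDir = Files.createTempDirectory("mapred-output");
        Path temporary = outputDir.resolve("_temporary");

        // The job has been declared SUCCESSFUL and the framework has
        // deleted _temporary (here it simply never existed).

        // A straggling speculative task now writes a side file. A
        // recursive create, analogous to HDFS's create(), silently
        // resurrects the _temporary directory:
        Path sideFile = temporary.resolve("attempt_0").resolve("part-00000");
        Files.createDirectories(sideFile.getParent()); // recreates _temporary
        Files.createFile(sideFile);
        System.out.println("_temporary exists again: " + Files.exists(temporary));

        // A non-recursive create refuses instead, which is the behavior
        // this issue argues for:
        Path orphan = outputDir.resolve("gone").resolve("part-00000");
        try {
            Files.createFile(orphan); // parent "gone" does not exist
        } catch (NoSuchFileException e) {
            System.out.println("non-recursive create refused: " + e.getFile());
        }
    }
}
```

With non-recursive semantics the straggler's write fails, the chained job never sees a stale _temporary, and cleanup done at job completion stays done.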
> creating a file in hdfs should not automatically create the parent directories
> ------------------------------------------------------------------------------
>
> Key: HADOOP-2759
> URL: https://issues.apache.org/jira/browse/HADOOP-2759
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.17.0
> Reporter: Owen O'Malley
> Assignee: Pi Song
> Fix For: 0.18.0
>
> Attachments: hadoop-2759-complete1.patch, HADOOP-2759_1.patch,
> hadoop_tmp.patch
>
>
> I think it would be better if HDFS didn't automatically create directories
> for the user. In particular, in cleanup code, it would be nice if deleting a
> directory couldn't be undone by mistake when a process that hasn't been
> killed yet creates a new file.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.