[
https://issues.apache.org/jira/browse/HADOOP-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606485#action_12606485
]
Doug Cutting commented on HADOOP-3598:
--------------------------------------
If back-compatibility is your concern, then leave getWorkOutputPath() alone and
add a new protected method createWorkOutputPath() or somesuch that creates the
directory and returns the path within it. Duplicating the same four lines in
every subclass is not good, especially when these four lines check the
existence of the same directory twice.
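
A minimal sketch of the shape suggested above, not the actual Hadoop code. It uses java.nio.file in place of the Hadoop FileSystem API so that it stands alone; the class name TaskOutput, the constructor arguments, and the file-name parameter of createWorkOutputPath() are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an output-format base class; not Hadoop's real API.
class TaskOutput {
    // Staging directory, laid out as <output dir>/_temporary/_<taskid>.
    private final Path workDir;

    TaskOutput(Path outputDir, String taskId) {
        this.workDir = outputDir.resolve("_temporary").resolve("_" + taskId);
    }

    // Left alone for back-compatibility: returns the path and never
    // touches the file system.
    Path getWorkOutputPath() {
        return workDir;
    }

    // New helper: creates the staging directory on demand and returns the
    // path of the named file within it. Subclasses call this instead of
    // repeating the create-if-missing logic themselves.
    protected Path createWorkOutputPath(String name) throws IOException {
        // createDirectories succeeds if the directory already exists,
        // so no separate existence check is needed here.
        Files.createDirectories(workDir);
        return workDir.resolve(name);
    }
}
```

With this shape, a task that never calls createWorkOutputPath() never creates its _${taskid} directory, and the creation logic lives in one place rather than in every subclass.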
> Map-Reduce framework needlessly creates temporary _${taskid} directories for
> Maps
> ---------------------------------------------------------------------------------
>
> Key: HADOOP-3598
> URL: https://issues.apache.org/jira/browse/HADOOP-3598
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Blocker
> Fix For: 0.18.0
>
> Attachments: HADOOP-3598_0_20080619.patch
>
>
> The staging directory for task outputs (i.e.
> ${mapred.out.dir}/_temporary/_${taskid}) should only be created when Maps
> produce output on HDFS, which usually isn't the case. Creating it
> unconditionally plays very badly with HDFS quotas and may add thousands of
> temporary names to the FS namespace, thereby exceeding the quotas. In any
> case, it isn't good to needlessly create these directories.