[
https://issues.apache.org/jira/browse/MAPREDUCE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241525#comment-14241525
]
Dave Marion commented on MAPREDUCE-4815:
----------------------------------------
I think we might be seeing a side effect of patch #8. What we are seeing is an
output directory being created underneath the location where it should be. For
example, if we expect files in /dir1/dir2, there are times when we see
/dir1/dir2/dir2. I think the problem stems from mergePaths now being called
from commitTask, which opens a race condition when two tasks complete at the
same time. Specifically, it's the last case in mergePaths, when 'to' does not
exist, so it calls rename.
I traced this, hopefully correctly, to FSNamesystem.renameToInternal(), which
has a nasty comment about doing something that it shouldn't. It also appears to
be what creates /dir1/dir2/dir2. I think this is a bug in FSNamesystem. For
example, given
  from = /pathA/dir1/dir2
  to = /pathB/dir1/dir2
what happens when two processes call fs.rename(from, to) at the same time?
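To make the suspected interleaving concrete, here is a minimal sketch in Java.
It does not use the Hadoop API: RenameRaceSketch is a hypothetical in-memory
model, the /tmp/taskA, /tmp/taskB and /out paths are made up for illustration,
and the rule that rename moves the source underneath a destination that
already exists as a directory is an assumption taken from the behavior
described above. The two tasks are interleaved by hand rather than with real
threads, so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy in-memory namespace; a hypothetical model, not the Hadoop FileSystem API. */
class RenameRaceSketch {
    /** Directory paths that currently exist in the toy namespace. */
    private final Set<String> dirs = new TreeSet<>();

    void mkdirs(String path) {
        dirs.add(path);
    }

    boolean exists(String path) {
        return dirs.contains(path);
    }

    /**
     * Assumed rename semantic: if 'to' already exists as a directory,
     * 'from' is moved underneath it, keeping its last path component.
     */
    boolean rename(String from, String to) {
        if (!dirs.contains(from)) {
            return false;
        }
        String name = from.substring(from.lastIndexOf('/') + 1);
        String target = dirs.contains(to) ? to + "/" + name : to;
        if (dirs.contains(target)) {
            return false;
        }
        // Move 'from' and everything below it to the new location.
        List<String> moved = new ArrayList<>();
        for (String d : dirs) {
            if (d.equals(from) || d.startsWith(from + "/")) {
                moved.add(d);
            }
        }
        for (String d : moved) {
            dirs.remove(d);
            dirs.add(target + d.substring(from.length()));
        }
        return true;
    }

    /** Replays the suspected interleaving of two committing tasks. */
    static Set<String> simulateRace() {
        RenameRaceSketch fs = new RenameRaceSketch();
        String fromA = "/tmp/taskA/dir1/dir2"; // hypothetical task output
        String fromB = "/tmp/taskB/dir1/dir2"; // hypothetical task output
        String to = "/out/dir1/dir2";          // hypothetical final location
        fs.mkdirs(fromA);
        fs.mkdirs(fromB);

        // Both tasks check the destination before either one has renamed.
        boolean taskASeesTo = fs.exists(to);
        boolean taskBSeesTo = fs.exists(to);

        // Both take the "destination does not exist, just rename" branch.
        if (!taskASeesTo) {
            fs.rename(fromA, to); // creates /out/dir1/dir2
        }
        if (!taskBSeesTo) {
            fs.rename(fromB, to); // 'to' exists now, so this one nests
        }
        return fs.dirs;
    }

    public static void main(String[] args) {
        System.out.println(simulateRace());
    }
}
```

In this model the second rename lands in /out/dir1/dir2/dir2, which matches
the nesting we observe. If the real rename behaves this way, the
check-then-rename sequence would need to be atomic, or the second caller would
need to fall back to merging into the existing directory.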
> FileOutputCommitter.commitJob can be very slow for jobs with many output files
> ------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4815
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4815
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.3, 2.0.1-alpha, 2.4.1
> Reporter: Jason Lowe
> Assignee: Siqi Li
> Attachments: MAPREDUCE-4815.v3.patch, MAPREDUCE-4815.v4.patch,
> MAPREDUCE-4815.v5.patch, MAPREDUCE-4815.v6.patch, MAPREDUCE-4815.v7.patch,
> MAPREDUCE-4815.v8.patch
>
>
> If a job generates many files to commit then the commitJob method call at the
> end of the job can take minutes. This is a performance regression from 1.x,
> as 1.x had the tasks commit directly to the final output directory as they
> were completing and commitJob had very little to do. The commit work was
> processed in parallel and overlapped the processing of outstanding tasks. In
> 0.23/2.x, the commit is single-threaded and waits until all tasks have
> completed before commencing.