[ https://issues.apache.org/jira/browse/MAPREDUCE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106250#comment-14106250 ]

Ming Ma commented on MAPREDUCE-4815:
------------------------------------

Having the first task's recoverTask recover all succeeded tasks seems to work 
functionality-wise. If the first task's recoverTask fails due to an fs.rename 
exception, that task will be rescheduled, and the second task's recoverTask can 
continue recovering the succeeded tasks.

It does change the semantics of recoverTask, though: recovery is no longer done 
on a per-task basis. But perhaps we can treat it as an optimization; other 
OutputCommitter implementations can still choose to perform recovery on a 
per-task basis.
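The scheme above can be sketched as follows. This is a minimal, hypothetical illustration (the names `recover_all`, `pending`, and `recovered` are mine, not Hadoop's API); the key property it demonstrates is idempotence, so that a rescheduled task can rerun the same recovery loop after the first attempt fails partway through:

```python
def recover_all(succeeded_tasks, pending_dirs, recovered):
    """Move every succeeded task's output; skip tasks already recovered."""
    for task_id in succeeded_tasks:
        if task_id in recovered:
            continue  # already moved by an earlier (failed) recovery attempt
        # In the real committer this step would be an fs.rename of the
        # previous attempt's committed task directory.
        pending_dirs.pop(task_id, None)
        recovered.add(task_id)
    return recovered

# Simulate: the first recovery attempt moves only task 1, then "fails";
# the rescheduled task reruns the full loop and finishes the rest.
pending = {1: "out/task_1", 2: "out/task_2", 3: "out/task_3"}
recovered = set()
recover_all([1], pending, recovered)        # partial first attempt
recover_all([1, 2, 3], pending, recovered)  # rescheduled attempt
assert recovered == {1, 2, 3} and not pending
```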

For the upgrade scenario, how does it clean up the succeeded task-attempt data 
left behind by the old scheme?

> FileOutputCommitter.commitJob can be very slow for jobs with many output files
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4815
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.3, 2.0.1-alpha, 2.4.1
>            Reporter: Jason Lowe
>            Assignee: Siqi Li
>         Attachments: MAPREDUCE-4815.v3.patch, MAPREDUCE-4815.v4.patch, 
> MAPREDUCE-4815.v5.patch
>
>
> If a job generates many files to commit then the commitJob method call at the 
> end of the job can take minutes.  This is a performance regression from 1.x, 
> as 1.x had the tasks commit directly to the final output directory as they 
> were completing and commitJob had very little to do.  The commit work was 
> processed in parallel and overlapped the processing of outstanding tasks.  In 
> 0.23/2.x, the commit is single-threaded and waits until all tasks have 
> completed before commencing.
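The performance difference described above can be sketched with a toy model (not Hadoop code; `commit_serial`, `commit_parallel`, and `rename` are illustrative names). Committing N task outputs one at a time costs roughly N times the per-rename latency, while issuing the mostly I/O-bound renames from a thread pool overlaps them:

```python
from concurrent.futures import ThreadPoolExecutor

def commit_serial(task_dirs, rename):
    # One blocking round-trip per task output, as in the 2.x commitJob.
    for d in task_dirs:
        rename(d)

def commit_parallel(task_dirs, rename, threads=16):
    # Renames of distinct task directories are independent, so they can
    # be issued concurrently; result() re-raises any rename failure.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for f in [pool.submit(rename, d) for d in task_dirs]:
            f.result()

committed = []
commit_parallel([f"out/_temporary/task_{i}" for i in range(100)],
                committed.append)
assert len(committed) == 100
```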



--
This message was sent by Atlassian JIRA
(v6.2#6252)