[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259603#comment-13259603
 ] 

Robert Joseph Evans commented on MAPREDUCE-1471:
------------------------------------------------

Jim, it should not be too difficult for you to take the existing 
FileOutputCommitter and modify it to do what you want.  Be aware, though, that 
in the 2.0 line FileOutputCommitter has changed so that if an application 
master crashes, it can recover and start over without rerunning anything that 
finished successfully before.  As a result, the directory structure is 
different, and when upmerging to trunk/2.0 you will probably need to modify 
your code.
                
> FileOutputCommitter does not safely clean up it's temporary files
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1471
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1471
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Jim Finnessy
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> When the FileOutputCommitter cleans up in its cleanupJob method, it can 
> delete the temporary files of other concurrent jobs.
> All concurrent jobs write their temporary files to working_path/_temporary/, 
> so any job that shares the same working_path deletes the temporary files of 
> every job that is still running when it removes working_path/_temporary 
> during job cleanup.
> If the client application guarantees that output file names are unique, the 
> temporary files/directories should also be guaranteed to be unique to avoid 
> this problem. Suggest modifying cleanupJob to only remove files that it 
> created itself.
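
The suggestion in the description above can be sketched roughly as follows. 
This is only an illustration, not the actual FileOutputCommitter code: it uses 
plain java.nio instead of the Hadoop FileSystem API, and the class name, the 
method names, and the per-job layout working_path/_temporary/<jobId> are all 
assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * Sketch only: hypothetical names, plain java.nio rather than Hadoop's
 * FileSystem API. Each job gets its own subdirectory under _temporary.
 */
class JobScopedTempCleanup {
    static final String TEMP_DIR_NAME = "_temporary";

    /** Temporary directory owned by one job (assumed layout). */
    static Path jobTempDir(Path workingPath, String jobId) {
        return workingPath.resolve(TEMP_DIR_NAME).resolve(jobId);
    }

    /** Creates the job's own temporary directory and returns it. */
    static Path setupJob(Path workingPath, String jobId) {
        try {
            return Files.createDirectories(jobTempDir(workingPath, jobId));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes only this job's subtree, never the files of other jobs. */
    static void cleanupJob(Path workingPath, String jobId) {
        Path own = jobTempDir(workingPath, jobId);
        try {
            if (Files.exists(own)) {
                List<Path> paths;
                try (Stream<Path> walk = Files.walk(own)) {
                    // Reverse order so children are deleted before parents.
                    paths = walk.sorted(Comparator.reverseOrder())
                                .collect(Collectors.toList());
                }
                for (Path p : paths) {
                    Files.delete(p);
                }
            }
            try {
                // Remove the shared _temporary directory only if it is empty.
                Files.deleteIfExists(workingPath.resolve(TEMP_DIR_NAME));
            } catch (DirectoryNotEmptyException e) {
                // Another job still has temporary files here; leave them alone.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout, calling cleanupJob(workingPath, "job_A") while job_B is 
still running leaves working_path/_temporary/job_B untouched, and the shared 
_temporary directory disappears only after the last job has cleaned up.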

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
