[ https://issues.apache.org/jira/browse/HADOOP-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512206 ]
Arun C Murthy commented on HADOOP-1558:
---------------------------------------

I am firmly on Doug's side of the fence about the need to keep the kernel free of user code. With respect to this issue, however, I'd like to bring some complications to everyone's attention:

Eventually we need to move {{Task.saveOutput}} and {{Task.discardOutput}} to the OutputFormats. However, this means that we *have* to call these methods from the {{JobTracker}}, since only there do we have a global picture of the tasks and the job. That rules out running them in the child JVMs, since I believe doing so isn't feasible performance-wise (I'd love to hear thoughts/arguments/ideas). Hence I agree with Alejandro's take on static output-file handlers, which cannot be user-supplied or user-overridden for now.

> changes to OutputFormat to work on temporary directory to enable re-running crashed jobs (Issue: 1121)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1558
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1558
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
>             Fix For: 0.14.0
>
>         Attachments: hadoop-1558-JUN1007-1934.txt, hadoop-1558-JUN1107-1533.txt
>
>
> Add OutputFormat methods like:
>   /** Called to initialize output for this job. */
>   void initialize(JobConf job) throws IOException;
>   /** Called to finalize output for this job. */
>   void commit(JobConf job) throws IOException;
> In the base implementation for FileSystem output, initialize() might then
> create a temporary directory for the job, removing any that already exists,
> and commit could rename the temporary output directory to the final name.
> The existing checkOutputSpecs() would continue to throw an exception if the
> final output already exists.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
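
For readers following the issue description, the proposed initialize()/commit() flow might look roughly like the sketch below. This is only an illustration under stated assumptions: it uses the local filesystem through java.nio.file rather than Hadoop's FileSystem and JobConf, and the class name TempDirOutputSketch, the "_tmp_" prefix, and the tempDir() accessor are all hypothetical, not part of any attached patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for the proposed OutputFormat hooks; not Hadoop's API. */
class TempDirOutputSketch {
    private final Path finalDir;
    private final Path tempDir;

    TempDirOutputSketch(Path finalDir) {
        this.finalDir = finalDir;
        // Assumed naming convention: a "_tmp_"-prefixed sibling of the final
        // directory, so the later rename stays on one filesystem.
        this.tempDir = finalDir.resolveSibling("_tmp_" + finalDir.getFileName());
    }

    Path tempDir() {
        return tempDir;
    }

    /** Mirrors checkOutputSpecs(): fail if the final output already exists. */
    void checkOutputSpecs() throws IOException {
        if (Files.exists(finalDir)) {
            throw new IOException("Output directory already exists: " + finalDir);
        }
    }

    /** Mirrors initialize(): remove any leftover temp directory, then recreate it. */
    void initialize() throws IOException {
        deleteRecursively(tempDir);
        Files.createDirectories(tempDir);
    }

    /** Mirrors commit(): publish the output by renaming temp to the final name. */
    void commit() throws IOException {
        Files.move(tempDir, finalDir, StandardCopyOption.ATOMIC_MOVE);
    }

    private static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> deepestFirst;
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse path order puts children before their parent directories.
            deepestFirst = paths.sorted(Comparator.reverseOrder())
                                .collect(Collectors.toList());
        }
        for (Path p : deepestFirst) {
            Files.delete(p);
        }
    }
}
```

With this shape, a re-run of a crashed job calls initialize() again, which discards the partial output left in the temporary directory, while checkOutputSpecs() still refuses to run once a committed final directory exists.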