[ 
https://issues.apache.org/jira/browse/HADOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615154#action_12615154
 ] 

Owen O'Malley commented on HADOOP-3150:
---------------------------------------

I think you may be right that we want to have an OutputCommitter, but it should 
*not* be determined by the OutputFormat. Rather, it should be configured 
independently. In particular, we can default it to "FileOutputCommitter" with 
roughly the current semantics. My concern with having the OutputFormat create 
the OutputCommitter is that it makes the API more complex, and the application 
may want to write side files with a non-file output format.

I'd propose something like:
{code}
public abstract class OutputCommitter {
  public abstract void setupJob(JobContext context) throws IOException;
  public abstract void commitJob(JobContext context) throws IOException;
  public abstract void abortJob(JobContext context) throws IOException;
  public abstract void setupTask(TaskAttemptContext context) throws IOException;
  public abstract boolean needsTaskCommit(TaskAttemptContext context)
      throws IOException;
  public abstract void commitTask(TaskAttemptContext context)
      throws IOException;
  public abstract void abortTask(TaskAttemptContext context) throws IOException;
}

public class FileOutputCommitter extends OutputCommitter {
  public Path getWorkPath(Path basePath) throws IOException;
}

public class JobConf {
  public OutputCommitter getOutputCommitter();
}
{code}
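
As a rough illustration of configuring the committer independently of the OutputFormat, the accessor could look up a class name in the job configuration and fall back to FileOutputCommitter. This is only a sketch under assumptions: JobConfSketch, the property key, and NoOpCommitter are hypothetical names, a plain map stands in for the real configuration, and the committer classes are empty stand-ins for the ones proposed above.

```java
import java.util.HashMap;
import java.util.Map;

// Empty stand-ins for the classes proposed above (methods omitted for brevity).
abstract class OutputCommitter { }

class FileOutputCommitter extends OutputCommitter { }

// Hypothetical committer for output that has no files to promote.
class NoOpCommitter extends OutputCommitter { }

// Hypothetical stand-in for JobConf; a plain map replaces the real configuration.
class JobConfSketch {
    // Assumed property name, chosen only for this illustration.
    static final String COMMITTER_KEY = "example.output.committer.class";

    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) {
        props.put(key, value);
    }

    // Instantiates the configured committer, defaulting to FileOutputCommitter.
    OutputCommitter getOutputCommitter() {
        String name = props.getOrDefault(
            COMMITTER_KEY, FileOutputCommitter.class.getName());
        try {
            return Class.forName(name)
                .asSubclass(OutputCommitter.class)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot create committer " + name, e);
        }
    }
}
```

With nothing set, getOutputCommitter() yields a FileOutputCommitter; setting the key to another OutputCommitter subclass swaps it in without touching the OutputFormat.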

We need the needsTaskCommit test to optimize the very typical case where 
there is nothing to commit and thus no point in a round trip to the 
JobTracker. The FileOutputFormat would check whether the OutputCommitter is a 
FileOutputCommitter and, if so, use its getWorkPath.
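
A minimal sketch of those two behaviours, under stated assumptions: TaskSideSketch, finishTask, resolveOutputDir, the wroteOutput flag, and the "_work" suffix are hypothetical; paths are plain strings rather than Path objects; the Runnable is only a placeholder for the JobTracker round trip; and falling back to the base path for other committers is my guess, since that case is not specified above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TaskAttemptContext.
class TaskAttemptContext {
    final String attemptId;
    final boolean wroteOutput;

    TaskAttemptContext(String attemptId, boolean wroteOutput) {
        this.attemptId = attemptId;
        this.wroteOutput = wroteOutput;
    }
}

// Task-level subset of the proposed OutputCommitter.
abstract class OutputCommitter {
    public abstract boolean needsTaskCommit(TaskAttemptContext context)
        throws IOException;
    public abstract void commitTask(TaskAttemptContext context)
        throws IOException;
}

// Toy committer: records committed attempts instead of moving files.
class FileOutputCommitter extends OutputCommitter {
    final List<String> committed = new ArrayList<>();

    // Assumed layout; the "_work" suffix is a placeholder.
    public String getWorkPath(String basePath) {
        return basePath + "/_work";
    }

    @Override
    public boolean needsTaskCommit(TaskAttemptContext context) {
        return context.wroteOutput;
    }

    @Override
    public void commitTask(TaskAttemptContext context) {
        committed.add(context.attemptId);
    }
}

class TaskSideSketch {
    // Skips the round trip entirely when there is nothing to commit.
    static boolean finishTask(OutputCommitter committer,
                              TaskAttemptContext context,
                              Runnable jobTrackerRoundTrip) throws IOException {
        if (!committer.needsTaskCommit(context)) {
            return false;
        }
        jobTrackerRoundTrip.run(); // placeholder for asking permission to commit
        committer.commitTask(context);
        return true;
    }

    // What a file-based output format might do when choosing where to write.
    static String resolveOutputDir(OutputCommitter committer, String basePath) {
        if (committer instanceof FileOutputCommitter) {
            return ((FileOutputCommitter) committer).getWorkPath(basePath);
        }
        return basePath; // assumption: other committers write in place
    }

    public static void main(String[] args) throws IOException {
        FileOutputCommitter committer = new FileOutputCommitter();
        TaskAttemptContext idle = new TaskAttemptContext("attempt_0", false);
        System.out.println("committed: "
            + finishTask(committer, idle, () -> { }));
        System.out.println("write to: " + resolveOutputDir(committer, "/out"));
    }
}
```

The instanceof check keeps getWorkPath off the base OutputCommitter class, so committers for non-file output never have to implement it.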

Thoughts?

> Move task file promotion into the task
> --------------------------------------
>
>                 Key: HADOOP-3150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3150
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.19.0
>
>         Attachments: 3150.patch, patch-3150.txt, patch-3150.txt
>
>
> We need to move the task file promotion from the JobTracker to the Task and 
> move it down into the output format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
