[ https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428975#comment-15428975 ]
Chen He commented on HADOOP-9565:
---------------------------------

From our experience, the main renaming overhead comes from
"FileOutputCommitter.commitTask()", because it moves the files from the temp
dir to the dest dir. Some frameworks may not care whether the final task files
are under "dst/_temporary/0/_temporary/" or "dst/". Why don't we add a
parameter such as "mapreduce.skip.task.commit" (default false), so that once a
task is done, its output just stays in "dst/_temporary/0/_temporary/"? The
next job or application then only needs to take "dst/" as its input dir; it
does not care whether the files are nested deeply or not. This avoids the
atomic-write issue, provides compatibility, and avoids the rename overhead.
If there is no objection, I will create a JIRA to track that.

> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
>                 Key: HADOOP-9565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9565
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, fs/s3, fs/swift
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>            Assignee: Pieter Reuse
>         Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch,
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch,
> HADOOP-9565-006.patch, HADOOP-9565-branch-2-007.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are
> really blobstores, with different atomicity and consistency guarantees, by
> adding a {{Blobstore}} interface to add to them.
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that
> all blobstores implement a server-side copy operation as a substitute for
> rename.
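
To make the proposal in the comment concrete, here is a minimal sketch in Python on a local temp directory. It is a toy model, not Hadoop code: "mapreduce.skip.task.commit" is the parameter proposed in the comment and does not exist in Hadoop, and all function names (commit_task, list_input_files, run_demo) and the attempt id are hypothetical. It also assumes the downstream reader lists "dst/" recursively.

```python
import os
import tempfile

# Proposed in the comment above; hypothetical, not an existing Hadoop setting.
SKIP_TASK_COMMIT = "mapreduce.skip.task.commit"


def task_attempt_dir(dst, attempt_id):
    # Mirrors the "dst/_temporary/0/_temporary/" layout named in the comment.
    return os.path.join(dst, "_temporary", "0", "_temporary", attempt_id)


def write_task_output(dst, attempt_id, name, data):
    # A task writes its output under its own attempt directory.
    attempt = task_attempt_dir(dst, attempt_id)
    os.makedirs(attempt, exist_ok=True)
    path = os.path.join(attempt, name)
    with open(path, "w") as f:
        f.write(data)
    return path


def commit_task(dst, attempt_id, conf):
    # Simplified stand-in for task commit: one rename per output file,
    # unless the (hypothetical) skip flag is set. Returns the rename count.
    if conf.get(SKIP_TASK_COMMIT, False):
        return 0
    src = task_attempt_dir(dst, attempt_id)
    renames = 0
    for name in sorted(os.listdir(src)):
        os.rename(os.path.join(src, name), os.path.join(dst, name))
        renames += 1
    return renames


def list_input_files(dst):
    # Assumed downstream behaviour: walk dst/ recursively, ignoring depth.
    found = []
    for root, _dirs, files in os.walk(dst):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), dst))
    return sorted(found)


def run_demo(skip):
    # Run one task with or without the skip flag and report what happened.
    with tempfile.TemporaryDirectory() as dst:
        write_task_output(dst, "attempt_0", "part-00000", "a\n")
        renames = commit_task(dst, "attempt_0", {SKIP_TASK_COMMIT: skip})
        return renames, list_input_files(dst)
```

With the flag off, the file is renamed into "dst/"; with it on, no rename happens and the file is only found because the reader walks the tree. On a blobstore, where a rename is typically a copy followed by a delete, the skipped renames are where the saving would come from.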