[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428975#comment-15428975
 ] 

Chen He commented on HADOOP-9565:
---------------------------------

>In our experience, the main rename overhead comes from 
>"FileOutputCommitter.commitTask()", because it moves the files from the temp 
>dir to the dest dir. Some frameworks may not care whether the final task files 
>are under "dst/_temporary/0/_temporary/" or "dst/". Why don't we add a 
>parameter such as "mapreduce.skip.task.commit" (default false), so that 
>once a task is done, its output just stays in "dst/_temporary/0/_temporary/"? 
>Then the next job or application just needs to take "dst/" as its input dir; 
>it does not care whether the files are deep or not. This avoids the 
>atomic-write issue, provides compatibility, and avoids the rename overhead. 
>If there is no objection, I will create a JIRA to track that.
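
To make the proposal concrete, here is a minimal sketch of the behaviour described above. It is a toy model on the local filesystem using only java.nio, not Hadoop's FileOutputCommitter; the class name, the method names, and the boolean standing in for the proposed "mapreduce.skip.task.commit" parameter are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Toy model of the proposal; hypothetical, not Hadoop's FileOutputCommitter. */
public class SkipTaskCommitSketch {

    /** Directory a task attempt writes to, following the layout quoted in the comment. */
    public static Path taskAttemptDir(Path dst, String attemptId) {
        return dst.resolve("_temporary").resolve("0").resolve("_temporary").resolve(attemptId);
    }

    /**
     * Commits a task. With skipTaskCommit=false every file is renamed into dst
     * (the costly step on a blobstore); with true the files are left in place.
     * Returns the number of renames performed.
     */
    public static int commitTask(Path dst, String attemptId, boolean skipTaskCommit)
            throws IOException {
        if (skipTaskCommit) {
            return 0; // output stays under dst/_temporary/0/_temporary/
        }
        List<Path> files;
        try (Stream<Path> s = Files.list(taskAttemptDir(dst, attemptId))) {
            files = s.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int renames = 0;
        for (Path f : files) {
            Files.move(f, dst.resolve(f.getFileName()), StandardCopyOption.REPLACE_EXISTING);
            renames++;
        }
        return renames;
    }

    /** What a depth-agnostic consumer would do: list every regular file under dst. */
    public static List<Path> listOutputs(Path dst) throws IOException {
        try (Stream<Path> s = Files.walk(dst)) {
            return s.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dst = Files.createTempDirectory("dst");
        Path attempt = Files.createDirectories(taskAttemptDir(dst, "attempt_0"));
        Files.write(attempt.resolve("part-00000"), "data".getBytes());

        int renames = commitTask(dst, "attempt_0", true);
        System.out.println("renames=" + renames + " outputs=" + listOutputs(dst));
    }
}
```

With the flag set, no rename happens and the consumer still finds the part file by walking "dst/" recursively; with the flag unset, each file costs one move. The sketch only works because the consumer lists recursively, which is the assumption the proposal places on the next job.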

> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
>                 Key: HADOOP-9565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9565
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, fs/s3, fs/swift
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>            Assignee: Pieter Reuse
>         Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch, 
> HADOOP-9565-006.patch, HADOOP-9565-branch-2-007.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
> all blobstores implement a server-side copy operation as a substitute for 
> rename.
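
A rough sketch of what such an interface might look like follows. It is illustrative only and is not taken from the attached patches: the interface shape, the in-memory store, and the use of plain String keys in place of Hadoop's {{Path}} are all assumptions, chosen so the example runs without Hadoop on the classpath.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a Blobstore marker interface with a copy operation. */
public class BlobstoreSketch {

    /** Marks a store as a blobstore, i.e. one where rename is not atomic. */
    public interface Blobstore {
        /** Server-side copy of one object: the store, not the client, moves the bytes. */
        void copy(String src, String dst) throws IOException;
    }

    /** In-memory stand-in for an object store keyed by full path. */
    public static class InMemoryBlobstore implements Blobstore {
        private final Map<String, byte[]> objects = new TreeMap<>();

        public void put(String key, byte[] data) {
            objects.put(key, data.clone());
        }

        public boolean exists(String key) {
            return objects.containsKey(key);
        }

        @Override
        public void copy(String src, String dst) throws IOException {
            byte[] data = objects.get(src);
            if (data == null) {
                throw new FileNotFoundException(src);
            }
            objects.put(dst, data.clone());
        }

        /** Rename emulated as copy followed by delete: two steps, so not atomic. */
        public void rename(String src, String dst) throws IOException {
            copy(src, dst);
            objects.remove(src);
        }
    }

    /** Callers can branch on the interface instead of guessing from the URI scheme. */
    public static boolean isBlobstore(Object fs) {
        return fs instanceof Blobstore;
    }

    public static void main(String[] args) throws IOException {
        InMemoryBlobstore store = new InMemoryBlobstore();
        store.put("dst/_temporary/part-00000", new byte[] {1, 2, 3});
        store.copy("dst/_temporary/part-00000", "dst/part-00000");
        System.out.println("isBlobstore=" + isBlobstore(store)
                + " copied=" + store.exists("dst/part-00000"));
    }
}
```

The point of the marker is that a committer or other client could check for it and choose copy over rename where atomic rename cannot be assumed.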



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
