[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198324#comment-15198324
 ] 

Aaron Tokhy commented on HADOOP-9565:
-------------------------------------

FileOutputCommitter recently introduced the notion of an 'algorithm version' 
(set through mapreduce.fileoutputcommitter.algorithm.version).  Perhaps the 
change to FileOutputCommitter could add another algorithm version (3) that 
performs a direct commit, separate from this change?  Ideally, 
FileOutputCommitter would then be split into 3 subclasses, 1 per algorithm 
version.

Another boolean configuration could be added to mapred-default.xml, called 
mapreduce.fileoutputcommitter.infer.algorithm.version, which would select the 
appropriate FileOutputCommitter algorithm based on the FileSystem for the Path 
in mapreduce.output.fileoutputformat.outputdir.  MRAppMaster could then select 
the appropriate algorithm depending on the semantics available in the target 
filesystem.
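
A minimal sketch of the inference described above, assuming the decision is 
keyed on the URI scheme of the output directory.  The class name, the set of 
blobstore schemes and the value 3 for a direct commit are illustrative 
assumptions, not existing Hadoop code; a real implementation would more likely 
ask the FileSystem instance itself about its rename semantics.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class CommitAlgorithmSelector {
    // Hypothetical: version 3 is the proposed "direct commit" algorithm.
    static final int DIRECT_COMMIT_VERSION = 3;

    // Assumption: schemes treated as blobstores without atomic directory rename.
    private static final Set<String> BLOBSTORE_SCHEMES =
            new HashSet<>(Arrays.asList("s3a", "s3n", "swift"));

    /**
     * @param outputDir         value of mapreduce.output.fileoutputformat.outputdir
     * @param inferVersion      value of the proposed
     *                          mapreduce.fileoutputcommitter.infer.algorithm.version
     * @param configuredVersion value of mapreduce.fileoutputcommitter.algorithm.version
     */
    static int selectVersion(URI outputDir, boolean inferVersion, int configuredVersion) {
        if (!inferVersion) {
            // Inference disabled: honour the explicitly configured version.
            return configuredVersion;
        }
        String scheme = outputDir.getScheme();
        if (scheme != null
                && BLOBSTORE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT))) {
            return DIRECT_COMMIT_VERSION;
        }
        // Unknown or rename-capable filesystem: keep the configured version.
        return configuredVersion;
    }

    public static void main(String[] args) {
        System.out.println(selectVersion(URI.create("s3a://bucket/out"), true, 1));  // 3
        System.out.println(selectVersion(URI.create("hdfs://nn/out"), true, 2));     // 2
        System.out.println(selectVersion(URI.create("s3a://bucket/out"), false, 1)); // 1
    }
}
```

Falling back to the configured version whenever inference is off or the scheme 
is unrecognised keeps the new property opt-in.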

It might also make sense to break the FileOutputCommitter algorithms up into 
separate classes to accommodate these changes, since a given algorithm may not 
work well on some FileSystem implementations (especially because it attempts 
atomic renames on FileSystem implementations where directory renames are not 
atomic).
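
One possible shape for such a split, sketched with one class per algorithm 
version behind a common interface.  All names here are hypothetical, and 
version 3 is the proposed direct commit rather than anything that exists in 
FileOutputCommitter; the methods only record whether each phase relies on 
rename, which is the property that matters for blobstores.

```java
public class CommitStrategySketch {

    /** Hypothetical common interface, one implementation per algorithm version. */
    interface CommitAlgorithm {
        int version();
        boolean renamesOnTaskCommit();
        boolean renamesOnJobCommit();
    }

    /** v1: task commit renames the attempt dir; job commit merges into the output dir. */
    static final class AlgorithmV1 implements CommitAlgorithm {
        public int version() { return 1; }
        public boolean renamesOnTaskCommit() { return true; }
        public boolean renamesOnJobCommit() { return true; }
    }

    /** v2: task commit moves files straight into the output dir; job commit does not. */
    static final class AlgorithmV2 implements CommitAlgorithm {
        public int version() { return 2; }
        public boolean renamesOnTaskCommit() { return true; }
        public boolean renamesOnJobCommit() { return false; }
    }

    /** v3 (proposed, hypothetical): tasks write to the final location, no rename at all. */
    static final class DirectCommitV3 implements CommitAlgorithm {
        public int version() { return 3; }
        public boolean renamesOnTaskCommit() { return false; }
        public boolean renamesOnJobCommit() { return false; }
    }

    /** Maps mapreduce.fileoutputcommitter.algorithm.version to an implementation. */
    static CommitAlgorithm forVersion(int version) {
        switch (version) {
            case 1: return new AlgorithmV1();
            case 2: return new AlgorithmV2();
            case 3: return new DirectCommitV3();
            default:
                throw new IllegalArgumentException(
                        "Unsupported algorithm version: " + version);
        }
    }

    public static void main(String[] args) {
        for (int v = 1; v <= 3; v++) {
            CommitAlgorithm a = forVersion(v);
            System.out.println("v" + a.version()
                    + " taskRename=" + a.renamesOnTaskCommit()
                    + " jobRename=" + a.renamesOnJobCommit());
        }
    }
}
```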

> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
>                 Key: HADOOP-9565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9565
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, fs/s3, fs/swift
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>            Assignee: Thomas Demoor
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
> all blobstores implement a server-side copy operation as a substitute for 
> rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
