[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622385#comment-14622385
 ] 

Thomas Demoor commented on HADOOP-9565:
---------------------------------------

Thanks for your review [~eddyxu]. Indeed, the patch contains code similar to 
that of [HADOOP-11525] as [~ste...@apache.org] previously merged that into 
003.patch. My additions to his work (diff between 003.patch and 004.patch) are 
mainly to the FileOutputCommitter.

h4.FileOutputCommitter
Most parts of the larger ecosystem use FileOutputCommitter: MapReduce, Spark, 
Tez, ... HBase is a notable counterexample: I don't know whether it uses rename 
a lot, but I do know that append is typically not supported on object stores 
(so I think HBase on an object store is not a very good use case). I think you 
have a better view of the ecosystem than I do: do you know of Hadoop filesystem 
"users" that do not use FileOutputCommitter but do "commit by renaming" 
themselves? I am willing to try to optimize them for object stores, if the 
use case makes sense. I will have a look at making {{distcp}} object-store aware 
as well (so expect 005.patch :P).

Currently, what we do is basically {{if(canWeUseAtomicWrite())}}, which then 
triggers one of the 2 (relatively) separate code paths. I don't think there's a 
simple one-liner one can change from {{write()}} to {{atomicWrite()}}: as HDFS 
is a single-writer POSIX-style filesystem, {{write()}} is accompanied by a 
whole scheme of other operations that together make up the (higher-level) 
atomic "commit operation". For instance for the FileOutputCommitter: object 
stores want a clean {{atomicWrite()}} but for HDFS it uses a scheme of 
temporary directories for each attempt and at the end it commits the 
"successful" attempt by renaming and deletes all other attempts. I do not see 
how that can be replaced by a simple interface {{write()}}/{{atomicWrite()}} 
while keeping backwards-compatibility (FileOutputCommitter is hardcoded in lots 
of applications), but suggestions would be very welcome.
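
To make the two code paths concrete, here is a minimal sketch of the branching described above. It is not the code in 004.patch: {{CommitSketch}}, {{commitAtomic}} and {{commitByRename}} are hypothetical names, {{canWeUseAtomicWrite}} is reduced to a boolean parameter, and a local temp directory via {{java.nio.file}} stands in for both HDFS and the object store.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.stream.Stream;

public class CommitSketch {

    /** Mirrors the if(canWeUseAtomicWrite()) branch; the flag is an assumption of this sketch. */
    static void commit(boolean canWeUseAtomicWrite, Path outputDir, String name,
                       byte[][] attemptData, int winner) throws IOException {
        if (canWeUseAtomicWrite) {
            commitAtomic(outputDir.resolve(name), attemptData[winner]);
        } else {
            commitByRename(outputDir, name, attemptData, winner);
        }
    }

    /** Object-store path: one write straight to the destination.
     *  Files.write is only a stand-in here; it is not atomic on a local disk. */
    static void commitAtomic(Path dest, byte[] data) throws IOException {
        Files.write(dest, data);
    }

    /** HDFS-style path: each attempt writes to its own temporary directory,
     *  the successful attempt is renamed into place, the rest are deleted. */
    static void commitByRename(Path outputDir, String name,
                               byte[][] attemptData, int winner) throws IOException {
        Path tmp = outputDir.resolve("_temporary");
        for (int i = 0; i < attemptData.length; i++) {
            Path attemptDir = tmp.resolve("attempt_" + i);
            Files.createDirectories(attemptDir);
            Files.write(attemptDir.resolve(name), attemptData[i]);
        }
        Path src = tmp.resolve("attempt_" + winner).resolve(name);
        Files.move(src, outputDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
        deleteRecursively(tmp);
    }

    /** Deletes children before parents by walking the tree in reverse order. */
    static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).map(Path::toFile).forEach(File::delete);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[][] attempts = {
            "attempt 0".getBytes(StandardCharsets.UTF_8),
            "attempt 1".getBytes(StandardCharsets.UTF_8)
        };
        Path hdfsLike = Files.createTempDirectory("rename-commit");
        Path blobLike = Files.createTempDirectory("atomic-commit");

        commit(false, hdfsLike, "part-00000", attempts, 1);
        commit(true, blobLike, "part-00000", attempts, 1);

        System.out.println(new String(
            Files.readAllBytes(hdfsLike.resolve("part-00000")), StandardCharsets.UTF_8));
        System.out.println(new String(
            Files.readAllBytes(blobLike.resolve("part-00000")), StandardCharsets.UTF_8));
    }
}
```

Even in this toy form, the rename path is a scheme of several operations (mkdir, write, rename, recursive delete) rather than a single call, which is why swapping {{write()}} for {{atomicWrite()}} in one place is not enough.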

h4.Slow / Async operations
I think the flags for slow operations are a good idea. I think 
[~ste...@apache.org]'s comment above on adding async operations takes that idea 
one step further. Some renames / deletes can be async, others can't, but 
evidently only the client can know if async is possible for their codepath. I 
think these are good ideas to offer more options to users in the future, but it 
will require some PR to get them picked up. The code that is currently in the 
patch aims at things we can improve in a manner that is invisible to end-users. 
I think therefore these are best tracked in separate issues. 
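
As a rough illustration of why only the client can decide, here is a sketch in which the same async delete is awaited in one place and not in another. Everything in it is hypothetical: {{FakeStore}} is an in-memory stand-in for a blob store, and {{deleteAsync}} is not an existing or proposed Hadoop API.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncDeleteSketch {

    /** In-memory stand-in for a blob store's key space (hypothetical). */
    static final class FakeStore {
        private final Set<String> keys = ConcurrentHashMap.newKeySet();
        void put(String key) { keys.add(key); }
        void delete(String key) { keys.remove(key); }
        boolean exists(String key) { return keys.contains(key); }
    }

    /** Starts the delete on the pool and returns immediately;
     *  the caller decides whether to wait on the returned future. */
    static CompletableFuture<Void> deleteAsync(FakeStore store, String key,
                                               ExecutorService pool) {
        return CompletableFuture.runAsync(() -> store.delete(key), pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        FakeStore store = new FakeStore();
        store.put("_temporary/attempt_0/part-00000");
        store.put("output/part-00000");

        // Cleanup of leftover attempt data: nothing later in this codepath
        // depends on it, so the client can let it run in the background.
        CompletableFuture<Void> cleanup =
            deleteAsync(store, "_temporary/attempt_0/part-00000", pool);

        // Delete-before-rewrite: the next put must not race with the delete,
        // so this client has to wait for completion.
        deleteAsync(store, "output/part-00000", pool).join();
        store.put("output/part-00000");

        cleanup.join(); // waited on here only so the demo finishes deterministically
        pool.shutdown();

        System.out.println("output exists: " + store.exists("output/part-00000"));
        System.out.println("temp exists: "
            + store.exists("_temporary/attempt_0/part-00000"));
    }
}
```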

[~ste...@apache.org] and [~ndimiduk]: afaik TTL is a feature of AWS CloudFront 
(cache): i.e. TTL in the cache, get from S3 after that. This does not affect 
the S3 object. Furthermore, S3 has an "Object Expiration" feature, but its 
policy is defined per bucket so I'm not sure it's directly applicable here.

> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
>                 Key: HADOOP-9565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9565
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, fs/s3, fs/swift
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch, HADOOP-9565-004.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
> all blobstores implement a server-side copy operation as a substitute for 
> rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
