[
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622385#comment-14622385
]
Thomas Demoor commented on HADOOP-9565:
---------------------------------------
Thanks for your review [~eddyxu]. Indeed, the patch contains code similar to
that of [HADOOP-11525] as [[email protected]] previously merged that into
003.patch. My additions to his work (diff between 003.patch and 004.patch) are
mainly to the FileOutputCommitter.
h4. FileOutputCommitter
Most parts of the larger ecosystem use FileOutputCommitter: MapReduce, Spark,
Tez, ... HBase is a notable counterexample: I don't know if they use rename a
lot, but I do know that append is an operation typically not supported on
object stores (thus I think HBase on an object store is not a very good use
case). I think you have a better view of the ecosystem than I do: do you know
of Hadoop filesystem "users" that do not use FileOutputCommitter but do "commit
by renaming" themselves? I am willing to try to optimize them for object
stores, if the use case makes sense. I will have a look at making {{distcp}}
object-store aware as well (so expect 005.patch :P).
Currently, what we do is basically {{if(canWeUseAtomicWrite())}}, which then
triggers one of the two (relatively) separate code paths. I don't think this
can be reduced to a one-line change from {{write()}} to {{atomicWrite()}}: as
HDFS is a single-writer POSIX-style filesystem, {{write()}} is accompanied by a
whole scheme of other operations that together make up the (higher-level)
atomic "commit operation". Take the FileOutputCommitter: object stores want a
clean {{atomicWrite()}}, but for HDFS it uses a scheme of temporary
directories, one per attempt, and at the end it commits the "successful"
attempt by renaming it and deletes all other attempts. I do not see how that
can be replaced by a simple {{write()}}/{{atomicWrite()}} interface while
keeping backwards compatibility (FileOutputCommitter is hardcoded in lots of
applications), but suggestions would be very welcome.
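To make the two code paths concrete, here is a minimal sketch. It is not the
code in 004.patch: it uses {{java.nio.file}} on a local directory instead of
the Hadoop {{FileSystem}} API, and {{CommitSketch}}, {{commit()}}, the
{{isObjectStore}} flag and the {{_temporary/<attemptId>}} layout are
assumptions made only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical sketch of the two commit paths; not the patch itself. */
final class CommitSketch {

    /** Stand-in for the real check; here the caller simply supplies a flag. */
    static boolean canWeUseAtomicWrite(boolean isObjectStore) {
        return isObjectStore;
    }

    static void commit(Path outputDir, String fileName, String attemptId,
                       byte[] data, boolean isObjectStore) throws IOException {
        Files.createDirectories(outputDir);
        if (canWeUseAtomicWrite(isObjectStore)) {
            // Object-store path: one write straight to the final name.
            Files.write(outputDir.resolve(fileName), data);
        } else {
            // HDFS-style path: write into a per-attempt temporary directory,
            // rename the successful attempt into place, then delete the rest.
            Path tempRoot = outputDir.resolve("_temporary");
            Path attemptDir = tempRoot.resolve(attemptId);
            Files.createDirectories(attemptDir);
            Path tmp = attemptDir.resolve(fileName);
            Files.write(tmp, data);
            Files.move(tmp, outputDir.resolve(fileName),
                       StandardCopyOption.ATOMIC_MOVE);
            deleteRecursively(tempRoot);
        }
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

The rename branch carries the extra directory bookkeeping that a bare
{{write()}}/{{atomicWrite()}} pair would not express, which is why the switch
happens at the level of the whole commit rather than at a single call.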
h4. Slow / Async operations
I think the flags for slow operations are a good idea. I think
[[email protected]]'s comment above on adding async operations takes that idea
one step further. Some renames / deletes can be async, others can't, but
evidently only the client can know whether async is possible for its code
path. I think these are good ideas to offer more options to users in the
future, but it will require some PR to get them picked up. The code that is
currently in the patch aims at things we can improve in a manner that is
invisible to end users. I therefore think the flags and async operations are
best tracked in separate issues.
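As a rough illustration of how slow-operation flags and async variants could
fit together, with the client deciding whether async is acceptable, consider
the sketch below. Nothing in it exists in Hadoop or in the patch: {{Op}},
{{SketchStore}}, {{Cleanup}} and all their methods are hypothetical names, and
the store is just an in-memory key set.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical operation names used as "slow" flags. */
enum Op { RENAME, DELETE }

/** Hypothetical in-memory store that advertises which operations are slow. */
final class SketchStore {
    private final Set<Op> slowOps;
    private final Set<String> keys = ConcurrentHashMap.newKeySet();

    SketchStore(Set<Op> slowOps) {
        this.slowOps = new HashSet<>(slowOps);
    }

    boolean isSlow(Op op) { return slowOps.contains(op); }

    void put(String key) { keys.add(key); }

    boolean exists(String key) { return keys.contains(key); }

    void delete(String key) { keys.remove(key); }

    /** Async variant: runs the same delete on a background thread. */
    CompletableFuture<Void> deleteAsync(String key) {
        return CompletableFuture.runAsync(() -> delete(key));
    }
}

final class Cleanup {
    /**
     * Only the caller knows whether later steps depend on the delete having
     * finished, so it passes that knowledge in explicitly.
     */
    static CompletableFuture<Void> deleteTemp(SketchStore store, String key,
                                              boolean mustFinishFirst) {
        if (store.isSlow(Op.DELETE) && !mustFinishFirst) {
            return store.deleteAsync(key);
        }
        store.delete(key);
        return CompletableFuture.completedFuture(null);
    }
}
```

The flag alone never triggers async behaviour here; it only makes the async
route available when the caller has said it is safe.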
[[email protected]] and [~ndimiduk]: afaik TTL is a feature of AWS CloudFront
(cache): i.e. the TTL applies in the cache, and after it expires the object is
fetched from S3 again. This does not affect
the S3 object. Furthermore, S3 has an "Object Expiration" feature, but its
policy is defined per bucket so I'm not sure it's directly applicable here.
> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs, fs/s3, fs/swift
> Affects Versions: 2.6.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Labels: BB2015-05-TBR
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch,
> HADOOP-9565-003.patch, HADOOP-9565-004.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are
> really blobstores, with different atomicity and consistency guarantees, by
> adding a {{Blobstore}} interface to them.
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that
> all blobstores implement a server-side copy operation as a substitute for
> rename.