[
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567097#comment-14567097
]
Steve Loughran commented on HADOOP-9565:
----------------------------------------
I know that lots of client apps are using the old FileSystem API, but here we
are adding a new feature with new methods, targeted at apps that want to
differentiate real posix-ish filesystems from object stores. This is new code.
I've been thinking recently about the problem of non-atomic and slow operations
(esp. rename and delete).
Would it make sense to add asynchronous operations for these, in both the
object store interface and {{FileContext}}? Something like
{code}
Future renameAsync(Path src, Path dst, boolean recurse)
Future rmAsync(Path path, boolean recurse)
Future copyAsync(Path src, Path dst, boolean recurse)
{code}
All filesystems could use a thread pool for these operations. On a real
filesystem, rename() and rm() would be fast, while copy would be the slow one.
On an object store, rename and rm would be slow; copy could be done remotely if
the store supported a COPY operation, and rename could then be implemented as
COPY+delete. Furthermore, as pointed out to me by [~ndimiduk], for AWS S3 you
can do an async rm by setting the TTL of a path to 1s and letting the object
store do all the heavy lifting.
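To make the thread-pool idea concrete, here is a minimal sketch, not a proposed patch. {{AsyncStoreSketch}} and its methods are hypothetical names, and an in-memory map keyed by strings stands in for the object store (no Hadoop classes are used). It shows rename built as COPY+delete, with all three operations submitted to a shared pool.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical in-memory stand-in for an object store; not a Hadoop class. */
class AsyncStoreSketch implements AutoCloseable {
  private final Map<String, byte[]> objects = new ConcurrentHashMap<>();
  // Assumption: a small fixed pool; a real store would size and share this.
  private final ExecutorService pool = Executors.newFixedThreadPool(4);

  void put(String key, byte[] data) {
    objects.put(key, data.clone());
  }

  boolean exists(String key) {
    return objects.containsKey(key);
  }

  /** Copy within the store; a real store would issue one server-side COPY. */
  Future<Void> copyAsync(String src, String dst) {
    return submit(() -> {
      copy(src, dst);
      return null;
    });
  }

  Future<Void> rmAsync(String key) {
    return submit(() -> {
      objects.remove(key);
      return null;
    });
  }

  /** Rename as COPY+delete: not atomic, the source exists until the delete. */
  Future<Void> renameAsync(String src, String dst) {
    return submit(() -> {
      copy(src, dst);
      objects.remove(src);
      return null;
    });
  }

  private Future<Void> submit(Callable<Void> task) {
    return pool.submit(task);
  }

  private void copy(String src, String dst) throws IOException {
    byte[] data = objects.get(src);
    if (data == null) {
      throw new FileNotFoundException(src);
    }
    objects.put(dst, data);
  }

  @Override
  public void close() {
    pool.shutdown();
  }
}
```

In this sketch each {{Future}} completes only once the in-memory operation has finished; a real object store could complete it earlier and make weaker promises.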
The {{Future}} would return when the request had been submitted to the FS
(throwing any IOE if needed); we'd have no guarantees about visibility to
callers.
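As a rough illustration of the caller side, the sketch below shows one way an application could wait on such a {{Future}} and get the original IOE back rather than an {{ExecutionException}}. {{AsyncResults.awaitIO}} is a hypothetical helper, not an existing Hadoop API; it relies only on {{java.util.concurrent}}.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

/** Hypothetical helper; the name and shape are assumptions for illustration. */
final class AsyncResults {
  private AsyncResults() {
  }

  /** Waits for the future and rethrows any IOException raised by the operation. */
  static <T> T awaitIO(Future<T> future) throws IOException {
    try {
      return future.get();
    } catch (ExecutionException e) {
      Throwable cause = e.getCause();
      if (cause instanceof IOException) {
        throw (IOException) cause;   // surface the filesystem's own exception
      }
      throw new IOException(cause);  // wrap anything unexpected
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt status
      throw new InterruptedIOException("interrupted while waiting for operation");
    }
  }
}
```

A successful return here says only that the future completed; it does not imply the change is visible to other callers.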
Making these operations async would make it clear that they are potentially
slow and not immediately visible. It would let apps offload the work so that
their own sequence of actions stays fast (e.g. when responding to user
requests), with the slowness handled consistently elsewhere, and it would let
object stores implement their algorithms appropriately. It wouldn't be enough
on its own for a blobstore-aware output committer (due to the need for
visibility to all callers), but could be a start.
> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs, fs/s3, fs/swift
> Affects Versions: 2.6.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Labels: BB2015-05-TBR
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch,
> HADOOP-9565-003.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are
> really blobstores, with different atomicity and consistency guarantees, by
> adding a {{Blobstore}} interface to them.
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that
> all blobstores implement a server-side copy operation as a substitute for
> rename.