[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567097#comment-14567097
 ] 

Steve Loughran commented on HADOOP-9565:
----------------------------------------

I know that lots of client apps are using the old FileSystem API, but here we 
are adding a new feature with new methods, targeted at apps that want to 
differentiate real posix-ish filesystems from object stores. This is new code.

I've been thinking about the problem of non-atomic and slow operations (esp. 
rename and delete) recently.

Would it make sense to add asynchronous operations for these, in both object 
store and filecontext?  Something like

{code}
Future renameAsync(Path src, Path dst, boolean recurse)
Future rmAsync(Path path, boolean recurse)
Future copyAsync(Path src, Path dst, boolean recurse)
{code}

All filesystems could use a thread pool for these operations. On a posix-ish 
filesystem, rename() and rm() would be fast, while copy would be the slow one. 
On an object store, rename and rm would be slow; copy could be done remotely if 
the store supported a COPY operation, and rename could then be implemented as 
COPY+delete. Furthermore, as pointed out to me by [~ndimiduk], for AWS S3 you 
can do an async rm by setting the TTL of a path to 1s and letting the object 
store do all the heavy lifting.
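
To make the thread pool and COPY+delete idea concrete, here is a minimal sketch. It is not Hadoop code: the class name, the in-memory map standing in for an object store, the String key prefixes used in place of {{Path}}, and the use of {{CompletableFuture}} rather than a bare {{Future}} are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical in-memory stand-in for an object store: keys map to blobs,
// and there is no native rename, only copy and delete.
class AsyncBlobStoreSketch {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    // Daemon threads, so an idle pool never keeps the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    void put(String key, byte[] data) { blobs.put(key, data); }

    boolean exists(String key) { return blobs.containsKey(key); }

    // Copies every blob under the src prefix to the dst prefix, one key at a
    // time. Not atomic: other callers may observe a partial copy.
    CompletableFuture<Void> copyAsync(String src, String dst) {
        return CompletableFuture.runAsync(() -> {
            // Snapshot the keys so blobs written under dst are not revisited.
            for (String key : new ArrayList<>(blobs.keySet())) {
                byte[] data = blobs.get(key);
                if (key.startsWith(src) && data != null) {
                    blobs.put(dst + key.substring(src.length()), data);
                }
            }
        }, pool);
    }

    // Deletes every blob under the prefix.
    CompletableFuture<Void> rmAsync(String prefix) {
        return CompletableFuture.runAsync(
            () -> blobs.keySet().removeIf(k -> k.startsWith(prefix)), pool);
    }

    // Rename as COPY+delete: the delete only starts once the copy finished.
    CompletableFuture<Void> renameAsync(String src, String dst) {
        return copyAsync(src, dst).thenCompose(done -> rmAsync(src));
    }

    public static void main(String[] args) {
        AsyncBlobStoreSketch store = new AsyncBlobStoreSketch();
        store.put("tmp/job1/part-0", new byte[] {1});
        CompletableFuture<Void> renamed = store.renameAsync("tmp/job1/", "out/job1/");
        // The caller is free to do other work here before waiting.
        renamed.join();
        System.out.println(store.exists("out/job1/part-0"));
    }
}
```

Note that this sketch completes its futures when the work has finished, which is stronger than what a real object store could promise, and it does not guard against a destination that lies under the source prefix.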

The {{Future}} would complete once the request had been submitted to the FS 
(throwing any IOE if needed); we'd make no guarantees about when the result 
became visible to callers.
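
The "completes on submission, no visibility guarantee" semantics could look roughly like the sketch below. Everything in it is hypothetical: the class, the pending-delete queue, and the {{applyPendingDeletes()}} method, which stands in for whatever background processing the store does on its own (e.g. TTL expiry).

```java
import java.io.IOException;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical store whose rmAsync() only promises that the request was
// accepted, not that the deletion is visible yet.
class SubmitOnlyRmSketch {
    private final Set<String> keys = ConcurrentHashMap.newKeySet();
    private final Queue<String> pendingDeletes = new ConcurrentLinkedQueue<>();

    void put(String key) { keys.add(key); }

    boolean exists(String key) { return keys.contains(key); }

    // The future completes as soon as the request is queued. A request that
    // is rejected at submission time fails the future with an IOException.
    CompletableFuture<Void> rmAsync(String key) {
        CompletableFuture<Void> submitted = new CompletableFuture<>();
        if (key == null || key.isEmpty()) {
            submitted.completeExceptionally(new IOException("invalid path"));
            return submitted;
        }
        pendingDeletes.add(key);
        submitted.complete(null);
        return submitted;
    }

    // Stand-in for the store applying queued deletes at some later time.
    void applyPendingDeletes() {
        String key;
        while ((key = pendingDeletes.poll()) != null) {
            keys.remove(key);
        }
    }
}
```

A caller that sees the future complete can move on, but it must not assume the path is gone: {{exists()}} keeps returning true until the store has applied the queued delete.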

Making these operations async would make it clear that they are potentially 
slow and not immediately visible. It would let apps offload the work so that 
their own sequence of actions stays fast (e.g. when responding to user 
requests), with the slowness handled consistently elsewhere, and it would let 
object stores implement their algorithms appropriately. It wouldn't be enough 
on its own for a blobstore-aware output committer (due to the need for 
visibility to all callers), but could be a start.





> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
>                 Key: HADOOP-9565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9565
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, fs/s3, fs/swift
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
> all blobstores implement a server-side copy operation as a substitute for 
> rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
