[ 
https://issues.apache.org/jira/browse/HADOOP-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264495#comment-14264495
 ] 

Steve Loughran commented on HADOOP-11452:
-----------------------------------------

# We can't remove {{rename()}}. People would be surprised and upset. Therefore 
"constrain" is not something you can now mandate. Sorry.
# We can declare that it SHOULD be atomic, which is precisely what we do in 
[the FS specs|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/introduction.html].
# Some extensions to object stores (e.g. Netflix S3mper) do retrofit atomicity 
to rename operations, so can be used as the destination of speculative 
operations.
# HADOOP-9565 proposes moving filesystems that are really object stores to 
under a {{BlobStore}} subclass of {{FileSystem}} and to offer a way to get a 
bitmask of consistency and atomicity features. The design is intended to allow 
subclasses (e.g. S3mper) to override semantics, and alternate S3 and Swift 
service providers to offer stricter semantics.
# Code that wants to explicitly check for the required semantics could then 
look for this interface and, if present, get the semantics. Actually, if we 
really care, we may want to push it to the base FS class, as a barrier for 
apps, not as a way for them to work around things.
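
To make the last two points concrete, here is a minimal sketch of what such a 
semantics probe might look like. Every name in it ({{StoreFeature}}, 
{{StoreSemantics}}, {{PlainBlobStore}}, {{ConsistentBlobStore}}, 
{{SemanticsCheck}}) is hypothetical and not taken from HADOOP-9565, and it 
uses an {{EnumSet}} in place of a raw bitmask purely for readability.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical feature flags; the real proposal may define these differently. */
enum StoreFeature { ATOMIC_RENAME, ATOMIC_DELETE, CONSISTENT_LISTING }

/** Hypothetical interface a store implements to describe its own semantics. */
interface StoreSemantics {
  Set<StoreFeature> features();
}

/** Stand-in for a plain object store that makes no guarantees. */
class PlainBlobStore implements StoreSemantics {
  @Override
  public Set<StoreFeature> features() {
    return EnumSet.noneOf(StoreFeature.class);
  }
}

/** Stand-in for a subclass that overrides the semantics with stricter ones. */
class ConsistentBlobStore extends PlainBlobStore {
  @Override
  public Set<StoreFeature> features() {
    return EnumSet.of(StoreFeature.ATOMIC_RENAME, StoreFeature.CONSISTENT_LISTING);
  }
}

class SemanticsCheck {
  /** True only if the store exposes the interface and advertises the feature. */
  static boolean supports(Object store, StoreFeature feature) {
    return store instanceof StoreSemantics
        && ((StoreSemantics) store).features().contains(feature);
  }
}
```

An application that needs atomic rename would call 
{{SemanticsCheck.supports(fs, StoreFeature.ATOMIC_RENAME)}} and refuse to run 
if it returns false, which is the "barrier" use rather than a workaround.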

If you look at {{FileSystem#protected void rename(final Path src, final Path 
dst, final Rename... options)}} in detail, you can see that apart from 
HDFS/webhdfs we aren't implementing rename() atomically. Specifically:
# we look for the file existing
# raise an error if the condition (source is dir, dest is file) && 
overwrite==false
# then rename

There's a race condition between the stat and rename. Maybe now that we support 
Java7+ only we can think about using native IO operations which offer better 
atomicity.
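
As an illustration of that race on a local filesystem, the sketch below 
contrasts a check-then-rename sequence with a single {{java.nio.file.Files.move}} 
call using {{StandardCopyOption.ATOMIC_MOVE}}. The class and method names are 
hypothetical and this is not Hadoop code. Note that the JDK documents the 
behaviour of {{ATOMIC_MOVE}} when the target already exists as implementation 
specific (on POSIX systems it typically replaces the target), and it throws 
{{AtomicMoveNotSupportedException}} where the move cannot be done atomically.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not part of Hadoop. */
class LocalRenameSketch {

  /**
   * Check-then-act: another process can create dst after the exists() check
   * and before the move, in which case dst is silently replaced.
   */
  static void checkThenRename(Path src, Path dst, boolean overwrite)
      throws IOException {
    if (!overwrite && Files.exists(dst)) {
      throw new FileAlreadyExistsException(dst.toString());
    }
    // Race window: dst may appear here.
    Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
  }

  /**
   * Rename+overwrite handed to the filesystem as one operation.
   * Whether an existing dst is replaced is implementation specific.
   */
  static void atomicOverwrite(Path src, Path dst) throws IOException {
    Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
  }
}
```

This only covers the rename+overwrite case; it does not give an atomic 
"fail if the destination exists" rename.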

We can talk about how to expose this stuff, which is something you should raise 
on the HDFS list. Hadoop-common may be where the APIs live, but it's hadoop-dev 
that owns the semantics.

Finally, regarding {{FileSystemRMStateStore}}: if that were moved to 
{{FileContext}}, it would get the public APIs. YARN already depends on 
implementations of {{AbstractFileSystem}}.

> Revisit FileSystem#rename
> -------------------------
>
>                 Key: HADOOP-11452
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11452
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>
> Currently in {{FileSystem}}, {{rename}} with _Rename options_ is protected 
> and carries a _deprecated_ annotation, and the default implementation is not 
> atomic.
> So this method cannot be used from outside. On the other hand, HDFS has a 
> good, atomic implementation. (Interestingly, in {{DFSClient}} the 
> _deprecated_ annotations for these two methods are the other way round.)
> It makes sense to make {{rename}} with _Rename options_ public: it is 
> atomic for rename+overwrite, and it saves RPC calls when the user wants 
> rename+overwrite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
