[
https://issues.apache.org/jira/browse/HADOOP-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264495#comment-14264495
]
Steve Loughran commented on HADOOP-11452:
-----------------------------------------
# We can't remove {{rename()}}. People would be surprised and upset. Therefore
"constrain" is not something you can now mandate. Sorry.
# We can declare that it SHOULD be atomic, which is precisely what we do in
[the FS
specs|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/introduction.html]
# Some extensions to object stores (e.g. Netflix S3mper) do retrofit atomicity
to rename operations, so can be used as the destination of speculative
operations.
# HADOOP-9565 proposes moving filesystems that are really object stores to
under a {{BlobStore}} subclass of {{FileSystem}} and to offer a way to get a
bitmask of consistency and atomicity features. The design is intended to allow
subclasses (e.g. S3mper) to override semantics, and alternate S3 and Swift
service providers to offer stricter semantics.
# Code that wants to explicitly check for the required semantics could then
look for this interface and, if present, get the semantics. Actually, if we
really care, we may want to push it to the base FS class, as a barrier for
apps, not as a way for them to work around things.
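A rough sketch of what "look for this interface and, if present, get the semantics" could look like from the caller's side. Every name here ({{Semantics}}, {{SemanticsAware}}, {{Store}}, {{PlainBlobStore}}, {{WrappedBlobStore}}) is hypothetical and stands in for whatever HADOOP-9565 ends up defining; none of it is the proposed API, and an {{EnumSet}} is used in place of a raw bitmask purely for readability.

```java
import java.util.EnumSet;
import java.util.Set;

public class SemanticsCheckSketch {

    /** Hypothetical feature flags; not taken from HADOOP-9565. */
    public enum Semantics { ATOMIC_RENAME, CONSISTENT_LISTING }

    /** Hypothetical stand-in for a filesystem base class. */
    public static class Store { }

    /** Hypothetical interface a store implements to advertise its guarantees. */
    public interface SemanticsAware {
        Set<Semantics> semantics();
    }

    /** An object store that makes no atomicity or consistency promises. */
    public static class PlainBlobStore extends Store implements SemanticsAware {
        @Override
        public Set<Semantics> semantics() {
            return EnumSet.noneOf(Semantics.class);
        }
    }

    /** A subclass that retrofits atomic rename and overrides the declared semantics. */
    public static class WrappedBlobStore extends PlainBlobStore {
        @Override
        public Set<Semantics> semantics() {
            return EnumSet.of(Semantics.ATOMIC_RENAME);
        }
    }

    /** Callers refuse to proceed unless the store explicitly declares atomic rename. */
    public static boolean safeForCommitByRename(Store store) {
        return store instanceof SemanticsAware
            && ((SemanticsAware) store).semantics().contains(Semantics.ATOMIC_RENAME);
    }

    public static void main(String[] args) {
        System.out.println(safeForCommitByRename(new Store()));
        System.out.println(safeForCommitByRename(new PlainBlobStore()));
        System.out.println(safeForCommitByRename(new WrappedBlobStore()));
    }
}
```

A store that does not implement the interface is treated as offering no guarantees, which matches the "barrier for apps" reading: the check blocks unsafe use rather than helping code work around weak semantics.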
If you look at the protected {{FileSystem}} method {{rename(final Path src,
final Path dst, final Rename... options)}} in detail, you can see that apart
from HDFS/webhdfs we aren't implementing {{rename()}} atomically. Specifically:
# we check whether the file exists
# raise an error if the condition (source is dir, dest is file) &&
overwrite==false
# then rename
There's a race condition between the stat and rename. Maybe now that we support
Java7+ only we can think about using native IO operations which offer better
atomicity.
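To make the race concrete, here is a minimal sketch on the local filesystem using plain {{java.nio.file}} rather than any Hadoop class. The class and method names ({{AtomicRenameSketch}}, {{racyRename}}, {{atomicRename}}) are made up for illustration. {{racyRename}} mirrors the check-then-rename pattern described above; {{atomicRename}} delegates to {{Files.move}} with {{ATOMIC_MOVE}}, which throws {{AtomicMoveNotSupportedException}} when the move cannot be done atomically (for example across file stores). Note that with {{ATOMIC_MOVE}} the behaviour when the target already exists is implementation specific, so this is not a drop-in replacement for the overwrite check.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicRenameSketch {

    /**
     * Check-then-act: another process can create dst after the exists()
     * check and before the move, so the "no overwrite" promise can be broken.
     */
    public static void racyRename(Path src, Path dst, boolean overwrite) throws IOException {
        if (!overwrite && Files.exists(dst)) {
            throw new FileAlreadyExistsException(dst.toString());
        }
        // Race window: dst may appear here.
        Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    /**
     * Single filesystem operation: either the move happens atomically or
     * an exception is thrown and src is left in place.
     */
    public static void atomicRename(Path src, Path dst) throws IOException {
        Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-sketch");
        Path src = Files.write(dir.resolve("src.txt"), "payload".getBytes());
        Path dst = dir.resolve("dst.txt");

        atomicRename(src, dst);
        System.out.println("moved: " + (Files.exists(dst) && !Files.exists(src)));
    }
}
```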
We can talk about how to expose this stuff, which is something you should raise
on the HDFS list. Hadoop-common may be where the APIs live, but it's hadoop-dev
that owns the semantics.
Finally, regarding {{FileSystemRMStateStore}}: if that were moved to
{{FileContext}}, it would get the public APIs. YARN already depends on
implementations of {{AbstractFileSystem}}.
> Revisit FileSystem#rename
> -------------------------
>
> Key: HADOOP-11452
> URL: https://issues.apache.org/jira/browse/HADOOP-11452
> Project: Hadoop Common
> Issue Type: Task
> Components: fs
> Reporter: Yi Liu
> Assignee: Yi Liu
>
> Currently in {{FileSystem}}, {{rename}} with _Rename options_ is protected
> and carries the _deprecated_ annotation, and the default implementation is
> not atomic.
> So this method cannot be used from outside. On the other hand, HDFS has a
> good, atomic implementation. (Interestingly, in {{DFSClient}} the
> _deprecated_ annotations for these two methods are the other way round.)
> It makes sense to make {{rename}} with _Rename options_ public, since
> it is atomic for rename+overwrite, and it also saves RPC calls if the user
> wants rename+overwrite.