[
https://issues.apache.org/jira/browse/HADOOP-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299789#comment-14299789
]
Steve Loughran commented on HADOOP-11525:
-----------------------------------------
If you look at HADOOP-9565 you can see that there is an existing patch for
filesystems to declare that they are object stores and have significantly
different semantics, beyond just whether a failed write has side effects.
Specifically: consistency, whether rename and delete are atomic, whether the far
end has some copy operation that could be used, and whether flush() does
anything at all.
The real target for this is not so much the FS client as other bits of code
(like the committer of MR operations), which need to know whether a rename is
atomic before attempting speculative commits by rename.
That said, there's a risk that you end up with client code that's full of if()
statements to handle problems: a code and test mess. The alternative, though, is
to do what we do today: pretend everything looks like HDFS.
Note that the reason the HADOOP-9565 patch uses a bitmask is so that you can
combine those checks into one and look for the entire set of characteristics in
one go.
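
A minimal sketch of how such a combined bitmask check might look. The class,
constant and method names here are hypothetical, chosen only for illustration;
they are not taken from the HADOOP-9565 patch, and the example bit sets are
assumptions rather than statements about any real store.

```java
/** Hypothetical sketch: names are illustrative, not the HADOOP-9565 API. */
public final class StoreSemantics {
    // Each characteristic occupies one bit so that sets of them can be OR-ed together.
    public static final int CONSISTENT       = 1;
    public static final int ATOMIC_RENAME    = 1 << 1;
    public static final int ATOMIC_DELETE    = 1 << 2;
    public static final int SERVER_SIDE_COPY = 1 << 3;
    public static final int FLUSH_IS_DURABLE = 1 << 4;

    private StoreSemantics() {}

    /** True only if every bit in required is also set in declared. */
    public static boolean hasAll(int declared, int required) {
        return (declared & required) == required;
    }

    public static void main(String[] args) {
        // Assumed example declarations, for illustration only.
        int hdfsLike = CONSISTENT | ATOMIC_RENAME | ATOMIC_DELETE | FLUSH_IS_DURABLE;
        int objectStoreLike = SERVER_SIDE_COPY;

        // A committer could ask for everything it needs in a single check.
        int neededForCommitByRename = CONSISTENT | ATOMIC_RENAME;

        System.out.println(hasAll(hdfsLike, neededForCommitByRename));        // true
        System.out.println(hasAll(objectStoreLike, neededForCommitByRename)); // false
    }
}
```

The point of the mask is that the caller makes one test for the whole set of
characteristics instead of chaining several boolean probes.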
While it may look low-level, I think it's a better strategy for extensibility,
so -1 to the patch; put what is needed into HADOOP-9565 and then have
I would like to see the flag and extra tests incorporated into the blobstore
patch; get that patch into Hadoop ASAP. I'll do a reroll of that patch to get
it in sync with first.
We will also have to update the FS spec with a section on object stores and
their semantics.
> FileSystem should expose some performance characteristics for caller (e.g.,
> FsShell) to choose the right algorithm.
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-11525
> URL: https://issues.apache.org/jira/browse/HADOOP-11525
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools
> Affects Versions: 2.6.0
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Attachments: HADOOP-11525.000.patch
>
>
> When running {{hadoop fs -put}}, {{FsShell}} creates a {{._COPYING_.}} file
> in the target directory, and then renames it to the target file when the
> write is done. However, for some target systems, such as S3, Azure and Swift,
> a partially failed write request (i.e., {{PUT}}) has no side effect, while the
> {{rename}} operation is expensive.
> {{FileSystem}} should expose some characteristics so that operations such
> as {{CommandWithDestination#copyStreamToTarget()}} can detect this and choose
> the right approach.
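
The choice described in the issue text above could be sketched roughly as
follows. This is an illustration under stated assumptions, not the attached
patch: the Traits class, its two fields and the Strategy enum are hypothetical
names, and the real FileSystem API is not used.

```java
/** Hypothetical sketch of picking a copy strategy from declared traits. */
public final class PutStrategyChooser {

    /** The two ways a put could land the data on the target. */
    public enum Strategy { WRITE_TEMP_THEN_RENAME, WRITE_DIRECT }

    /** Assumed traits a filesystem might declare; not a real Hadoop type. */
    public static final class Traits {
        final boolean failedWriteHasNoSideEffect;
        final boolean renameIsExpensive;

        public Traits(boolean failedWriteHasNoSideEffect, boolean renameIsExpensive) {
            this.failedWriteHasNoSideEffect = failedWriteHasNoSideEffect;
            this.renameIsExpensive = renameIsExpensive;
        }
    }

    private PutStrategyChooser() {}

    /**
     * Write straight to the target only when a failed write leaves nothing
     * behind and rename is costly; otherwise keep the temp-file-then-rename
     * behaviour.
     */
    public static Strategy choose(Traits traits) {
        if (traits.failedWriteHasNoSideEffect && traits.renameIsExpensive) {
            return Strategy.WRITE_DIRECT;
        }
        return Strategy.WRITE_TEMP_THEN_RENAME;
    }

    public static void main(String[] args) {
        System.out.println(choose(new Traits(true, true)));   // WRITE_DIRECT
        System.out.println(choose(new Traits(false, false))); // WRITE_TEMP_THEN_RENAME
    }
}
```

Defaulting to the temp-file path keeps the existing behaviour for any
filesystem that declares nothing.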
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)