[
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876048#comment-17876048
]
Steve Loughran commented on HADOOP-19256:
-----------------------------------------
Full docs
https://docs.aws.amazon.com/AmazonS3/latest/userguide/conditional-requests.html
[~vjasani]
bq. we need S3A configs for each of the getObject, headObject and copyObject
headers
we already use conditional headers in read operations: the version ID or etag
of a file ensures that every GET in an input stream either picks up the same
file version (versioned option) or at least passes etag validation (default).
See the docs on change detection.
what is new is that we can have writes fail if there's something at the far end.
bq. FileSystem APIs do not have "Map<String, Object>" type input for file
operation metadata, otherwise S3A could leverage it.
we have *exactly* this for openFile() and createFile(). And
MultipartUploaderBuilder. We don't have it for rename.
I think that now it is GA, and so available across "real" S3, we can add a
switch to use it as our way of enforcing no-overwrite on create and for atomic
file rename. For multipart uploads, I'm not sure, though I don't know if anyone
uses that feature; we never cherry-picked it into Cloudera CDH and nobody has
written the distcp replacement which could take maximum advantage of it.
I was thinking about stores with this feature in HADOOP-19251
* add an option to use If-None-Match on create (overwrite=false) and copy in
rename operations, etc
* create: fail in close(). Document this. Now, if overwrite=false, outside a
_magic path (where we turn off the safety checks), we could just enable it and
cut a HEAD request, provided all callers are happy with the belated failure. It
is, after all, more correct than the non-atomic HEAD+PUT sequence.
* make sure that 409 and 412 are handled appropriately. we do catch and map
them, but need to verify they fail meaningfully. As usual, rename is the PITA
here, we'd have to map to "false".
* path capabilities probes to declare when the client is configured to use
If-None-Match on create, and that file rename is atomic. We add a similar flag
for dir rename, and report file/dir renames as atomic in localfs (big
assumption about mount points there...), hdfs, azure with HNS etc.
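
A minimal sketch of the "fail in close()" behaviour from the create bullet above, assuming an in-memory map stands in for the bucket. The class CreateNoOverwriteStream is hypothetical and is not S3A code; the map's putIfAbsent plays the role of a PUT with If-None-Match: *, so no existence probe is issued when the stream is created and the loser of a race only finds out on close().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stream: buffers data, then does an atomic put-if-absent in close(). */
class CreateNoOverwriteStream extends ByteArrayOutputStream {
  private final ConcurrentMap<String, byte[]> store; // stands in for the bucket
  private final String key;
  private boolean closed;

  CreateNoOverwriteStream(ConcurrentMap<String, byte[]> store, String key) {
    // No existence probe (the HEAD) here: the check is deferred to close().
    this.store = store;
    this.key = key;
  }

  @Override
  public void close() throws IOException {
    if (closed) {
      return; // repeated close() calls are no-ops
    }
    closed = true;
    // Atomic check-and-write, the analogue of PUT with If-None-Match: *
    if (store.putIfAbsent(key, toByteArray()) != null) {
      throw new FileAlreadyExistsException(key);
    }
  }
}
```

With two such streams opened on the same key, both accept writes; the first close() wins and the second throws, which is the belated failure that callers would have to tolerate.
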
Then make sure that commit by renaming a single file *and requiring fast fail
on existence* is correct.
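
One way the 409/412 handling could look, as a sketch only. The helper ConditionalWriteErrors and its methods are hypothetical, java.nio.file.FileAlreadyExistsException is used as a stand-in for Hadoop's own exception class, and treating 409 as a plain IOException is an assumption rather than a statement of what S3A does.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;

/** Hypothetical helper; illustrates one possible mapping, not S3A's actual code. */
final class ConditionalWriteErrors {
  private ConditionalWriteErrors() {}

  /** Translate the status of a conditional write into an exception, if any. */
  static void checkCreate(String path, int status) throws IOException {
    switch (status) {
      case 200:
        return;
      case 412: // precondition failed: something already exists at the path
        throw new FileAlreadyExistsException(path);
      case 409: // conflicting concurrent operation (handling here is assumed)
        throw new IOException("Conflict writing " + path);
      default:
        throw new IOException("Unexpected status " + status + " writing " + path);
    }
  }

  /** rename() reports "destination exists" by returning false, not by throwing. */
  static boolean renameOutcome(String dest, int copyStatus) throws IOException {
    try {
      checkCreate(dest, copyStatus);
      return true;
    } catch (FileAlreadyExistsException e) {
      return false;
    }
  }
}
```

A committer that renames a single file and needs fast failure on existence would then treat a false return from the rename path as "somebody else committed first".
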
> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available -
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>
> S3A should allow passing in this put-if-absent header to prevent overwriting
> of files.