[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876048#comment-17876048
 ] 

Steve Loughran commented on HADOOP-19256:
-----------------------------------------

Full docs

https://docs.aws.amazon.com/AmazonS3/latest/userguide/conditional-requests.html

[~vjasani]
bq. we need S3A configs for each of the getObject, headObject and copyObject 
headers

we already use conditional headers in read operations: the version ID or etag 
of a file ensures that every GET in an input stream either always picks up the 
same file version (versioned option) or is validated against the etag 
(default). See the docs on change detection.

what is new is that we can have writes fail if there's something at the far end.
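
To make the "writes fail if there's something at the far end" behaviour concrete, here is a minimal in-memory sketch. It is not the AWS SDK or S3A code: the class ConditionalPutSketch and its methods are hypothetical names, and a ConcurrentHashMap stands in for the bucket. It only models the put-if-absent contract, where a write carrying "If-None-Match: *" returns 412 when an object already exists.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory stand-in for an object store; not the AWS SDK or S3A. */
class ConditionalPutSketch {
  static final int OK = 200;
  static final int PRECONDITION_FAILED = 412;

  // Assumption: a concurrent map models the bucket, key -> object bytes.
  private final ConcurrentMap<String, byte[]> objects = new ConcurrentHashMap<>();

  /** Unconditional PUT: the last writer wins. */
  int put(String key, byte[] data) {
    objects.put(key, data);
    return OK;
  }

  /** Models PUT with "If-None-Match: *": succeeds only if nothing exists at the key. */
  int putIfNoneMatch(String key, byte[] data) {
    // putIfAbsent is atomic, so the existence check and the write cannot interleave.
    return objects.putIfAbsent(key, data) == null ? OK : PRECONDITION_FAILED;
  }

  /** Returns the stored object as a UTF-8 string, or null if the key is absent. */
  String read(String key) {
    byte[] data = objects.get(key);
    return data == null ? null : new String(data, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    ConditionalPutSketch store = new ConditionalPutSketch();
    byte[] first = "first".getBytes(StandardCharsets.UTF_8);
    byte[] second = "second".getBytes(StandardCharsets.UTF_8);
    System.out.println(store.putIfNoneMatch("job/part-0000", first));
    System.out.println(store.putIfNoneMatch("job/part-0000", second));
    System.out.println(store.read("job/part-0000"));
  }
}
```

The second conditional write is rejected and the first object is left untouched, which is the property a no-overwrite create needs.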
 
bq. FileSystem APIs do not have "Map<String, Object>" type input for file 
operation metadata, otherwise S3A could leverage it.

we have *exactly* this for openFile() and createFile(). And 
MultipartUploaderBuilder. We don't have it for rename.

I think that now it is GA, and so available across "real" S3, we can add a 
switch to use it as our way of enforcing no-overwrite on create and for atomic 
file rename. For multipart, I'm not sure, though I don't know if anyone uses 
that feature; we never cherrypicked it into Cloudera CDH and nobody has written 
the distcp replacement which could take maximum advantage of it.


I was thinking about stores with this feature in HADOOP-19251


* add an option to use If-None-Match on create (overwrite=false), on the copy 
in rename operations, etc.
* create: fail in close(), and document this. If overwrite=false, then outside 
a _magic path (where we turn off the safety checks) we could just enable it and 
cut the HEAD request, provided everything is happy with the belated failure. It 
is, after all, more correct than the non-atomic HEAD+PUT sequence.
* make sure that 409 and 412 are handled appropriately. We do catch and map 
them, but need to verify they fail meaningfully. As usual, rename is the PITA 
here: we'd have to map the failure to a "false" return value.
* path capabilities probes to declare that the client is configured to use 
if-none-match on create, and that file rename is atomic. We add a similar flag 
for dir rename, and report file/dir renames as atomic in localfs (big 
assumption about mount points there...), hdfs, azure with HNS etc.


Then make sure that commit by renaming a single file *and requiring fast fail 
on existence* is correct. 


> Support S3 Conditional Writes
> -----------------------------
>
>                 Key: HADOOP-19256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19256
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Ahmar Suhail
>            Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



