[ 
https://issues.apache.org/jira/browse/ARROW-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488834#comment-17488834
 ] 

Carlos O'Ryan commented on ARROW-12509:
---------------------------------------

A 0-byte flush won't work.  GCS only accepts PUTs whose size is a multiple of 
256KiB.  Anything less than 256KiB finalizes the upload.  The client library 
refuses to flush data if it is smaller than the 256KiB quantum.  Even if you 
had a full quantum to flush, the testbench only checks the preconditions on 
the final PUT.  I am not sure how the service does it, but I would expect it 
to do the same: fetching the metadata is expensive, and you will need to fetch 
the metadata in the final PUT anyway (the object state may have changed).
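
To make the quantum arithmetic concrete, here is a minimal sketch of how a 
writer might decide what it can send before the final PUT.  The helper names 
({{FlushableBytes}}, {{RemainingBytes}}) are hypothetical and are not part of 
the GCS client library or Arrow; only the 256KiB quantum comes from the 
comment above.

```cpp
#include <cassert>
#include <cstddef>

// Upload quantum described above: 256 KiB.
constexpr std::size_t kUploadQuantum = 256 * 1024;

// Hypothetical helper: the number of buffered bytes that can be sent in a
// non-final PUT, i.e. the largest multiple of the quantum that fits.
inline std::size_t FlushableBytes(std::size_t buffered) {
  return (buffered / kUploadQuantum) * kUploadQuantum;
}

// Hypothetical helper: the bytes that must stay buffered until more data
// arrives or the upload is finalized.
inline std::size_t RemainingBytes(std::size_t buffered) {
  const std::size_t flushable = FlushableBytes(buffered);
  assert(flushable <= buffered);
  return buffered - flushable;
}
```

With an empty buffer {{FlushableBytes(0)}} is 0, so under this model a 0-byte 
flush has nothing to send and no request on which a precondition could be 
checked.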

 

I think you are running into another difference between POSIX filesystems and 
object storage.  There is no equivalent to {{O_CREAT | O_EXCL}} because the 
object is not created when you start the upload; it is created at the end.  
The object is not readable, or even visible, while it is being uploaded.
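
The difference can be illustrated with a toy in-memory model.  This is only a 
sketch under the assumptions stated in this comment (the object appears at 
finalization, and the "only if absent" precondition is evaluated on the final 
request).  {{ToyObjectStore}} and {{ToyUpload}} are hypothetical names, not 
the GCS API.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical in-progress upload: data accumulates here and is invisible
// to readers until the upload is finalized.
struct ToyUpload {
  std::string name;
  bool only_if_absent;  // models a "create only if it does not exist" request
  std::string buffer;
};

// Toy model of an object store, not a real client.
class ToyObjectStore {
 public:
  // Starting an upload creates nothing and checks nothing.
  ToyUpload StartUpload(std::string name, bool only_if_absent) {
    return ToyUpload{std::move(name), only_if_absent, std::string()};
  }

  // The precondition is evaluated here, at the end of the upload.
  // Returns false if the precondition fails; the store is left unchanged.
  bool Finalize(ToyUpload& upload) {
    if (upload.only_if_absent && objects_.count(upload.name) > 0) {
      return false;
    }
    objects_[upload.name] = std::move(upload.buffer);
    return true;
  }

  bool Exists(const std::string& name) const {
    return objects_.count(name) > 0;
  }

  // Returns nullptr if the object does not exist.
  const std::string* Get(const std::string& name) const {
    auto it = objects_.find(name);
    return it == objects_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, std::string> objects_;
};
```

In this model two writers can both start an "only if absent" upload of the 
same name.  Neither sees an error until finalization, where the first one to 
finish wins and the second one fails, unlike {{O_CREAT | O_EXCL}}, which fails 
at open time.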

> [C++] More fine-grained control of file creation in filesystem layer
> --------------------------------------------------------------------
>
>                 Key: ARROW-12509
>                 URL: https://issues.apache.org/jira/browse/ARROW-12509
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Antoine Pitrou
>            Priority: Major
>
> {{FileSystem::OpenOutputStream}} silently truncates an existing file.
> It would be better to give more control to the user. Ideally, one could 
> choose between several options: "always overwrite and fail if doesn't exist", 
> "overwrite if exists, otherwise create", "creates if doesn't exist, otherwise 
> fails".
> One should research whether e.g. S3 supports such control.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
