Barre opened a new pull request, #643:
URL: https://github.com/apache/arrow-rs-object-store/pull/643

   Call sync_all() on written files and fsync parent directories at all 
write-path boundaries (put, copy, rename, multipart complete) so that a 
successful return guarantees data is durable on disk, matching the implicit 
contract of cloud object stores.
   
   
   # Rationale for this change
    
   When LocalFileSystem::put (or copy/rename/multipart complete) returns Ok, 
callers reasonably expect the data to be durable on disk, since that is the 
implicit contract of cloud object stores such as S3 or GCS.
   
   However, LocalFileSystem never called fsync/sync_all, meaning the OS was 
free to keep the data in its page cache indefinitely. A crash or power loss 
after a successful put could result in data loss or zero-length files.
   
   This change adds sync_all() calls on written files and fsync on parent 
directories at every write-path boundary (put_opts, copy_opts, rename_opts, 
multipart complete), ensuring that when an operation returns success, both the 
file contents and the directory entry pointing to them are durable on stable 
storage.
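
To illustrate the pattern described above, here is a minimal sketch using only the Rust standard library. It is not the code from this pull request: `durable_put`, `fsync_dir`, and the `.tmp` staging name are hypothetical and exist only for this example. It assumes a Unix-like platform, where a directory can be opened as a `File` and synced; on other platforms the directory sync is skipped.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: flush a directory's entries to stable storage.
/// On Unix a directory can be opened read-only and fsync'd like a file.
fn fsync_dir(dir: &Path) -> io::Result<()> {
    #[cfg(unix)]
    {
        File::open(dir)?.sync_all()?;
    }
    #[cfg(not(unix))]
    {
        // Assumption: no portable directory sync here, so this is a no-op.
        let _ = dir;
    }
    Ok(())
}

/// Hypothetical helper: write `data` to `dest` so that `Ok(())` implies
/// both the file contents and its directory entry have been synced.
fn durable_put(dest: &Path, data: &[u8]) -> io::Result<()> {
    // Fall back to the current directory when `dest` has no parent component.
    let parent = dest
        .parent()
        .filter(|p| !p.as_os_str().is_empty())
        .unwrap_or(Path::new("."));

    // Stage the data in a sibling file (illustrative naming only).
    let tmp = dest.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(data)?;

    // 1. Sync file data and metadata before the file becomes visible.
    file.sync_all()?;
    drop(file);

    // 2. Move the staged file into place.
    fs::rename(&tmp, dest)?;

    // 3. Sync the parent directory so the new entry itself is persisted.
    fsync_dir(parent)
}
```

Syncing the file before the rename is what avoids the zero-length-file case: the directory entry never points at contents that are still only in the page cache.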
   
   # Are there any user-facing changes?
   
   No.

