tustvold commented on PR #643: URL: https://github.com/apache/arrow-rs-object-store/pull/643#issuecomment-3908784383
I think calling fsync on the files makes a lot of sense, and ensures that we maintain atomicity (or at least do our best to). Calling fsync on directories also makes sense, but I think this PR doesn't go far enough: it needs to ensure that any recursively created directories are also fsynced...

I think it makes sense for this to be enabled by default, but it should be possible to turn this behaviour off, perhaps with a separate option that just fsyncs files, ensuring atomicity.

There is also the question of what happens if a process is writing lots of files to the same directory: fsyncing every new file is rather wasteful, and you really want some mechanism for the caller to fsync as part of some higher-level transaction. I don't know how we would expose this though...

Perhaps it doesn't matter - LocalFileSystem inherently trades performance for being able to interoperate with a filesystem; if someone wants the optimal disk performance they're probably better off using io_uring and something custom anyway.
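To make the directory point concrete, here is a rough sketch of the behaviour described above, using only `std`. It is not the crate's implementation: the names `sync_dir`, `create_dir_all_synced`, `write_durable` and the `sync_dirs` flag are hypothetical, the `.tmp` staging name is made up for illustration, and it assumes a Unix-like platform where a directory can be opened read-only and fsynced.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Parent of `path`, falling back to "." for a bare relative name.
fn parent_or_cwd(path: &Path) -> &Path {
    match path.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    }
}

/// fsync a directory so that entries added to it are persisted.
/// Assumption: Unix-like OS, where opening a directory read-only works.
fn sync_dir(dir: &Path) -> io::Result<()> {
    File::open(dir)?.sync_all()
}

/// Create `dir` and any missing ancestors, then fsync every directory that
/// was newly created plus the existing parent that gained the first entry.
/// Returns the newly created directories, deepest first.
fn create_dir_all_synced(dir: &Path) -> io::Result<Vec<PathBuf>> {
    // Record which ancestors are missing before creating anything.
    let mut missing = Vec::new();
    let mut cur = dir;
    while !cur.exists() {
        missing.push(cur.to_path_buf());
        match cur.parent() {
            Some(p) if !p.as_os_str().is_empty() => cur = p,
            _ => break,
        }
    }
    fs::create_dir_all(dir)?;
    // The pre-existing parent now holds a new entry for the shallowest directory.
    if let Some(top) = missing.last() {
        sync_dir(parent_or_cwd(top))?;
    }
    // Sync each new directory, shallowest first.
    for created in missing.iter().rev() {
        sync_dir(created)?;
    }
    Ok(missing)
}

/// Write `data` to `path` via a staged file and rename.
/// The file itself is always fsynced (atomicity); directory fsyncs are
/// controlled by the hypothetical `sync_dirs` flag.
fn write_durable(path: &Path, data: &[u8], sync_dirs: bool) -> io::Result<()> {
    let parent = parent_or_cwd(path);
    if sync_dirs {
        create_dir_all_synced(parent)?;
    } else {
        fs::create_dir_all(parent)?;
    }
    let staged = path.with_extension("tmp"); // illustrative staging name
    let mut file = File::create(&staged)?;
    file.write_all(data)?;
    file.sync_all()?; // flush file contents before publishing
    drop(file);
    fs::rename(&staged, path)?;
    if sync_dirs {
        sync_dir(parent)?; // persist the rename in the parent directory
    }
    Ok(())
}
```

With `sync_dirs` set to `false` this degrades to the "just fsync files" mode. It does nothing for the batching concern: a caller writing many files into one directory would still pay one directory fsync per file unless that final `sync_dir` call is hoisted out to the caller.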
