nealrichardson commented on issue #11781: URL: https://github.com/apache/arrow/issues/11781#issuecomment-981605639
The arrow library is not a database, so it doesn't have transactions. If a function is interrupted in the middle of writing to disk, whatever it has already written stays on disk, complete or partial. If you want the write to be atomic, you could `write_dataset()` to a `tempfile()` directory and then move that temp dir to your desired location once it has finished writing everything. If you want multiple processes to write to the same directory concurrently, you can give each `write_dataset()` process a unique `basename_template` so that their output files won't collide.
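A minimal sketch of both suggestions in R, under some assumptions. `write_dataset_staged()` and `write_part()` are hypothetical helpers, not part of arrow. The staging directory is created next to the destination instead of in the session temp dir, on the assumption that a rename is only likely to be atomic when both paths are on the same filesystem. The process ID is used as the unique part of the file name, which assumes all writers run on the same machine.

```r
library(arrow)

# Hypothetical helper (not part of arrow): write into a staging directory,
# then rename it to the final location once the write has finished.
write_dataset_staged <- function(data, final_dir, ...) {
  if (file.exists(final_dir)) {
    stop("Destination already exists: ", final_dir)
  }
  # Assumption: staging beside the destination keeps the rename on one filesystem.
  staging_dir <- tempfile(pattern = ".staging-", tmpdir = dirname(final_dir))
  # If the write fails or is interrupted, clean up the partial output.
  # After a successful rename this is a no-op because the path no longer exists.
  on.exit(unlink(staging_dir, recursive = TRUE), add = TRUE)

  write_dataset(data, staging_dir, ...)

  if (!file.rename(staging_dir, final_dir)) {
    stop("Could not move ", staging_dir, " to ", final_dir)
  }
  invisible(final_dir)
}

# Hypothetical helper: each writer gets its own file-name prefix.
# basename_template must contain "{i}", which arrow replaces with a counter.
write_part <- function(data, shared_dir, writer_id = Sys.getpid()) {
  write_dataset(
    data, shared_dir,
    basename_template = paste0("part-", writer_id, "-{i}.parquet")
  )
}

# Usage: one staged write, then two writers sharing a directory
# (run sequentially here, with explicit IDs standing in for separate processes).
out_root <- tempfile("arrow-demo-")
dir.create(out_root)

final_dir <- file.path(out_root, "cars")
write_dataset_staged(mtcars, final_dir, partitioning = "cyl")

shared_dir <- file.path(out_root, "shared")
write_part(mtcars[1:16, ], shared_dir, writer_id = "a")
write_part(mtcars[17:32, ], shared_dir, writer_id = "b")
```

Readers that open `final_dir` either find nothing or find the complete dataset, provided the rename itself is atomic on the platform in use. This does not turn the write into a transaction: if the process is killed outright, the staging directory may be left behind and has to be removed separately.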
