devinjdangelo commented on PR #7992: URL: https://github.com/apache/arrow-datafusion/pull/7992#issuecomment-1786170672
`AsyncPutWriter` is not currently used in any execution plan (see, for example, [csv.rs](https://github.com/apache/arrow-datafusion/blob/0d4dc3601b390c50640e841e70c522cd733e02f4/datafusion/core/src/datasource/file_format/csv.rs#L575)). The two other modes (append and put multipart) are used, but neither depends on `AsyncPutWriter`. The only reason I see to use put over multipart is as a performance optimization when writing many small files: it avoids the additional remote requests needed to set up and complete the multipart upload. Writing many small files is generally a bad idea with object stores in any case, which is why I didn't invest any time in this. Still, it would be good to either ensure this method works as intended or remove it if we don't see a use case for it.
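To make the request overhead of multipart versus put concrete, here is a rough sketch that only counts remote requests per object. It is a simplified model, not DataFusion or `object_store` code: the function names and the `part_size` parameter are hypothetical, and it assumes a multipart upload costs one request to initiate, one per part, and one to complete.

```rust
/// Remote requests needed to upload one object with a single put.
fn put_requests() -> u64 {
    1
}

/// Remote requests for a multipart upload under the simplified model:
/// one to initiate, one per part, and one to complete.
/// `part_size` is a hypothetical parameter; real stores impose their own limits.
fn multipart_requests(object_size: u64, part_size: u64) -> u64 {
    assert!(part_size > 0, "part_size must be non-zero");
    // Ceiling division; even an empty object is modeled as uploading one part.
    let parts = ((object_size + part_size - 1) / part_size).max(1);
    1 + parts + 1
}

/// Extra requests incurred by using multipart instead of put for `file_count`
/// files that each fit in a single part.
fn extra_requests_for_small_files(file_count: u64, part_size: u64) -> u64 {
    file_count * (multipart_requests(1, part_size) - put_requests())
}
```

Under this model, a file that fits in one part takes three requests with multipart and one with put, so writing 1,000 small files costs about 2,000 extra requests. For large files the fixed two-request overhead becomes negligible relative to the per-part uploads.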
