wiedld opened a new issue, #13679: URL: https://github.com/apache/datafusion/issues/13679
### Is your feature request related to a problem or challenge?

ParquetSink (used for `COPY TO`) encodes bytes to parquet and writes them to the sink (e.g. an object store). It currently does not include retry logic for failed multipart PUTs to the object store. We feel this is a gap, since the upload path is exposed to transient network issues.

### Describe the solution you'd like

The ability to automatically retry failed PUTs that write through the ParquetSink. This could be a configurable option.

### Describe alternatives you've considered

N/A

### Additional context

_How easy is it to add PUT retry logic to ParquetSink?_

* If we want retry logic on the store-upload part only, that [occurs here in the ParquetSink](https://github.com/apache/datafusion/blob/fc703238b1d7794bd132a7fb6b97cad9ba4c7446/datafusion/core/src/datasource/file_format/parquet.rs#L1160-L1162). I believe the error returned includes the failed upload to the object store.
* Specifically, `write_all` will use [BufWriter::poll_write](https://github.com/apache/arrow-rs/blob/63ad87a8d79ecc14247297ddf0ff7707d4da284c/object_store/src/buffered.rs#L359-L404), which passes through the [object store multipart_put errors](https://github.com/apache/arrow-rs/blob/63ad87a8d79ecc14247297ddf0ff7707d4da284c/object_store/src/buffered.rs#L390). A rough retry sketch follows below.
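To make the shape of the proposal concrete, here is a minimal sketch of a generic retry wrapper that could sit around the upload step. This is not existing DataFusion or `object_store` API: the function name `retry_put`, the `max_retries` parameter, and the backoff constants are all illustrative assumptions. A real implementation would also have to account for the statefulness of an in-progress multipart upload (it is likely not enough to simply re-drive the same `BufWriter` after a mid-stream failure).

```rust
use std::future::Future;
use std::time::Duration;

/// Illustrative only: retry an async, fallible operation up to `max_retries`
/// additional times with exponential backoff. `op` builds a fresh future for
/// each attempt, so any per-attempt state must be re-created inside it.
async fn retry_put<F, Fut, T, E>(max_retries: usize, mut op: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    let mut attempt = 0;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(_) if attempt < max_retries => {
                attempt += 1;
                // Exponential backoff: 200ms, 400ms, 800ms, ...
                let delay = Duration::from_millis(100 * (1u64 << attempt));
                tokio::time::sleep(delay).await;
            }
            Err(err) => return Err(err),
        }
    }
}
```

As a possible alternative or complement, the `object_store` crate's HTTP-backed stores expose a `RetryConfig` on their builders that retries at the request level; whether that already covers the multipart upload path used by ParquetSink would need confirmation.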
