james-rms commented on code in PR #561:
URL:
https://github.com/apache/arrow-rs-object-store/pull/561#discussion_r2755822528
##########
src/aws/builder.rs:
##########
@@ -189,6 +199,11 @@ pub struct AmazonS3Builder {
request_payer: ConfigValue<bool>,
/// The [`HttpConnector`] to use
http_connector: Option<Arc<dyn HttpConnector>>,
+ /// Threshold (bytes) above which copy uses multipart copy. If not set, all copies are performed
+ /// as single requests.
+ multipart_copy_threshold: Option<ConfigValue<u64>>,
+ /// Preferred multipart copy part size (bytes). If not set, defaults to 5 GiB.
Review Comment:
It's not explicitly documented ([though there is a note around range
requests](https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance-guidelines.html#optimizing-performance-guidelines-get-range)).
There's every chance it's faster to line up the part boundaries exactly, but I
haven't tested this directly.
I hadn't thought about this, and it bothers me. It doesn't feel right to
silently introduce part boundaries into a copied object when the user has no
visibility into them. This also suggests the right API is one where the user
creates the parts directly and keeps track of the boundaries.
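
To make the boundary concern concrete, here is a rough sketch of how part ranges fall out of a fixed part size. `copy_part_ranges` and `copy_source_range` are hypothetical helpers written for illustration, not functions in this PR or in `object_store`; the only S3 detail assumed is that `UploadPartCopy` takes an inclusive `bytes=first-last` value in `x-amz-copy-source-range`.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of object_store): split an object of `size`
/// bytes into consecutive half-open ranges of at most `part_size` bytes each.
/// The last range is shorter when `size` is not a multiple of `part_size`.
fn copy_part_ranges(size: u64, part_size: u64) -> Vec<Range<u64>> {
    assert!(part_size > 0, "part size must be non-zero");
    let mut ranges = Vec::new();
    let mut start: u64 = 0;
    while start < size {
        // Clamp the end of the final part to the object size.
        let end = size.min(start.saturating_add(part_size));
        ranges.push(start..end);
        start = end;
    }
    ranges
}

/// Hypothetical helper: format a non-empty half-open range as the inclusive
/// `bytes=first-last` form used by `x-amz-copy-source-range`.
fn copy_source_range(range: &Range<u64>) -> String {
    format!("bytes={}-{}", range.start, range.end - 1)
}
```

With these helpers, `copy_part_ranges(10, 4)` gives `0..4`, `4..8`, `8..10`. The boundaries depend only on the configured part size, so they end up in the destination object regardless of how the source object was originally uploaded, which is the part the user cannot see.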