james-rms commented on PR #561:
URL: https://github.com/apache/arrow-rs-object-store/pull/561#issuecomment-3752113145

   > I think we need to find a way to avoid regressing the common case of files smaller than 5GB, e.g. by first attempting CopyObject and then falling back if it errors (I am presuming S3 gives a sensible error here).
   
   This is what we get back from S3:
   
   ```
   <Error>
     <Code>InvalidRequest</Code>
     <Message>The specified copy source is larger than the maximum allowable size for a copy source: 5368709120</Message>
     <RequestId>...</RequestId>
     <HostId>...</HostId>
   </Error>
   ```
   
   As explained by [AWS](https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html):
   
   ```
   This error might occur for the following reasons:

   An unpaginated ListBuckets request is made from an account that has an approved general purpose bucket quota higher than 10,000. You must make paginated requests to list the buckets in an account with more than 10,000 buckets.

   The request is using the wrong signature version. Use AWS4-HMAC-SHA256 (Signature Version 4).

   An access point can be created only for an existing bucket.

   The access point is not in a state where it can be deleted.

   An access point can be listed only for an existing bucket.

   The next token is not valid.

   At least one action must be specified in a lifecycle rule.

   At least one lifecycle rule must be specified.

   The number of lifecycle rules must not exceed the allowed limit of 1000 rules.

   The range for the MaxResults parameter is not valid.

   SOAP requests must be made over an HTTPS connection.

   Amazon S3 Transfer Acceleration is not supported for buckets with non-DNS compliant names.

   Amazon S3 Transfer Acceleration is not supported for buckets with periods (.) in their names.

   The Amazon S3 Transfer Acceleration endpoint supports only virtual style requests.

   Amazon S3 Transfer Acceleration is not configured on this bucket.

   Amazon S3 Transfer Acceleration is disabled on this bucket.

   Amazon S3 Transfer Acceleration is not supported on this bucket. For assistance, contact [Support](https://aws.amazon.com/contact-us/).

   Amazon S3 Transfer Acceleration cannot be enabled on this bucket. For assistance, contact [Support](https://aws.amazon.com/contact-us/).

   Conflicting values provided in HTTP headers and query parameters.

   Conflicting values provided in HTTP headers and POST form fields.

   CopyObject request made on objects larger than 5GB in size.
   ```
   
   So the real question is: if a CopyObject call fails with InvalidRequest,
   can we assume it failed because the source object was larger than 5GB?
   Given the documentation I think that's OK, but since AWS lists so many
   other causes under the same error code, I'm not confident it will remain
   OK going forward. I'll push a commit that does this and let you decide
   on the approach.
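
   For concreteness, here is a minimal sketch of one way the fallback could
   look. The helper names (`copy_object_single`, `copy_object_multipart`)
   and the error type are hypothetical stand-ins, not the crate's actual
   internals; the real code would plumb the S3 `<Code>` and `<Message>`
   fields through object_store's own error types. Matching on the message
   text as well as the code avoids falling back on the other InvalidRequest
   causes listed above, at the cost of depending on wording AWS doesn't
   guarantee to keep stable:

   ```rust
   /// Illustrative error carrying the S3 XML `<Code>` and `<Message>` fields.
   pub struct S3Error {
       pub code: String,
       pub message: String,
   }

   /// True only for the ">5GB copy source" flavour of InvalidRequest, not
   /// the many other causes AWS documents under the same code.
   fn is_copy_source_too_large(err: &S3Error) -> bool {
       err.code == "InvalidRequest"
           && err
               .message
               .contains("larger than the maximum allowable size for a copy source")
   }

   /// Hypothetical single-request copy (plain CopyObject).
   async fn copy_object_single(_from: &str, _to: &str) -> Result<(), S3Error> {
       unimplemented!("existing CopyObject path")
   }

   /// Hypothetical multipart copy (CreateMultipartUpload + UploadPartCopy).
   async fn copy_object_multipart(_from: &str, _to: &str) -> Result<(), S3Error> {
       unimplemented!("multipart path added by this PR")
   }

   /// Try the cheap CopyObject first; fall back to the multipart copy only
   /// when S3 rejects it for exceeding the 5GB copy-source limit.
   async fn copy_with_fallback(from: &str, to: &str) -> Result<(), S3Error> {
       match copy_object_single(from, to).await {
           Err(e) if is_copy_source_too_large(&e) => {
               copy_object_multipart(from, to).await
           }
           other => other,
       }
   }
   ```

   The looser variant would match on the InvalidRequest code alone: simpler,
   but it would also trigger the fallback on the unrelated causes above.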

