james-rms commented on issue #563: URL: https://github.com/apache/arrow-rs-object-store/issues/563#issuecomment-3623947669
Continuing discussion from @tustvold's previous comment on #561:

> I think this probably warrants a higher level ticket to discuss how we should support this, as a start it would be good to understand how other stores, i.e. GCS and Azure handle this, so that we can develop an abstraction that makes sense.

[GCS](https://docs.cloud.google.com/storage/docs/copying-renaming-moving-objects) has no per-operation limit on copies within a bucket. However, the rewrite API (which is what they call this) may not complete in a single request; it may return a `rewriteToken`, which has to be provided in each subsequent request until the response returns `done: true`.

[Azure](https://learn.microsoft.com/en-us/rest/api/storageservices/copy-blob?tabs=microsoft-entra-id) appears to have no corresponding multi-part copy operation, see quote:

> The Copy Blob operation always copies the entire source blob or file. Copying a range of bytes or set of blocks is not supported.

Back to the comment:

> In particular I wonder if adding this functionality would make more sense as part of the multipart upload functionality? This of course depends on what other stores support.

Two reasons steer me away from providing this API as an option for multipart uploads:

1. This API is somewhat unique to AWS, as demonstrated above.
2. The action the user's trying to complete is a copy, not an upload. If multipart copies are considered multipart uploads, then shouldn't single-request copy be an option on the `put_opts` method?
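To make the GCS behaviour described above concrete, here is a minimal sketch of the token loop a store would have to run. Everything in it is an assumption for illustration: `RewriteClient`, `RewriteResponse`, `copy_via_rewrite` and `FakeClient` are hypothetical names, not part of `object_store` or any GCS SDK, and the response struct only mirrors the `done` and `rewriteToken` fields mentioned above plus an assumed byte counter.

```rust
/// Hypothetical response shape, modelled on the `done` / `rewriteToken`
/// fields described above. Not a real SDK type.
#[derive(Debug, Clone, PartialEq)]
pub struct RewriteResponse {
    pub done: bool,
    pub rewrite_token: Option<String>,
    pub total_bytes_rewritten: u64,
}

/// Hypothetical transport abstraction: issues one rewrite request.
pub trait RewriteClient {
    fn rewrite(
        &mut self,
        from: &str,
        to: &str,
        token: Option<&str>,
    ) -> Result<RewriteResponse, String>;
}

/// Repeats the rewrite request, passing back the latest token,
/// until the service reports `done: true`.
pub fn copy_via_rewrite<C: RewriteClient>(
    client: &mut C,
    from: &str,
    to: &str,
) -> Result<u64, String> {
    let mut token: Option<String> = None;
    loop {
        let resp = client.rewrite(from, to, token.as_deref())?;
        if resp.done {
            return Ok(resp.total_bytes_rewritten);
        }
        // Not done: the response must carry a token for the next request.
        token = Some(
            resp.rewrite_token
                .ok_or_else(|| "rewrite not done but no token returned".to_string())?,
        );
    }
}

/// In-memory fake that "copies" `chunk` bytes per request (assumes chunk > 0).
pub struct FakeClient {
    pub object_size: u64,
    pub chunk: u64,
    pub copied: u64,
    pub calls: u32,
}

impl RewriteClient for FakeClient {
    fn rewrite(
        &mut self,
        _from: &str,
        _to: &str,
        token: Option<&str>,
    ) -> Result<RewriteResponse, String> {
        // The first request carries no token; later ones must echo the last token.
        let expected = if self.calls == 0 {
            None
        } else {
            Some(format!("tok-{}", self.copied))
        };
        if token != expected.as_deref() {
            return Err(format!("unexpected token {:?}", token));
        }
        self.calls += 1;
        self.copied = (self.copied + self.chunk).min(self.object_size);
        let done = self.copied == self.object_size;
        Ok(RewriteResponse {
            done,
            rewrite_token: if done {
                None
            } else {
                Some(format!("tok-{}", self.copied))
            },
            total_bytes_rewritten: self.copied,
        })
    }
}
```

With a `FakeClient` holding a 10-byte object and a 4-byte chunk, `copy_via_rewrite` issues three requests before completing. The point of the sketch is that the GCS loop is driven by an opaque server-side token rather than by caller-chosen byte ranges, which is part of why it does not map cleanly onto a multipart-upload-style API.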
