ashtuchkin opened a new issue, #5443:
URL: https://github.com/apache/arrow-rs/issues/5443
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
To protect the integrity of our uploads, we store object versions together with
S3 file paths. This is currently easy for regular `ObjectStore::put` requests,
but is not possible for multipart uploads made with
`ObjectStore::put_multipart`. Its current return values (`MultipartId` and
`AsyncWrite`) leave no easy way to hand back the resulting version, so maybe
this could be added to `WriteMultiPart` (so that we don't have to reimplement
it using just the `MultiPartStore`/`PutPart` traits).
**Describe the solution you'd like**
1. (Optional) Make it possible to construct a `PutPart` from a `MultiPartStore`.
Currently, `create_multipart` just returns a multipart id. Maybe
add a `create_multipart_put_part(&self, path: &Path) -> Result<Box<dyn
PutPart>>`? This would enable using `WriteMultiPart` on any `MultiPartStore`
from client code, which is good because users could then tune `WriteMultiPart`
parameters such as concurrency and potentially part size.
2. `PutPart::complete` should return `Result<PutResult>`.
3. `WriteMultiPart` should remember this `PutResult` and provide a method to
take it after shutdown is polled to completion.
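To make the last two items above concrete, here is a minimal sketch. It uses simplified, synchronous stand-ins rather than the real async `object_store` traits, and `String` as the error type. The names `write_part`, `take_put_result`, and `FakeUpload` are hypothetical and only illustrate the shape of the idea:

```rust
// Simplified, synchronous stand-ins for illustration only; the real
// object_store traits are async and differ in detail.
#[derive(Debug, Clone, PartialEq)]
pub struct PutResult {
    pub e_tag: Option<String>,
    pub version: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PartId(pub String);

pub trait PutPart {
    fn put_part(&mut self, buf: Vec<u8>, part_idx: usize) -> Result<PartId, String>;
    /// Proposed change: hand back the PutResult instead of `()`.
    fn complete(&mut self, parts: Vec<PartId>) -> Result<PutResult, String>;
}

/// Hypothetical writer that remembers the result of `complete`.
pub struct WriteMultiPart<T: PutPart> {
    inner: T,
    parts: Vec<PartId>,
    result: Option<PutResult>,
}

impl<T: PutPart> WriteMultiPart<T> {
    pub fn new(inner: T) -> Self {
        Self { inner, parts: Vec::new(), result: None }
    }

    /// Upload one part and record its id (hypothetical helper).
    pub fn write_part(&mut self, buf: Vec<u8>) -> Result<(), String> {
        let idx = self.parts.len();
        let id = self.inner.put_part(buf, idx)?;
        self.parts.push(id);
        Ok(())
    }

    /// Complete the upload and store the PutResult for later retrieval.
    pub fn shutdown(&mut self) -> Result<(), String> {
        let parts = std::mem::take(&mut self.parts);
        self.result = Some(self.inner.complete(parts)?);
        Ok(())
    }

    /// Returns the PutResult once, after `shutdown` has succeeded.
    pub fn take_put_result(&mut self) -> Option<PutResult> {
        self.result.take()
    }
}

/// In-memory fake used only to exercise the sketch.
#[derive(Default)]
pub struct FakeUpload {
    bytes_seen: usize,
}

impl PutPart for FakeUpload {
    fn put_part(&mut self, buf: Vec<u8>, part_idx: usize) -> Result<PartId, String> {
        self.bytes_seen += buf.len();
        Ok(PartId(format!("part-{part_idx}")))
    }

    fn complete(&mut self, parts: Vec<PartId>) -> Result<PutResult, String> {
        Ok(PutResult {
            e_tag: Some(format!("{}-parts-{}-bytes", parts.len(), self.bytes_seen)),
            version: Some("v1".to_string()),
        })
    }
}

fn main() {
    let mut writer = WriteMultiPart::new(FakeUpload::default());
    writer.write_part(b"hello ".to_vec()).unwrap();
    writer.write_part(b"world".to_vec()).unwrap();
    assert!(writer.take_put_result().is_none()); // nothing before shutdown
    writer.shutdown().unwrap();
    let result = writer.take_put_result().expect("result available after shutdown");
    println!("version = {:?}, e_tag = {:?}", result.version, result.e_tag);
}
```

In the real crate the result would presumably be captured when the `AsyncWrite` shutdown future resolves; the sketch only shows where the `PutResult` would be stored and how a caller could take it.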
After a bit more analysis, it looks like `PutPart` and `MultiPartStore`
duplicate the `put_part` and complete methods. Also, it looks like a generic
`PutPart` implementation can be written given just the `MultiPartStore`
interface. What do you think about simplifying this? There are two options:
1. Only keep create_multipart method in MultiPartStore, make it return
PutPart. Add "abort" and "get_upload_id" methods to the PutPart. This way
2. Keep MultiPartStore as-is, remove PutPart trait completely, make
WriteMultiPart work with MultiPartStore directly. This will require adding
location and upload_id to WriteMultiPart.
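A rough sketch of why a generic implementation seems possible: a handle that carries the location and upload id can drive any `MultiPartStore`. This again uses simplified, synchronous stand-ins (plain `&str` paths and `String` errors instead of the crate's `Path` and error types). `Upload` and `MemoryStore` are hypothetical names, not part of `object_store`:

```rust
use std::collections::HashMap;

// Simplified, synchronous stand-ins; not the real object_store API.
pub type MultipartId = String;

#[derive(Debug, Clone, PartialEq)]
pub struct PartId(pub String);

#[derive(Debug, Clone, PartialEq)]
pub struct PutResult {
    pub e_tag: Option<String>,
    pub version: Option<String>,
}

pub trait MultiPartStore {
    fn create_multipart(&mut self, path: &str) -> Result<MultipartId, String>;
    fn put_part(&mut self, path: &str, id: &MultipartId, part_idx: usize, data: Vec<u8>)
        -> Result<PartId, String>;
    fn complete_multipart(&mut self, path: &str, id: &MultipartId, parts: Vec<PartId>)
        -> Result<PutResult, String>;
    fn abort_multipart(&mut self, path: &str, id: &MultipartId) -> Result<(), String>;
}

/// Hypothetical generic handle: works with any MultiPartStore by carrying
/// the location and upload id itself.
pub struct Upload<'a, S: MultiPartStore> {
    store: &'a mut S,
    location: String,
    upload_id: MultipartId,
    parts: Vec<PartId>,
}

impl<'a, S: MultiPartStore> Upload<'a, S> {
    pub fn start(store: &'a mut S, location: &str) -> Result<Self, String> {
        let upload_id = store.create_multipart(location)?;
        Ok(Self { store, location: location.to_string(), upload_id, parts: Vec::new() })
    }
    pub fn upload_id(&self) -> &str {
        &self.upload_id
    }
    pub fn put_part(&mut self, data: Vec<u8>) -> Result<(), String> {
        let idx = self.parts.len();
        let part = self.store.put_part(&self.location, &self.upload_id, idx, data)?;
        self.parts.push(part);
        Ok(())
    }
    pub fn complete(self) -> Result<PutResult, String> {
        self.store.complete_multipart(&self.location, &self.upload_id, self.parts)
    }
    pub fn abort(self) -> Result<(), String> {
        self.store.abort_multipart(&self.location, &self.upload_id)
    }
}

/// In-memory fake store used only to exercise the sketch.
#[derive(Default)]
pub struct MemoryStore {
    next_id: usize,
    pending: HashMap<MultipartId, Vec<u8>>,
    objects: HashMap<String, Vec<u8>>,
}

impl MultiPartStore for MemoryStore {
    fn create_multipart(&mut self, _path: &str) -> Result<MultipartId, String> {
        self.next_id += 1;
        let id = format!("upload-{}", self.next_id);
        self.pending.insert(id.clone(), Vec::new());
        Ok(id)
    }
    fn put_part(&mut self, _path: &str, id: &MultipartId, part_idx: usize, data: Vec<u8>)
        -> Result<PartId, String> {
        // The fake simply appends parts in call order.
        self.pending.get_mut(id).ok_or("unknown upload id")?.extend(data);
        Ok(PartId(format!("{id}-{part_idx}")))
    }
    fn complete_multipart(&mut self, path: &str, id: &MultipartId, _parts: Vec<PartId>)
        -> Result<PutResult, String> {
        let body = self.pending.remove(id).ok_or("unknown upload id")?;
        self.objects.insert(path.to_string(), body);
        Ok(PutResult { e_tag: None, version: Some(format!("version-of-{id}")) })
    }
    fn abort_multipart(&mut self, _path: &str, id: &MultipartId) -> Result<(), String> {
        self.pending.remove(id).map(|_| ()).ok_or_else(|| "unknown upload id".to_string())
    }
}

fn main() {
    let mut store = MemoryStore::default();
    let mut upload = Upload::start(&mut store, "data/file.bin").unwrap();
    upload.put_part(b"hello ".to_vec()).unwrap();
    upload.put_part(b"world".to_vec()).unwrap();
    let result = upload.complete().unwrap();
    println!("stored version: {:?}", result.version);
    assert_eq!(store.objects["data/file.bin"], b"hello world".to_vec());
}
```

Whether such a handle replaces `PutPart` or is folded into `WriteMultiPart` is the design question raised by the two options; the sketch only shows that the store interface alone appears to be sufficient.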
**Describe alternatives you've considered**
* Reimplement WriteMultiPart logic with the changes above. There's a lot of
code there though.
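* Issue a HEAD request right after the put. This will work in most cases,
but is not guaranteed to return the version this upload created, for example
if another writer updates the object in between.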
**Additional context**
I'm using version 0.9.