BearMinimum98 opened a new issue, #745:
URL: https://github.com/apache/arrow-rs-object-store/issues/745
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

We are trying to attach SHA-256 user metadata to multipart uploads to S3.
Our current upload flow uses `MultipartStore::create_multipart`, `put_part`, and
`complete_multipart`, because it provides the scheduling, retry, and completion
control we need. However, `MultipartStore::create_multipart` does not expose
`PutMultipartOptions`; it calls the internal S3 create-multipart path with
`PutMultipartOptions::default()`. This means callers using the low-level
multipart API cannot attach user metadata during multipart upload initiation.
The high-level `ObjectStore::put_multipart_opts` path does support this via
`PutMultipartOptions`, and the AWS implementation already passes those options
through to `S3Client::create_multipart`, so the backend implementation appears
to have the necessary support.

**Describe the solution you'd like**

Expose a way to pass `PutMultipartOptions` when initiating a multipart
upload through `MultipartStore`.
One possible change would be adding a `create_multipart_opts` method:
```rust
async fn create_multipart_opts(
    &self,
    path: &Path,
    opts: PutMultipartOptions,
) -> Result<MultipartId>;
```
`create_multipart` could remain as the default-options convenience path.
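
To illustrate the delegation, here is a minimal, synchronous sketch using stand-in types. Everything in it (`MultipartStoreSketch`, `RecordingStore`, the simplified `PutMultipartOptions` with a plain `metadata` map, and string paths) is hypothetical and only mirrors the shape of the proposal. It is not the crate's actual async trait or types.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the crate's types, reduced to what the sketch needs.
type MultipartId = String;
type Result<T> = std::result::Result<T, String>;

#[derive(Debug, Clone, Default, PartialEq)]
struct PutMultipartOptions {
    // Simplified assumption: user metadata as plain key/value pairs.
    metadata: HashMap<String, String>,
}

// Synchronous simplification of a `MultipartStore`-like trait.
trait MultipartStoreSketch {
    // The proposed method: implementors forward `opts` to the backend.
    fn create_multipart_opts(
        &mut self,
        path: &str,
        opts: PutMultipartOptions,
    ) -> Result<MultipartId>;

    // The existing entry point, kept as a convenience that uses default options.
    fn create_multipart(&mut self, path: &str) -> Result<MultipartId> {
        self.create_multipart_opts(path, PutMultipartOptions::default())
    }
}

// Toy backend that records the options each upload was created with.
#[derive(Default)]
struct RecordingStore {
    created: Vec<(String, PutMultipartOptions)>,
}

impl MultipartStoreSketch for RecordingStore {
    fn create_multipart_opts(
        &mut self,
        path: &str,
        opts: PutMultipartOptions,
    ) -> Result<MultipartId> {
        self.created.push((path.to_string(), opts));
        Ok(format!("upload-{}", self.created.len()))
    }
}
```

Which of the two methods would carry the default body in the real trait is a compatibility question for the maintainers. The sketch only shows that one can be expressed in terms of the other.
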
**Describe alternatives you've considered**

* Use `ObjectStore::put_multipart_opts`: While this supports user metadata
today, it returns a `MultipartUpload` writer rather than exposing the multipart
upload ID. It also assigns part indexes internally using a monotonically
increasing counter, so callers cannot explicitly choose which part index they
are uploading. That makes it unsuitable for lower-level upload flows that need
direct control over upload IDs, part indexes, part scheduling, completion
inputs, and individual part retry behavior.
* Extend `MultipartUpload` with a method such as
`put_part_with_idx(part_idx, payload)` or `put_part_at(part_idx, payload)`:
This would allow callers to use `ObjectStore::put_multipart_opts` while still
controlling part index assignment. It would solve the metadata plus custom part
scheduling case, although it would still not expose the underlying multipart
upload ID for callers that need it.
* Use `buffered::BufWriter::with_attributes`: This also supports attributes
and eventually calls `put_multipart_opts`, but it is a writer-style API and
does not provide direct control over multipart upload creation, part
scheduling, completion inputs, or upload IDs.
* Use the backend-specific AWS/S3 SDK or fork/wrap `object_store`: This
would work, but loses the backend-agnostic `object_store` abstraction and
duplicates functionality that already exists internally in the AWS
implementation.
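
The part-index control that the writer-style alternatives give up can be illustrated with a toy in-memory model. `ToyUpload` and `out_of_order_demo` are hypothetical names used only for this sketch and do not correspond to anything in `object_store`. The point is that the caller chooses each part index, can upload parts out of order, and can retry a single index before completion.

```rust
use std::collections::BTreeMap;

// Hypothetical in-memory model of one multipart upload; not the crate's API.
#[derive(Default)]
struct ToyUpload {
    // Parts keyed by the caller-chosen index.
    parts: BTreeMap<usize, Vec<u8>>,
}

impl ToyUpload {
    // Store the part at an explicit index; a retry overwrites the earlier attempt.
    fn put_part(&mut self, part_idx: usize, payload: &[u8]) {
        self.parts.insert(part_idx, payload.to_vec());
    }

    // Concatenate parts in index order; fail if any index in 0..expected_parts is missing.
    fn complete(&self, expected_parts: usize) -> Result<Vec<u8>, String> {
        let mut out = Vec::new();
        for idx in 0..expected_parts {
            let part = self
                .parts
                .get(&idx)
                .ok_or_else(|| format!("missing part {}", idx))?;
            out.extend_from_slice(part);
        }
        Ok(out)
    }
}

// Upload parts out of order and retry one of them before completing.
fn out_of_order_demo() -> Result<Vec<u8>, String> {
    let mut upload = ToyUpload::default();
    upload.put_part(2, b"!");
    upload.put_part(0, b"hello ");
    upload.put_part(1, b"wrold"); // first attempt, treated as a bad part
    upload.put_part(1, b"world"); // retry of the same index
    upload.complete(3)
}
```

A writer that assigns indexes from an internal counter cannot express the retry of index 1 above, which is the limitation the first two alternatives describe.
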
**Additional context**

The gap seems to be that the library supports metadata for multipart
uploads, and it supports low-level multipart upload control, but there is
currently no public API that supports both at the same time.