BearMinimum98 opened a new issue, #745:
URL: https://github.com/apache/arrow-rs-object-store/issues/745

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   We are trying to attach SHA-256 user metadata to multipart uploads to S3.
   
   Our current upload flow uses `MultipartStore::create_multipart`, 
`put_part`, and `complete_multipart`, because that API provides the scheduling, 
retry, and completion control we need. However, 
`MultipartStore::create_multipart` does not expose `PutMultipartOptions`; it 
calls the internal S3 create-multipart path with 
`PutMultipartOptions::default()`. As a result, callers using the low-level 
multipart API cannot attach user metadata when initiating a multipart upload.
   
   The high-level `ObjectStore::put_multipart_opts` path does support this via 
`PutMultipartOptions`, and the AWS implementation already passes those options 
through to `S3Client::create_multipart`, so the backend implementation appears 
to have the necessary support.
   
   **Describe the solution you'd like**
   Expose a way to pass `PutMultipartOptions` when initiating a multipart 
upload through `MultipartStore`.
   
   One possible change would be adding a `create_multipart_opts` method:
   
   ```rust
   async fn create_multipart_opts(
       &self,
       path: &Path,
       opts: PutMultipartOptions,
   ) -> Result<MultipartId>;
   ```
   
   `create_multipart` could remain as the default-options convenience path.
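   
   To make the shape of the proposal concrete, here is a minimal sketch of how 
`create_multipart` could become a default method that delegates to 
`create_multipart_opts`. It is synchronous and uses simplified, hypothetical 
stand-ins (`PutMultipartOptions` with a plain `metadata` map, `&str` paths, 
`String` errors, and a `RecordingStore` mock); none of these match the real 
`object_store` definitions, so treat it as an illustration only.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical stand-in; the real `PutMultipartOptions` has different fields.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct PutMultipartOptions {
    /// Assumed representation of user metadata for this sketch.
    pub metadata: HashMap<String, String>,
}

/// Simplified stand-in for the upload identifier.
pub type MultipartId = String;

/// Simplified, synchronous stand-in for the `MultipartStore` trait.
pub trait MultipartStore {
    /// Proposed entry point: initiate an upload with explicit options.
    fn create_multipart_opts(
        &self,
        path: &str,
        opts: PutMultipartOptions,
    ) -> Result<MultipartId, String>;

    /// Existing entry point, kept as the default-options convenience path.
    fn create_multipart(&self, path: &str) -> Result<MultipartId, String> {
        self.create_multipart_opts(path, PutMultipartOptions::default())
    }
}

/// Mock store that records the options each upload was initiated with.
#[derive(Default)]
pub struct RecordingStore {
    pub seen: RefCell<Vec<(String, PutMultipartOptions)>>,
}

impl MultipartStore for RecordingStore {
    fn create_multipart_opts(
        &self,
        path: &str,
        opts: PutMultipartOptions,
    ) -> Result<MultipartId, String> {
        let mut seen = self.seen.borrow_mut();
        // The id is derived from the number of uploads initiated so far.
        let id = format!("upload-{}", seen.len());
        seen.push((path.to_string(), opts));
        Ok(id)
    }
}
```

   With this shape, existing callers of `create_multipart` would keep working 
unchanged, while implementors would only need to provide the `_opts` variant.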
   
   **Describe alternatives you've considered**
   * Use `ObjectStore::put_multipart_opts`: While this supports user metadata 
today, it returns a `MultipartUpload` writer rather than exposing the multipart 
upload ID. It also assigns part indexes internally using a monotonically 
increasing counter, so callers cannot explicitly choose which part index they 
are uploading. That makes it unsuitable for lower-level upload flows that need 
direct control over upload IDs, part indexes, part scheduling, completion 
inputs, and individual part retry behavior.
   * Extend `MultipartUpload` with a method such as 
`put_part_with_idx(part_idx, payload)` or `put_part_at(part_idx, payload)`: 
This would allow callers to use `ObjectStore::put_multipart_opts` while still 
controlling part index assignment. It would solve the metadata plus custom part 
scheduling case, although it would still not expose the underlying multipart 
upload ID for callers that need it.
   * Use `buffered::BufWriter::with_attributes`: This also supports attributes 
and eventually calls `put_multipart_opts`, but it is a writer-style API and 
does not provide direct control over multipart upload creation, part 
scheduling, completion inputs, or upload IDs.
   * Use the backend-specific AWS/S3 SDK or fork/wrap `object_store`: This 
would work, but loses the backend-agnostic `object_store` abstraction and 
duplicates functionality that already exists internally in the AWS 
implementation.
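   
   For context on why explicit part indexes matter, the following is a rough 
sketch of the kind of caller-driven flow described in the alternatives above: 
parts are uploaded in a caller-chosen order, a failed part is retried under the 
same index, and the results are assembled in index order for completion. The 
`upload_parts` helper, the `PartId` type, and the `put_part` closure are 
hypothetical names for illustration, not `object_store` APIs.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the identifier a store returns per uploaded part.
#[derive(Debug, Clone, PartialEq)]
pub struct PartId(pub String);

/// Uploads parts in the order given by `schedule`, retrying each part index up
/// to `max_attempts` times, and returns the part ids sorted by part index.
pub fn upload_parts<F>(
    schedule: &[usize],
    max_attempts: usize,
    mut put_part: F,
) -> Result<Vec<PartId>, String>
where
    F: FnMut(usize) -> Result<PartId, String>,
{
    let mut done: BTreeMap<usize, PartId> = BTreeMap::new();
    for &idx in schedule {
        let mut last_err = String::from("no attempts made");
        for _ in 0..max_attempts {
            match put_part(idx) {
                Ok(id) => {
                    done.insert(idx, id);
                    break;
                }
                // A retry reuses the same part index, which a writer with an
                // internal monotonically increasing counter cannot express.
                Err(e) => last_err = e,
            }
        }
        if !done.contains_key(&idx) {
            return Err(format!("part {idx} failed: {last_err}"));
        }
    }
    // BTreeMap iterates in key order, so this yields parts by ascending index.
    Ok(done.into_values().collect())
}
```

   A flow like this depends on the caller choosing the index for every 
`put_part` call, which is what the low-level `MultipartStore` API allows today.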
   
   **Additional context**
   The gap seems to be that the library supports metadata for multipart 
uploads, and it supports low-level multipart upload control, but there is 
currently no public API that supports both at the same time.
