waynr commented on issue #4961: URL: https://github.com/apache/arrow-rs/issues/4961#issuecomment-1771674843
> Is there a particular reason you can't cache the AsyncWrite in say a HashMap and use it across multiple requests that way?

Yeah, the service I'm writing is meant to be stateless, with multiple instances running behind load balancers and no guarantee of affinity between client and backend. With my current implementation I persist session information (including the upload id and the most recent chunk number) in a database, re-load it on subsequent requests from the same client, and continue the multipart upload from there.

> The challenge is many stores have restrictions on part sizes, with some forcing all but the last to be larger than a given size, and in some cases all having the same fixed size. This means you can't easily flush the writer and then pick it up again.

The spec I'm implementing allows the service to inform clients [via an HTTP response header what the minimum chunk size is](https://github.com/opencontainers/distribution-spec/blob/main/spec.md?plain=1#L325), which provides a reasonable level of assurance that all chunks will meet the minimum size requirement. So for my use case it's up to my code calling `object_store::ObjectStore` to reject requests that don't meet the minimum size requirement (see the sketch below), and likewise for anyone else using a hypothetical extension that allows continuation of a multipart upload.

I imagine this should be compatible with `ObjectStore` as long as implementations aren't eagerly splitting the data they receive along minimum-part-size boundaries and accidentally leaving an incomplete / too-small part on the object store they are targeting. If they are doing that as an implementation detail of their `AsyncWrite` impls then maybe I'm just barking up the wrong tree here :upside_down_face:
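To make the idea concrete, here is a minimal sketch of the guard described above. Nothing here is part of `object_store`; the `UploadSession` struct, the header-advertised minimum, and `validate_chunk` are all hypothetical names for the state the service persists in its database and the size check it applies before handing a chunk to the resumed multipart upload.

```rust
/// Hypothetical per-session state persisted in a database between
/// requests, so any backend instance can resume the multipart upload.
struct UploadSession {
    upload_id: String,
    last_chunk_number: u32,
}

/// Minimum chunk size the service advertises to clients in a response
/// header per the OCI distribution spec; the value here is illustrative.
const MIN_CHUNK_SIZE: usize = 5 * 1024 * 1024;

/// Reject any non-final chunk smaller than the advertised minimum, so
/// every part handed to the object store meets its size floor.
fn validate_chunk(chunk: &[u8], is_final: bool) -> Result<(), String> {
    if !is_final && chunk.len() < MIN_CHUNK_SIZE {
        return Err(format!(
            "chunk of {} bytes is below the advertised minimum of {} bytes",
            chunk.len(),
            MIN_CHUNK_SIZE
        ));
    }
    Ok(())
}

fn main() {
    // Hypothetical session reloaded from the database on a follow-up request.
    let session = UploadSession {
        upload_id: "example-upload-id".to_string(),
        last_chunk_number: 3,
    };
    println!(
        "resuming upload {} at chunk {}",
        session.upload_id,
        session.last_chunk_number + 1
    );

    // A 1 MiB non-final chunk is rejected; the same size is fine as the final chunk.
    let chunk = vec![0u8; 1024 * 1024];
    assert!(validate_chunk(&chunk, false).is_err());
    assert!(validate_chunk(&chunk, true).is_ok());
}
```

With a check like this at the request boundary, the writer backed by `ObjectStore` should only ever see parts that satisfy the store's minimum, which is what makes resuming across instances plausible in the first place.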
