clbarnes opened a new pull request, #5206:
URL: https://github.com/apache/arrow-rs/pull/5206
# Which issue does this PR close?
Relates to #4611, but should not close it.
# Rationale for this change
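Certain use cases require suffix requests, i.e. fetching the last N bytes of
an object without knowing its size up front (see linked issue). Most stores
support such requests directly. Azure does not, but the workaround (HEAD, then
a ranged GET) is trivial, although slower.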
# What changes are included in this PR?
`pub trait GetSuffixClient`, with an implementation for all clients except
Azure. Additionally, `ObjectStore::get_suffix`, whose default implementation
uses the workaround (HEAD then GET) and is overridden in all stores except
Azure.
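
To illustrate the shape of the default implementation, here is a minimal, synchronous sketch using only the standard library. `SketchStore`, `size`, `get_range` and `InMemory` are hypothetical stand-ins invented for this example (the real `ObjectStore` trait is async and returns results); only the HEAD-then-GET idea is taken from this PR.

```rust
use std::ops::Range;

/// Hypothetical, simplified stand-in for an object store.
/// Not the real `ObjectStore` trait, which is async and fallible.
trait SketchStore {
    /// Stands in for a HEAD request that returns the object size.
    fn size(&self) -> usize;

    /// Stands in for a ranged GET over a half-open byte range.
    fn get_range(&self, range: Range<usize>) -> Vec<u8>;

    /// Default workaround: HEAD to learn the size, then a ranged GET.
    /// Stores with native suffix support would override this method.
    fn get_suffix(&self, nbytes: usize) -> Vec<u8> {
        let size = self.size();
        // Assumption: clamp to the whole object when `nbytes` exceeds its size.
        let start = size.saturating_sub(nbytes);
        self.get_range(start..size)
    }
}

/// Hypothetical in-memory store that relies on the default `get_suffix`.
struct InMemory(Vec<u8>);

impl SketchStore for InMemory {
    fn size(&self) -> usize {
        self.0.len()
    }

    fn get_range(&self, range: Range<usize>) -> Vec<u8> {
        self.0[range].to_vec()
    }
}
```

With this sketch, `InMemory(b"hello world".to_vec()).get_suffix(5)` yields the bytes of `world`, at the cost of two calls where a native suffix request needs one.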
Also includes some of the infrastructure for full support of HTTP ranges,
which is the direction this functionality should take when we next make a
breaking API change. That infrastructure is probably overcomplicated for its
use here, but gives a good foundation for full support later, e.g.
https://github.com/apache/arrow-rs/commit/80cdf66d5fa5057384b39255a33bdbbf9d9e6b71
# Are there any user-facing changes?
Downstream implementors of `ObjectStore` should override the default
`get_suffix` method if their clients support a direct form, preferably by
implementing `GetSuffixClient` on their client.
# Remaining questions
- `GetSuffixClient` probably doesn't need to be a trait as it doesn't depend
on any behaviour and isn't used in generics; `get_suffix` could just be
implemented directly on the client structs. However, it will make it much
easier to delete suffix-specific behaviour when a breaking API change is
allowed.
- This is a lot of code in a lot of places (much of it copy-pasted), but I
don't think there's a way around that with the current API.
- This probably needs checks for the case where the requested `nbytes` exceeds
the size of the resource. However, AFAICT current range requests do not check
this either, outside of the local and memory stores.
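
For the last point, one possible check is sketched below. `checked_suffix_range` and `SuffixTooLong` are hypothetical names made up for this example and are not part of this PR. Whether an over-long suffix should be an error at all is an open design choice: an HTTP server that receives a suffix length larger than the representation returns the whole representation rather than failing.

```rust
use std::ops::Range;

/// Hypothetical error type for a suffix longer than the object.
#[derive(Debug, PartialEq, Eq)]
struct SuffixTooLong {
    requested: usize,
    available: usize,
}

/// Hypothetical strict variant: resolve a suffix length to an absolute
/// byte range, rejecting requests that exceed the object size.
fn checked_suffix_range(
    object_size: usize,
    nbytes: usize,
) -> Result<Range<usize>, SuffixTooLong> {
    match object_size.checked_sub(nbytes) {
        // `nbytes <= object_size`: the suffix starts at `start`.
        Some(start) => Ok(start..object_size),
        // `nbytes > object_size`: report both sizes to the caller.
        None => Err(SuffixTooLong {
            requested: nbytes,
            available: object_size,
        }),
    }
}
```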
# Better solution
Per the linked issue, `GetOptions` should contain a full representation of an
HTTP range (implementing `From<Range<usize>>`) and this functionality should be
built into `get_opts`.
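
A rough sketch of what such a representation could look like follows. The `HttpRange` enum and its variant names are assumptions made for illustration, not the type from the linked commit; only the `From<Range<usize>>` conversion and the HTTP `Range` header syntax are taken as given.

```rust
use std::fmt;
use std::ops::Range;

/// Hypothetical representation of an HTTP byte range.
#[derive(Debug, Clone, PartialEq, Eq)]
enum HttpRange {
    /// Half-open `start..end`; sent as `bytes=start-(end-1)`.
    Bounded(Range<usize>),
    /// Everything from `start` onwards; sent as `bytes=start-`.
    Offset(usize),
    /// The last `nbytes` bytes; sent as `bytes=-nbytes`.
    Suffix(usize),
}

impl From<Range<usize>> for HttpRange {
    fn from(range: Range<usize>) -> Self {
        HttpRange::Bounded(range)
    }
}

impl fmt::Display for HttpRange {
    /// Formats the value of a `Range` request header.
    /// Assumption: bounded ranges are non-empty; an empty range has no
    /// valid header form and is not handled in this sketch.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // HTTP byte ranges are inclusive, so the end is `end - 1`.
            HttpRange::Bounded(r) => {
                write!(f, "bytes={}-{}", r.start, r.end.saturating_sub(1))
            }
            HttpRange::Offset(start) => write!(f, "bytes={}-", start),
            HttpRange::Suffix(nbytes) => write!(f, "bytes=-{}", nbytes),
        }
    }
}
```

Existing callers that pass a `Range<usize>` could keep doing so through `.into()`, while suffix and offset requests become expressible without a separate `get_suffix` method.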