clbarnes opened a new pull request, #5206:
URL: https://github.com/apache/arrow-rs/pull/5206

   # Which issue does this PR close?
   
   Relates to #4611, but should not close it.
   
   # Rationale for this change
    
   Certain use cases require suffix requests, i.e. fetching the last N bytes 
of an object without knowing its size in advance (see linked issue). Most 
stores support such requests directly; for Azure, a workaround is trivial, 
although slower.
   
   # What changes are included in this PR?
   
   `pub trait GetSuffixClient`, implemented for all clients except Azure. 
Additionally, `ObjectStore::get_suffix`, whose default implementation uses the 
workaround (HEAD then GET) and is overridden in all stores except Azure.
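
To illustrate the HEAD-then-GET fallback, here is a minimal synchronous sketch. It is not the code in this PR: `SuffixStore`, `size`, `get_range` and `MemStore` are hypothetical stand-ins for the real async `ObjectStore` API, and clamping an oversized `nbytes` to the whole object is an assumption (it mirrors how HTTP treats a suffix length longer than the representation).

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical, synchronous stand-in for a store (not the real `ObjectStore` trait).
trait SuffixStore {
    /// HEAD-like call: size of the object in bytes, if it exists.
    fn size(&self, path: &str) -> Option<usize>;

    /// GET with a bounded, end-exclusive byte range.
    fn get_range(&self, path: &str, range: Range<usize>) -> Option<Vec<u8>>;

    /// Default fallback: one HEAD to learn the size, then one ranged GET.
    /// Stores with native suffix support would override this method.
    fn get_suffix(&self, path: &str, nbytes: usize) -> Option<Vec<u8>> {
        let size = self.size(path)?;
        // Assumption: an oversized suffix is clamped to the whole object.
        let start = size.saturating_sub(nbytes);
        self.get_range(path, start..size)
    }
}

/// Toy in-memory store that relies on the default `get_suffix`.
struct MemStore {
    objects: HashMap<String, Vec<u8>>,
}

impl SuffixStore for MemStore {
    fn size(&self, path: &str) -> Option<usize> {
        self.objects.get(path).map(|v| v.len())
    }

    fn get_range(&self, path: &str, range: Range<usize>) -> Option<Vec<u8>> {
        // Slice `get` returns `None` for out-of-bounds ranges.
        self.objects.get(path)?.get(range).map(|s| s.to_vec())
    }
}
```

The fallback costs two round trips, which is why stores with a native suffix request are expected to override it.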
   
   Also includes some of the infrastructure for full support of HTTP ranges, 
which is the direction this functionality should take when we next make a 
breaking API change. That infrastructure is probably overcomplicated for its 
use here, but gives a good foundation for full support later, e.g.  
https://github.com/apache/arrow-rs/commit/80cdf66d5fa5057384b39255a33bdbbf9d9e6b71
   
   # Are there any user-facing changes?
   
   Downstream implementors of `ObjectStore` should override the default 
`get_suffix` method if their clients support suffix requests directly, 
preferably by implementing `GetSuffixClient` on their client.
   
   # Remaining questions
   
   - `GetSuffixClient` probably doesn't need to be a trait, as nothing depends 
on it and it isn't used in generics; `get_suffix` could just be implemented 
directly on the client structs. However, keeping it as a trait will make it 
much easier to delete suffix-specific behaviour when a breaking API change is 
allowed.
   - This is a lot of code in a lot of places (much of it copy-pasted), but I 
don't think there's a way around that with the current API.
   - This probably needs a check for the error case where the requested 
`nbytes` is larger than the resource. However, AFAICT no such check is done 
for existing range requests outside of the local and memory stores.
   
   # Better solution
   
   Per the linked issue, `GetOptions` should contain a full representation of 
an HTTP range (implementing `From<Range<usize>>`) and this functionality should 
be built into `get_opts`.
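
As a rough sketch of what such a representation could look like (an illustration only, not the API proposed in the linked issue): the `GetRange` enum and its variant names below are hypothetical, and the `Display` impl simply renders the value of an HTTP `Range` header.

```rust
use std::fmt;
use std::ops::Range;

/// Hypothetical full representation of an HTTP byte range.
#[derive(Debug, Clone, PartialEq, Eq)]
enum GetRange {
    /// Bytes `start..end` (end-exclusive, as in Rust).
    Bounded(Range<usize>),
    /// Everything from this offset to the end of the object.
    Offset(usize),
    /// The last `n` bytes of the object.
    Suffix(usize),
}

/// Existing `Range<usize>` callers keep working via conversion.
impl From<Range<usize>> for GetRange {
    fn from(range: Range<usize>) -> Self {
        GetRange::Bounded(range)
    }
}

/// Renders the value of an HTTP `Range` header.
impl fmt::Display for GetRange {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // HTTP byte ranges are end-inclusive, hence the `- 1`.
            // Assumption: callers reject empty ranges before formatting,
            // since an empty range has no valid HTTP form.
            GetRange::Bounded(r) => {
                write!(f, "bytes={}-{}", r.start, r.end.saturating_sub(1))
            }
            GetRange::Offset(offset) => write!(f, "bytes={}-", offset),
            GetRange::Suffix(n) => write!(f, "bytes=-{}", n),
        }
    }
}
```

With a type like this, a suffix request becomes one more variant handled inside `get_opts` rather than a separate method on every store.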


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
