clbarnes commented on issue #4612:
URL: https://github.com/apache/arrow-rs/issues/4612#issuecomment-1856893520

   Multipart ranges would definitely be useful in some situations, but we 
probably still want an escape hatch that allows fetching ranges in parallel, 
and not only for backends which don't support multipart (e.g. when reading two 
large ranges rather than hundreds of tiny ones). The HTTP spec is very broad as 
to what servers are allowed to send back: the response doesn't need to contain 
the same number of ranges, in the same order, or even the exact ranges you asked 
for. I suspect that last point is true of single ranges too, but it would be 
pretty psychopathic for a server to return anything besides the requested range 
or the full file.
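
   Since the returned range may differ from the requested one, a client can check the `Content-Range` header of each part before trusting it. The following is only a minimal sketch of that check: the function names `parse_content_range` and `covers` are hypothetical (they are not part of `object_store` or any other crate), and it assumes the common `bytes <first>-<last>/<total>` form, where `<last>` is inclusive.

```rust
use std::ops::Range;

/// Parse a `Content-Range` value such as `bytes 0-99/1234` into a
/// half-open byte range (`0..100`). Hypothetical helper for illustration.
/// Returns `None` for anything else, including `bytes */1234`.
pub fn parse_content_range(value: &str) -> Option<Range<u64>> {
    let rest = value.trim().strip_prefix("bytes ")?;
    // The total length after '/' may be a number or '*'; it is ignored here.
    let (span, _total) = rest.split_once('/')?;
    let (first, last) = span.split_once('-')?;
    let first: u64 = first.trim().parse().ok()?;
    let last: u64 = last.trim().parse().ok()?;
    if last < first {
        return None;
    }
    // The header's last position is inclusive; convert to an exclusive end.
    Some(first..last.checked_add(1)?)
}

/// True if `returned` contains every byte of `requested`.
pub fn covers(returned: &Range<u64>, requested: &Range<u64>) -> bool {
    returned.start <= requested.start && requested.end <= returned.end
}
```

   A caller would parse the header of each returned part and then use `covers` to decide whether a requested range can be served from that part or needs another request.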
   
   I have [an implementation](https://crates.io/crates/rope_rd) of a 
synchronous Read/Seek-based sparse representation of a file made up of real 
data and (zero-cost) filler bytes. So you'd request ranges A, B, and C, then 
the server would return ranges X and Y (which may or may not map cleanly onto 
your request), then you build a local representation of the entire resource 
using X and Y, then read A, B, and C out of that. I suppose we might prefer 
something implementing `bytes::Buf`, but this kind of data structure is 
probably the right way to deal with a multipart response.
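
   The request-A/B/C, receive-X/Y flow described above can be sketched roughly as follows. This is not the `rope_rd` API and not a `bytes::Buf` implementation: `SparseResource` and its methods are hypothetical names, it stores returned parts in a `BTreeMap` keyed by start offset, and it assumes the parts do not overlap. Instead of filler bytes it simply returns `None` when a requested range touches a gap.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical sparse view of a remote resource: real bytes where the
/// server returned them, nothing elsewhere.
#[derive(Debug, Default)]
pub struct SparseResource {
    /// Returned parts keyed by start offset; assumed not to overlap.
    parts: BTreeMap<u64, Vec<u8>>,
}

impl SparseResource {
    /// Record one part of the response (e.g. range X or Y).
    pub fn insert(&mut self, offset: u64, data: Vec<u8>) {
        if !data.is_empty() {
            self.parts.insert(offset, data);
        }
    }

    /// Copy out `range` if the stored parts cover it completely,
    /// stitching adjacent parts together. Returns `None` on any gap.
    pub fn read(&self, range: Range<u64>) -> Option<Vec<u8>> {
        let mut out = Vec::new();
        let mut pos = range.start;
        while pos < range.end {
            // Find the last part that starts at or before `pos`.
            let (&start, data) = self.parts.range(..=pos).next_back()?;
            let part_end = start + data.len() as u64;
            if part_end <= pos {
                return None; // `pos` falls in a gap between parts
            }
            let stop = part_end.min(range.end);
            out.extend_from_slice(&data[(pos - start) as usize..(stop - start) as usize]);
            pos = stop;
        }
        Some(out)
    }
}
```

   With something like this, each part of the multipart response is passed to `insert` with the offset from its `Content-Range`, and the originally requested ranges are then served with `read`; a `None` signals that the server did not return everything that was asked for.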


