clbarnes commented on issue #4612: URL: https://github.com/apache/arrow-rs/issues/4612#issuecomment-1856893520
Multipart ranges would definitely be useful in some situations, but we probably still want an escape hatch that allows fetching ranges in parallel, and not only for backends that don't support multipart (e.g. when reading two large ranges rather than hundreds of tiny ones).

The HTTP spec is very broad as to what servers are allowed to send back: the response doesn't need to contain the same number of ranges, in the same order, or even the exact ranges you asked for. I suspect that the last point is true of single ranges too, but it would be pretty psychopathic for a server to return anything besides the requested range or the full file.

I have [an implementation](https://crates.io/crates/rope_rd) of a synchronous Read/Seek-based sparse representation of a file made up of real data and (zero-cost) filler bytes. So you'd request ranges A, B, and C; the server would return ranges X and Y (which may or may not map cleanly onto your request); you'd build a local representation of the entire resource from X and Y, then read A, B, and C out of that. I suppose we might prefer something implementing `bytes::Buf`, but that data structure is probably the right idea for dealing with the multipart response.
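To make the "request A, B, C; receive X, Y; read A, B, C back out" flow concrete, here is a minimal sketch of such a sparse representation. It is an illustration only: `SparseResource`, `insert`, and `read` are hypothetical names, not the `rope_rd` API or anything in `object_store`, and it assumes the returned parts do not overlap. It also copies bytes out rather than exposing a `Read`/`Seek` or `bytes::Buf` interface.

```rust
use std::ops::Range;

/// Hypothetical sparse view of a remote resource: only the byte ranges the
/// server actually returned are stored, keyed by their starting offset.
#[derive(Debug, Default)]
pub struct SparseResource {
    // Parts are assumed not to overlap and are kept sorted by start offset.
    parts: Vec<(u64, Vec<u8>)>,
}

impl SparseResource {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record one part of a response, whose first byte sits at `offset`
    /// in the full resource. Parts may arrive in any order.
    pub fn insert(&mut self, offset: u64, data: Vec<u8>) {
        let idx = self.parts.partition_point(|(start, _)| *start < offset);
        self.parts.insert(idx, (offset, data));
    }

    /// Copy out `range` if every byte of it is covered by stored parts.
    /// Returns `None` if any byte of the range is missing.
    pub fn read(&self, range: Range<u64>) -> Option<Vec<u8>> {
        let mut out = Vec::with_capacity(range.end.saturating_sub(range.start) as usize);
        let mut pos = range.start; // next byte we still need
        for (start, data) in &self.parts {
            if pos >= range.end {
                break;
            }
            let end = start + data.len() as u64;
            if end <= pos {
                continue; // this part lies entirely before the cursor
            }
            if *start > pos {
                return None; // gap between the cursor and the next part
            }
            let stop = range.end.min(end);
            out.extend_from_slice(&data[(pos - start) as usize..(stop - start) as usize]);
            pos = stop;
        }
        (pos >= range.end).then_some(out)
    }
}
```

For example, if the server answers with parts covering bytes 0..8 and 8..16, a requested range 6..10 that straddles both parts can still be read back, while a range the server never returned (say 20..22) yields `None` and could be re-requested on its own.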
