CarlKCarlK commented on issue #5272:
URL: https://github.com/apache/arrow-rs/issues/5272#issuecomment-1874579108

   Thanks for your response. The API could, of course, work either way. Let me 
add a vote for the InMemory (and, I think, File) way:
   
   * If a user needs the length of the returned region, they can use 
`get_result.range.len()`. They don't need `get_result.meta.size`. (I like 
thinking of `meta` as something that is fixed, not changing on each read.)
   * When reading a file for the first time, we often want to know the total 
length of the file and some header bytes. It would be nice to be able to do 
this in one call. For example, I'm reading a genomics file format called PLINK 
Bed. I need the file's total length, and I need to check that it has the 
correct magic numbers (https://en.wikipedia.org/wiki/List_of_file_signatures) 
and see which of the two BED flavors I have.
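
   A minimal sketch of that one-call pattern. The types below (`Meta`, 
`GetResult`, `get_range`, `inspect_bed`) are hypothetical stand-ins written for 
illustration, not the actual `object_store` API. The sketch assumes `meta.size` 
is the total object size and `range` is the region actually returned, and it 
uses the PLINK Bed signature `0x6C 0x1B` followed by a mode byte (`0x01` for 
SNP-major, `0x00` for individual-major):

```rust
use std::ops::Range;

/// Hypothetical stand-in for object metadata: fixed per object, not per read.
#[derive(Debug, Clone, PartialEq)]
pub struct Meta {
    /// Total size of the whole object in bytes.
    pub size: usize,
}

/// Hypothetical stand-in for the result of a ranged read.
#[derive(Debug, Clone, PartialEq)]
pub struct GetResult {
    pub meta: Meta,
    /// The byte range actually returned (clamped to the object size).
    pub range: Range<usize>,
    pub bytes: Vec<u8>,
}

/// Ranged read over an in-memory object (assumed behavior, for illustration).
pub fn get_range(object: &[u8], requested: Range<usize>) -> GetResult {
    let end = requested.end.min(object.len());
    let start = requested.start.min(end);
    GetResult {
        meta: Meta { size: object.len() },
        range: start..end,
        bytes: object[start..end].to_vec(),
    }
}

/// The two PLINK Bed flavors, selected by the third header byte.
#[derive(Debug, PartialEq)]
pub enum BedFlavor {
    IndividualMajor,
    SnpMajor,
}

/// One call yields both the total file length and the header check.
pub fn inspect_bed(object: &[u8]) -> Result<(usize, BedFlavor), String> {
    let result = get_range(object, 0..3);
    // Length of the returned region comes from `range`, not from `meta`.
    if result.range.len() < 3 {
        return Err("file too short for a PLINK Bed header".to_string());
    }
    match result.bytes.as_slice() {
        [0x6C, 0x1B, 0x00] => Ok((result.meta.size, BedFlavor::IndividualMajor)),
        [0x6C, 0x1B, 0x01] => Ok((result.meta.size, BedFlavor::SnpMajor)),
        _ => Err("not a PLINK Bed file".to_string()),
    }
}
```

   Here `result.range.len()` answers "how many bytes did I get?" while 
`result.meta.size` stays the same no matter which range was requested.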
   
   I can, of course, argue the other side, too. :-)
   * Perhaps AWS, etc., actually require two calls to get the object length 
and the value of a region, so the current S3 way would be both easier to 
implement and a better reflection of the behind-the-scenes work.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
