Re: [I] [Go][Parquet] Use custom allocator in ReaderProperties.GetStream [arrow-go]

via GitHub Mon, 27 Oct 2025 09:04:44 -0700


zeroshade commented on issue #540:
URL: https://github.com/apache/arrow-go/issues/540#issuecomment-3452087740


   > Maybe using io.ReadFull also needs to call r.Read multiple times when 
reading from S3 🧐 (We are also reading files from S3).
   
   That's precisely the case: both `io.ReadFull` and `io.CopyN` perform 
multiple reads in a loop if the first read doesn't return enough bytes (see 
https://cs.opensource.google/go/go/+/refs/tags/go1.25.3:src/io/io.go;l=333).
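   
   To illustrate the looping behavior, here is a minimal sketch. The 
`chunkReader` type and `readFullCalls` helper are hypothetical and only 
stand in for a reader that returns short reads, as a network-backed S3 
body might. They are not part of arrow-go.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// chunkReader is a hypothetical stand-in for a reader that returns at most
// max bytes per Read call, counting how often Read is invoked.
type chunkReader struct {
	src   io.Reader
	max   int
	calls int
}

func (c *chunkReader) Read(p []byte) (int, error) {
	c.calls++
	if len(p) > c.max {
		p = p[:c.max]
	}
	return c.src.Read(p)
}

// readFullCalls fills a buffer of len(data) bytes using io.ReadFull and
// reports how many Read calls were needed to do so.
func readFullCalls(data string, max int) (string, int, error) {
	r := &chunkReader{src: strings.NewReader(data), max: max}
	buf := make([]byte, len(data))
	_, err := io.ReadFull(r, buf)
	return string(buf), r.calls, err
}

func main() {
	got, calls, err := readFullCalls("0123456789", 4)
	fmt.Println(got, calls, err) // prints "0123456789 3 <nil>"
}
```

   With 10 bytes requested and at most 4 bytes per read, `io.ReadFull` 
ends up calling `Read` three times (4 + 4 + 2 bytes).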
   
   > So I instead of reading to p.decompressBuffer and then copying to buf, I 
could just read directly to buf in this case and avoid the intermediate buffer 
totally.
   
   > EDIT: Which is what happens for PageType_DATA_PAGE_V2, but couldn't the 
same be done for format.PageType_DICTIONARY_PAGE and format.PageType_DATA_PAGE 
in serializedPageReader.Next?
   
   I think that's definitely an oversight, and your suggestion is a valid 
solution. When there's no compression, you can read directly into buf; that 
should be fine.
   
   >  @zeroshade was there a reason why io.CopyN was used instead of 
io.ReadFull?
   
   I don't remember offhand. The biggest difference between `ReadFull` and 
`CopyN` in this case is that `ReadFull` can take the byte slice directly 
(instead of needing the intermediate `bytes.NewBuffer`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
