Re: [PR] Utilize memory allocator in ReadProperties.GetStream [arrow-go]

via GitHub Thu, 30 Oct 2025 08:33:13 -0700


daniel-adam-tfs commented on code in PR #547:
URL: https://github.com/apache/arrow-go/pull/547#discussion_r2478573444



##########
parquet/file/page_reader.go:
##########
@@ -500,14 +508,31 @@ func (p *serializedPageReader) Page() Page {
        return p.curPage
 }
 
+func (p *serializedPageReader) readUncompressed(rd io.Reader, lenUncompressed int, buf []byte) ([]byte, error) {
+       n, err := io.ReadFull(rd, buf[:lenUncompressed])
+       if err != nil {
+               return nil, err
+       }
+       if n != lenUncompressed {
+               return nil, fmt.Errorf("parquet: expected to read %d bytes but only read %d", lenUncompressed, n)
+       }
+       if p.cryptoCtx.DataDecryptor != nil {

Review Comment:
   Alright, so I "steal" the buffer by using `Peek`/`Discard` if the data has 
already been read and is available in the `BufferedReader`. So in the 
uncompressed and unencrypted case, the data is read and stored into a buffer in 
`ReaderProperties.GetStream` and then copied once, into the user-provided 
buffer passed to `Float32ColumnChunkReader.ReadBatch`. 
   Now, if we have a plainEncoder and no compression, it should be possible to 
write the data directly to the user-provided buffer, which would eliminate 
even that copy, but that one is more complicated and I need to start doing 
other stuff. :D
   
   Also, the decryption types allocate buffers for the decrypted data. We could 
pass them an already allocated buffer to use, or maybe decrypt in place 
(if possible), or give them the custom allocator if one is set.
   
   Anyway, I'll fix the decryption for DataPageV2 next and I'll consider this 
one done.



