daniel-adam-tfs commented on code in PR #547:
URL: https://github.com/apache/arrow-go/pull/547#discussion_r2460244298


##########
parquet/file/page_reader.go:
##########
@@ -501,7 +504,16 @@ func (p *serializedPageReader) Page() Page {
 }
 
 func (p *serializedPageReader) decompress(rd io.Reader, lenCompressed int, buf 
[]byte) ([]byte, error) {
-       p.decompressBuffer.ResizeNoShrink(lenCompressed)
+       // As of go1.25.3: There is an issue when bytes.Buffer and io.CopyN are 
used together. io.CopyN
+       // uses io.LimitReader, which does an additional read on the underlying 
reader to determine EOF.
+       // However, bytes.Buffer always attempts to read at least bytes.MinRead 
(which is 512 bytes) from the
+       // underlying reader, even if there is less data available than that. 
So even if there are no more bytes,
+       // the buffer must have at least bytes.MinRead capacity remaining to 
avoid a relocation.
+       allocSize := lenCompressed
+       if p.decompressBuffer.Cap() < lenCompressed+bytes.MinRead {

Review Comment:
   This is dependent on the combined behavior of io.LimitReader and 
bytes.Buffer. Which seems fragile to me, but I don't have any other ideas how 
to deal with it. I'll at least at unit tests that the reallocation happens when 
I don't add bytes.MinRead to the allocation size and doesn't happen when I do. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to