daniel-adam-tfs commented on issue #540:
URL: https://github.com/apache/arrow-go/issues/540#issuecomment-3437951288

   So I switched to main and redid the memory profile. If I understand 
https://github.com/apache/arrow-go/commit/f0b6fd9eacfd244cdef200a6115873e8279f4297
 correctly, the point was for `serializedPageReader.decompress` to allocate memory 
using the memory package. However, the call to `io.CopyN` might trigger a 
reallocation (in my memory profile it did so 2/3 of the time), because `io.CopyN` 
wraps the source reader in a `LimitedReader` (via `io.LimitReader`)
   
   
https://github.com/golang/go/blob/39ed968832ad8923a4bd1fb6bc3d9090ddd98401/src/io/io.go#L364C27-L364C38
   
   and continues to
   
   
https://github.com/golang/go/blob/39ed968832ad8923a4bd1fb6bc3d9090ddd98401/src/io/io.go#L415
   
   which in our case goes through
   
   
https://github.com/golang/go/blob/39ed968832ad8923a4bd1fb6bc3d9090ddd98401/src/bytes/buffer.go#L215
   
   
   And the loop here is causing the issue: the loop exits only when the 
reader returns EOF, but a `LimitedReader` never returns EOF on the first read, 
even when that read consumes everything in one go. So the iteration in 
`ReadFrom` continues for one more run, and that run requires at least `MinRead` 
(=512) bytes still available in the buffer, which we don't have. 
   
   TL;DR: you need to size the decompress buffer to the desired size + `MinRead` to 
avoid the reallocation. 🏆 


