zeroshade commented on issue #863:
URL: https://github.com/apache/arrow-go/issues/863#issuecomment-4801081779

   > Capacity Destruction: Inside parquet/metadata/cleanup.go, the asynchronous 
runtime.AddCleanup task (or explicit defers) resets recycled buffers using 
data.ResizeNoShrink(0).
   
   `ResizeNoShrink(0)` doesn't reset the capacity of the buffers; it explicitly 
maintains the capacity and resets only the length. Resizing the buffer after 
the `ResizeNoShrink` call will not perform any realloc, as long as the new 
size fits within the retained capacity. The pool is recycled correctly.
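
   To make the length-versus-capacity distinction concrete, here is a minimal 
sketch using a plain Go slice. It is only an analogy for the behavior described 
above, not the arrow-go buffer implementation, and `resetKeepCap` is a 
hypothetical helper name.

```go
package main

import "fmt"

// resetKeepCap mimics the described semantics with a plain slice:
// the length drops to zero while the backing array (capacity) is kept.
// This is an analogy, not the arrow-go implementation.
func resetKeepCap(b []byte) []byte {
	return b[:0]
}

func main() {
	buf := make([]byte, 2<<20) // 2 MiB, length == capacity
	before := cap(buf)

	buf = resetKeepCap(buf)
	fmt.Println(len(buf), cap(buf) == before) // 0 true

	// Growing back within the old capacity reuses the same backing array,
	// so no new allocation is needed.
	buf = buf[:4096]
	fmt.Println(len(buf), cap(buf) == before) // 4096 true
}
```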
   
   > Cumulative Allocated Memory: 13024.81 MB
   
   Isn't this caused by your own loop with the `make` inside of it?
   
   ```go
   localFileData := make([]byte, maxSize)
   copy(localFileData, headerBytes)
   ```
   
   65 * 100 * 2 MB = 13,000 MB
   
   So the cumulative memory allocation in your test actually comes from your 
own `make` in each iteration, not from the `sync.Pool`.
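
   As a rough illustration of why a `make` inside a loop dominates a cumulative 
allocation counter, the sketch below measures the growth of 
`runtime.MemStats.TotalAlloc` around such a loop. The function name 
`allocInLoop`, the iteration count, and the `sink` variable are assumptions 
made for this example; it is not the test from the issue.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps each allocation reachable so the compiler cannot elide it.
var sink []byte

// allocInLoop allocates a fresh buffer of size bytes on every iteration and
// returns how much runtime.MemStats.TotalAlloc grew while doing so.
// TotalAlloc is cumulative: it never decreases when memory is freed.
func allocInLoop(iterations, size int) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < iterations; i++ {
		sink = make([]byte, size)
		sink[0] = byte(i)
	}
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	const size = 2 << 20 // 2 MiB per iteration
	grew := allocInLoop(50, size)
	fmt.Printf("TotalAlloc grew by about %d MiB\n", grew>>20)
}
```

   The growth scales with iterations times buffer size, regardless of whether 
any pool is involved.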
   
   > Pool Pollution (Variable Buffer Sizes): The reader uses a single, shared 
r.bufferPool (sync.Pool) for everything. It simultaneously recycles tiny page 
headers (re-sliced to 256 bytes or 4 KB) and heavy Bloom Filter bitsets 
(expanded to 2 MB) into the exact same pool.
   
   This part is a legitimate concern and should probably be fixed.
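
   One possible direction, sketched here purely as an assumption and not as the 
project's actual or planned fix, is to keep one `sync.Pool` per size class so 
that small header buffers and large bitset buffers never displace each other. 
The type `bucketedPool`, the helper `classFor`, and the class sizes are all 
hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// classSizes are hypothetical size classes chosen only for illustration.
var classSizes = []int{4 << 10, 64 << 10, 2 << 20}

// bucketedPool keeps one sync.Pool per size class.
type bucketedPool struct {
	pools []sync.Pool
}

func newBucketedPool() *bucketedPool {
	return &bucketedPool{pools: make([]sync.Pool, len(classSizes))}
}

// classFor returns the index of the smallest class that fits n bytes,
// or -1 if n exceeds the largest class.
func classFor(n int) int {
	for i, s := range classSizes {
		if n <= s {
			return i
		}
	}
	return -1
}

// Get returns a buffer of length n whose capacity is the class size.
// Recycled buffers may contain stale bytes.
func (p *bucketedPool) Get(n int) []byte {
	c := classFor(n)
	if c < 0 {
		return make([]byte, n) // too large to pool
	}
	if v := p.pools[c].Get(); v != nil {
		return (*v.(*[]byte))[:n]
	}
	return make([]byte, n, classSizes[c])
}

// Put recycles b only if its capacity matches a size class exactly.
func (p *bucketedPool) Put(b []byte) {
	c := classFor(cap(b))
	if c < 0 || cap(b) != classSizes[c] {
		return // not from this pool; leave it to the garbage collector
	}
	b = b[:0]
	p.pools[c].Put(&b)
}

func main() {
	p := newBucketedPool()
	hdr := p.Get(256)       // served from the 4 KiB class
	bloom := p.Get(2 << 20) // served from the 2 MiB class
	fmt.Println(cap(hdr), cap(bloom))
	p.Put(hdr)
	p.Put(bloom)
}
```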


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
