adamreeve commented on PR #8671:
URL: https://github.com/apache/arrow-rs/pull/8671#issuecomment-3439398121

   The nested heap allocations within the T held by the Arc are already double 
counted, and this behaviour is documented here:
   
https://github.com/apache/arrow-rs/blob/0c8ab496ee125839a869a8fa0e4c4866025a2334/parquet/src/file/metadata/mod.rs#L284-L286
   So I think it's at least consistent that the item held directly in the Arc 
should also be counted twice. But yeah, this could possibly be a bit smarter. 
Maybe this could all be refactored to track which items have already been 
accounted for, using pointer equality, so things aren't counted twice? But that 
would be more complicated and require more time and memory to compute the heap size.
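
   To make the pointer-equality idea concrete, here is a rough sketch. It is not code from this PR or from the crate: the `DedupHeapSize` trait and its `heap_size_dedup` method are hypothetical names, and the `Arc` overhead is approximated as two `usize` counters plus `size_of::<T>()`, ignoring any padding. The `seen` set is the extra time and memory cost mentioned above.

```rust
use std::collections::HashSet;
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical variant of a heap-size trait that threads a set of
/// already-visited `Arc` allocation addresses through the traversal.
trait DedupHeapSize {
    fn heap_size_dedup(&self, seen: &mut HashSet<usize>) -> usize;
}

impl DedupHeapSize for String {
    fn heap_size_dedup(&self, _seen: &mut HashSet<usize>) -> usize {
        // A String owns one heap buffer of `capacity` bytes.
        self.capacity()
    }
}

impl<T: DedupHeapSize> DedupHeapSize for Vec<T> {
    fn heap_size_dedup(&self, seen: &mut HashSet<usize>) -> usize {
        // The Vec's own buffer, plus whatever each element owns on the heap.
        self.capacity() * size_of::<T>()
            + self
                .iter()
                .map(|item| item.heap_size_dedup(seen))
                .sum::<usize>()
    }
}

impl<T: DedupHeapSize> DedupHeapSize for Arc<T> {
    fn heap_size_dedup(&self, seen: &mut HashSet<usize>) -> usize {
        // Identify the shared allocation by the address of the pointee.
        let addr = Arc::as_ptr(self) as usize;
        if !seen.insert(addr) {
            // Another Arc pointing at this allocation was already counted.
            return 0;
        }
        // Approximation: strong and weak counters, then T, then T's own heap use.
        2 * size_of::<usize>() + size_of::<T>() + self.as_ref().heap_size_dedup(seen)
    }
}
```

   With this shape, a `Vec` holding two clones of the same `Arc<String>` would count the shared allocation once, at the cost of one hash-set insert per `Arc` visited.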
   
   > And what about Vec<Arc<T>>? Does sizeof for Arc include the pointers and 
ref counts as well?
   
   I think this works correctly. `size_of::<Arc<T>>()` is just the size of 
one pointer to the heap-allocated memory, and the `HeapSize` implementation of 
`Vec<T>` already includes that for each element. The `HeapSize` implementation 
for `Arc<T>` then accounts for the size of `T` plus the ref counts, and delegates 
to the `HeapSize` implementation for `T` to include any heap memory used within `T`.
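
   As an illustration of how those pieces add up for a `Vec<Arc<T>>`, here is a minimal sketch. The trait below is a stand-in written for this example, not the crate's actual definition, and the `Arc` overhead is an approximation (two `usize` ref counts plus `size_of::<T>()`, ignoring padding).

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Stand-in for a `HeapSize`-style trait (assumed shape, for illustration only).
trait HeapSize {
    /// Bytes of heap memory owned by `self`, excluding `size_of::<Self>()`.
    fn heap_size(&self) -> usize;
}

impl HeapSize for u64 {
    fn heap_size(&self) -> usize {
        0 // plain integers own no heap memory
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    fn heap_size(&self) -> usize {
        // The buffer holds `capacity` slots of size_of::<T>() each.
        // For T = Arc<U>, each slot is a single pointer.
        self.capacity() * size_of::<T>()
            + self.iter().map(|item| item.heap_size()).sum::<usize>()
    }
}

impl<T: HeapSize> HeapSize for Arc<T> {
    fn heap_size(&self) -> usize {
        // Approximation of the shared allocation: strong and weak counts,
        // the value itself, then anything the value owns on the heap.
        2 * size_of::<usize>() + size_of::<T>() + self.as_ref().heap_size()
    }
}
```

   Note that this version has the double-counting behaviour discussed above: two clones of the same `Arc` in one `Vec` each contribute the full size of the shared allocation.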

