adamreeve commented on PR #8671: URL: https://github.com/apache/arrow-rs/pull/8671#issuecomment-3439398121
The nested heap allocations within the `T` held by the `Arc` are already double counted, and this behaviour is documented here: https://github.com/apache/arrow-rs/blob/0c8ab496ee125839a869a8fa0e4c4866025a2334/parquet/src/file/metadata/mod.rs#L284-L286

So I think it's at least consistent that the item held directly in the `Arc` should also be counted twice. But yeah, this could possibly be a bit smarter. Maybe this could all be refactored to track which items have been accounted for with pointer equality so things aren't counted twice? But that would be more complicated and require more time and memory to compute the heap size.

> And what about `Vec<Arc<T>>`? Does sizeof for Arc include the pointers and ref counts as well?

I think this works correctly. `size_of::<Arc<T>>()` will just be the size of one pointer to the heap allocated memory and this is included in the `HeapSize` implementation of `Vec<T>`. The `HeapSize` implementation for `Arc<T>` will then account for the size of `T` plus the ref counts, and delegate to the `HeapSize` implementation for `T` to include any heap memory used within `T`.
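To make the accounting described above concrete, here is a minimal sketch of how the pieces could fit together. It is not the code from the PR: the `HeapSize` trait below is a simplified stand-in, the `u64` impl exists only for illustration, and the "two `usize` ref counts followed by `T`" layout of the `Arc` allocation is an assumption.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Simplified stand-in for the crate's `HeapSize` trait (assumed shape).
trait HeapSize {
    /// Bytes of heap memory owned by `self`, excluding `size_of::<Self>()`.
    fn heap_size(&self) -> usize;
}

// Illustration only: a plain integer owns no heap memory.
impl HeapSize for u64 {
    fn heap_size(&self) -> usize {
        0
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    fn heap_size(&self) -> usize {
        // The buffer holds `capacity` slots of `size_of::<T>()` bytes each.
        // For `T = Arc<U>` each slot is a single pointer.
        let buffer = self.capacity() * size_of::<T>();
        // Each element then reports the heap memory it owns itself.
        let nested: usize = self.iter().map(|item| item.heap_size()).sum();
        buffer + nested
    }
}

impl<T: HeapSize> HeapSize for Arc<T> {
    fn heap_size(&self) -> usize {
        // Assumed allocation layout: strong and weak counts, then `T`.
        let ref_counts = 2 * size_of::<usize>();
        // Delegate to `T` for any heap memory used within `T`.
        ref_counts + size_of::<T>() + (**self).heap_size()
    }
}
```

With these impls, a `Vec` holding two clones of the same `Arc` reports the shared allocation twice, which is the double counting discussed above. Avoiding that would need something like a set of already-visited pointers, at the cost of extra time and memory.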
