LiaCastaneda commented on issue #16841: URL: https://github.com/apache/datafusion/issues/16841#issuecomment-3570377340
> I think the challenge of any sort of arrow_pool based solution is that the arrow buffers themselves are shared, so it will be very hard to know when to know the memory is "reclaimed" and not to double count it 🤔

Wouldn't that already be solved by the arrow memory tracking API itself? From a test in `arrow-buffer/src/bytes.rs`:

```
// Reserve memory and assign to buffer. Claim twice.
buffer.claim(&pool);
assert_eq!(pool.used(), 1024);
buffer.claim(&pool);
assert_eq!(pool.used(), 1024);
```

When you call `claim()` twice on the same buffer (or on different `Buffer` instances that share the same `Arc<Bytes>`), it replaces the old reservation:

1. First `claim()` -> pool counter = 1024
2. Second `claim()` -> creates a new reservation: 1024 + 1024 = 2048, then immediately drops the old reservation: 2048 - 1024 = 1024

We keep 1024 only, avoiding the overaccounting.
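To make the replace-on-claim behaviour concrete, here is a minimal toy model of it using only the standard library. It is a sketch of the idea, not arrow's implementation: `ToyPool`, `ToyReservation`, `ToyBytes`, and `demo` are hypothetical names, and the real types in `arrow-buffer` may differ in detail. The assumption it encodes is that the reservation slot lives on the shared allocation, so every handle to the same allocation shares one slot.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Toy pool: just a shared byte counter (hypothetical, not arrow's type).
#[derive(Default)]
struct ToyPool {
    used: Arc<AtomicUsize>,
}

/// Gives its bytes back to the pool counter when dropped.
struct ToyReservation {
    used: Arc<AtomicUsize>,
    size: usize,
}

impl ToyPool {
    fn reserve(&self, size: usize) -> ToyReservation {
        self.used.fetch_add(size, Ordering::SeqCst);
        ToyReservation { used: Arc::clone(&self.used), size }
    }

    fn used(&self) -> usize {
        self.used.load(Ordering::SeqCst)
    }
}

impl Drop for ToyReservation {
    fn drop(&mut self) {
        self.used.fetch_sub(self.size, Ordering::SeqCst);
    }
}

/// Stand-in for the shared allocation: one reservation slot per allocation.
struct ToyBytes {
    len: usize,
    reservation: Mutex<Option<ToyReservation>>,
}

impl ToyBytes {
    fn new(len: usize) -> Self {
        ToyBytes { len, reservation: Mutex::new(None) }
    }

    /// Reserve first, then swap the new reservation into the slot.
    /// Dropping the old one subtracts its bytes, so the net change is zero.
    fn claim(&self, pool: &ToyPool) {
        let new = pool.reserve(self.len);
        let old = self.reservation.lock().unwrap().replace(new);
        drop(old);
    }
}

/// Returns the pool usage after the first claim, the second claim,
/// and after every handle to the allocation has been dropped.
fn demo() -> (usize, usize, usize) {
    let pool = ToyPool::default();
    let a = Arc::new(ToyBytes::new(1024));
    let b = Arc::clone(&a); // second handle to the same allocation

    a.claim(&pool);
    let after_first = pool.used();
    b.claim(&pool); // same slot, so the old reservation is replaced
    let after_second = pool.used();

    drop(a);
    drop(b); // last handle gone: the reservation is released
    (after_first, after_second, pool.used())
}
```

In this model `demo()` returns `(1024, 1024, 0)`: the counter briefly reaches 2048 inside the second `claim()` and settles back at 1024, and the memory only counts as reclaimed once the last handle to the allocation is dropped.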
