asimsedhain commented on PR #9015: URL: https://github.com/apache/arrow-datafusion/pull/9015#issuecomment-1974886217
I believe I found the reason why the CI is failing after adding the `checked_sub` calls. I traced out all the calls issued to memory pool in the test [`aggregate_source_with_yielding_with_spill` ](https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-plan/src/aggregates/mod.rs#L1788-L1793). ``` 1 Register: GroupedHashAggregateStream[0] 2 GroupedHashAggregateStream[0] about to grow. additional: 120 before grow used: 0 after grow used: 120 3 GroupedHashAggregateStream[0] about to shrink. shrink: 120 before shrink used: 120 after shrink used: 0 4 Unregister: GroupedHashAggregateStream[0] 5 Register: GroupedHashAggregateStream[0] 6 Register: GroupedHashAggregateStream[0] 7 GroupedHashAggregateStream[0] about to grow. additional: 208 before grow used: 0 after grow used: 208 8 GroupedHashAggregateStream[0] about to grow. additional: 120 before grow used: 0 after grow used: 328 9 GroupedHashAggregateStream[0] about to shrink. shrink: 48 before shrink used: 328 after shrink used: 280 10 GroupedHashAggregateStream[0] about to shrink. shrink: 40 before shrink used: 280 after shrink used: 240 11 GroupedHashAggregateStream[0] about to shrink. shrink: 120 before shrink used: 240 after shrink used: 120 12 Unregister: GroupedHashAggregateStream[0] 13 GroupedHashAggregateStream[0] about to grow. additional: 348 before grow used: 0 after grow used: 468 14 GroupedHashAggregateStream[0] about to grow. additional: 1248 before grow used: 0 after grow used: 1716 15 GroupedHashAggregateStream[0] about to grow. additional: 120 before grow used: 0 after grow used: 1836 16 GroupedHashAggregateStream[0] about to shrink. shrink: 36 before shrink used: 1836 after shrink used: 1800 17 GroupedHashAggregateStream[0] about to grow. additional: 36 before grow used: 0 after grow used: 1836 18 GroupedHashAggregateStream[0] about to shrink. shrink: 36 before shrink used: 1836 after shrink used: 1800 19 GroupedHashAggregateStream[0] about to shrink. shrink: 12 before shrink used: 1800 after shrink used: 1788 20 GroupedHashAggregateStream[0] about to shrink. shrink: 56 before shrink used: 1788 after shrink used: 1732 21 GroupedHashAggregateStream[0] about to shrink. shrink: 1596 before shrink used: 1732 after shrink used: 136 22 GroupedHashAggregateStream[0] about to shrink. shrink: 136 before shrink used: 136 after shrink used: 0 23 Unregister: GroupedHashAggregateStream[0] ``` On line 5 and 6, two consumers register themselves with the same name ([these 2 aggregation happen here](https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-plan/src/aggregates/mod.rs#L1567-L1588)]. Seems like assuming each consumer to have a unique name is not correct. Another approach I can think of right now is to store a [`Weak`](https://doc.rust-lang.org/std/sync/struct.Weak.html) ref to the memory consumer but I believe that might change the MemoryPool Api. @alamb could you let me know if you have any suggestion on how to proceed forward? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org