asimsedhain commented on PR #9015:
URL: 
https://github.com/apache/arrow-datafusion/pull/9015#issuecomment-1974886217

   I believe I found the reason why the CI is failing after adding the 
`checked_sub` calls.
   I traced out all the calls issued to memory pool in the test 
[`aggregate_source_with_yielding_with_spill` 
](https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-plan/src/aggregates/mod.rs#L1788-L1793).
   
   ```
       1 Register: GroupedHashAggregateStream[0]
       2 GroupedHashAggregateStream[0] about to grow. additional: 120 before 
grow used: 0 after grow used: 120
       3 GroupedHashAggregateStream[0] about to shrink. shrink: 120 before 
shrink used: 120 after shrink used: 0
       4 Unregister: GroupedHashAggregateStream[0]
       5 Register: GroupedHashAggregateStream[0]
       6 Register: GroupedHashAggregateStream[0]
       7 GroupedHashAggregateStream[0] about to grow. additional: 208 before 
grow used: 0 after grow used: 208
       8 GroupedHashAggregateStream[0] about to grow. additional: 120 before 
grow used: 0 after grow used: 328
       9 GroupedHashAggregateStream[0] about to shrink. shrink: 48 before 
shrink used: 328 after shrink used: 280
      10 GroupedHashAggregateStream[0] about to shrink. shrink: 40 before 
shrink used: 280 after shrink used: 240
      11 GroupedHashAggregateStream[0] about to shrink. shrink: 120 before 
shrink used: 240 after shrink used: 120
      12 Unregister: GroupedHashAggregateStream[0]
      13 GroupedHashAggregateStream[0] about to grow. additional: 348 before 
grow used: 0 after grow used: 468
      14 GroupedHashAggregateStream[0] about to grow. additional: 1248 before 
grow used: 0 after grow used: 1716
      15 GroupedHashAggregateStream[0] about to grow. additional: 120 before 
grow used: 0 after grow used: 1836
      16 GroupedHashAggregateStream[0] about to shrink. shrink: 36 before 
shrink used: 1836 after shrink used: 1800
      17 GroupedHashAggregateStream[0] about to grow. additional: 36 before 
grow used: 0 after grow used: 1836
      18 GroupedHashAggregateStream[0] about to shrink. shrink: 36 before 
shrink used: 1836 after shrink used: 1800
      19 GroupedHashAggregateStream[0] about to shrink. shrink: 12 before 
shrink used: 1800 after shrink used: 1788
      20 GroupedHashAggregateStream[0] about to shrink. shrink: 56 before 
shrink used: 1788 after shrink used: 1732
      21 GroupedHashAggregateStream[0] about to shrink. shrink: 1596 before 
shrink used: 1732 after shrink used: 136
      22 GroupedHashAggregateStream[0] about to shrink. shrink: 136 before 
shrink used: 136 after shrink used: 0
      23 Unregister: GroupedHashAggregateStream[0] 
   ```
   
   On line 5 and 6, two consumers register themselves with the same name 
([these 2 aggregation happen 
here](https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-plan/src/aggregates/mod.rs#L1567-L1588)].
   
   Seems like assuming each consumer to have a unique name is not correct.
   Another approach I can think of right now is to store a 
[`Weak`](https://doc.rust-lang.org/std/sync/struct.Weak.html) ref to the memory 
consumer but I believe that might change the MemoryPool Api.
   
   @alamb could you let me know if you have any suggestion on how to proceed 
forward?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to