cetra3 opened a new issue, #20715:
URL: https://github.com/apache/datafusion/issues/20715

   ### Is your feature request related to a problem or challenge?
   
   Right now in DataFusion there are bugs when sorting and spilling with 
multiple partitions.  There is even a comment in the multi-level merge code 
noting that the current approach is not a great solution:
   
   
https://github.com/apache/datafusion/blob/d412ba5d182a7ab44792647de33f7fb2b58efc27/datafusion/physical-plan/src/sorts/multi_level_merge.rs#L361-L362
   
   There is also some tension between *when* we should spill and when we are 
out of memory.  These are essentially two separate limits on the memory cap.
   
   After we fix memory accounting via solutions in 
https://github.com/apache/datafusion/issues/20714, we should be able to adjust 
the memory pool implementation to allow for more memory coordination.
   
   ### Describe the solution you'd like
   
   Implement a new "memory coordinator" infrastructure which has the following 
properties:
   
   * Partition-aware scheduling: if there are multiple partitions across one or 
more active queries, we should ensure that memory is allocated fairly across 
those partitions
   * High and low memory watermarks: we should have two thresholds for our 
memory limits.  The low watermark is where operators should *start* to 
spill; the high watermark is where we should error out because we are at max 
memory and there is no way to continue without exceeding the memory limit
   * Ability to "wait" for memory to be freed: rather than erroring out 
immediately if there is no memory, have the ability to wait until memory is 
available.  This would require some mechanism for blocking until memory is 
released.  However, this is deadlock-prone, so we need to be careful or make 
it opt-in, i.e. the default behaviour is to error out immediately, but 
users of DataFusion can choose to wait for a bit
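
   To make the first two properties concrete, here is a minimal sketch of how 
a low/high watermark and a per-partition fair share could interact.  All names 
(`Coordinator`, `Decision`, `try_reserve`) are hypothetical and are not part of 
DataFusion's existing memory pool API; the "fair share" here is assumed to be 
a simple equal split of the high watermark, which is only one possible policy.

```rust
/// Outcome of a reservation request (hypothetical, not a DataFusion type).
#[derive(Debug, PartialEq)]
enum Decision {
    /// Granted, total usage is at or below the low watermark.
    Granted,
    /// Granted, but total usage is above the low watermark: start spilling.
    GrantedShouldSpill,
    /// Refused: the partition would exceed its share of the high watermark.
    Rejected,
}

/// Tracks usage per partition against two thresholds (illustrative only).
struct Coordinator {
    low_watermark: usize,
    high_watermark: usize,
    used: Vec<usize>, // bytes currently reserved by each partition
}

impl Coordinator {
    fn new(low_watermark: usize, high_watermark: usize, partitions: usize) -> Self {
        assert!(low_watermark <= high_watermark && partitions > 0);
        Self { low_watermark, high_watermark, used: vec![0; partitions] }
    }

    /// Assumption: every partition gets an equal slice of the hard limit.
    fn fair_share(&self) -> usize {
        self.high_watermark / self.used.len()
    }

    fn try_reserve(&mut self, partition: usize, bytes: usize) -> Decision {
        let new_partition_total = self.used[partition] + bytes;
        // Capping each partition at its share also keeps the sum of all
        // partitions at or below the high watermark.
        if new_partition_total > self.fair_share() {
            return Decision::Rejected;
        }
        self.used[partition] = new_partition_total;
        let total: usize = self.used.iter().sum();
        if total > self.low_watermark {
            Decision::GrantedShouldSpill
        } else {
            Decision::Granted
        }
    }

    fn release(&mut self, partition: usize, bytes: usize) {
        self.used[partition] = self.used[partition].saturating_sub(bytes);
    }
}
```

   With a low watermark of 60, a high watermark of 100 and two partitions, a 
partition can hold at most 50 bytes, and any grant that pushes the total above 
60 comes back with a hint to start spilling rather than an error.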
   
   
   I have experimented with all of these properties in this branch: 
https://github.com/apache/datafusion/compare/main...pydantic:datafusion:memory_observations
 It was vibe coded to prove the concept (which it does), but it probably needs 
to be broken out into discrete chunks of work and thought out more thoroughly.
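
   As an illustration of the opt-in "wait" behaviour from the list above, the 
following sketch keeps fail-fast as the default and adds a bounded wait as a 
separate call.  `WaitingPool` and its methods are hypothetical names, and the 
blocking `Condvar` is an assumption made to keep the example standard-library 
only; an implementation inside DataFusion would more likely use async 
primitives.  The timeout bounds how long a caller can be stuck, but it does not 
by itself prevent deadlock-like stalls.

```rust
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

/// A pool with a single hard limit (illustrative only).
struct WaitingPool {
    limit: usize,
    used: Mutex<usize>,
    freed: Condvar, // signalled whenever memory is released
}

impl WaitingPool {
    fn new(limit: usize) -> Self {
        Self { limit, used: Mutex::new(0), freed: Condvar::new() }
    }

    /// Default behaviour: fail immediately if the request does not fit.
    fn try_reserve(&self, bytes: usize) -> bool {
        let mut used = self.used.lock().unwrap();
        if *used + bytes > self.limit {
            return false;
        }
        *used += bytes;
        true
    }

    /// Opt-in behaviour: wait up to `timeout` for memory to be released.
    fn reserve_with_timeout(&self, bytes: usize, timeout: Duration) -> bool {
        let deadline = Instant::now() + timeout;
        let mut used = self.used.lock().unwrap();
        while *used + bytes > self.limit {
            let remaining = deadline.saturating_duration_since(Instant::now());
            if remaining.is_zero() {
                return false; // gave up: behaves like the fail-fast path
            }
            // Releases the lock while waiting and re-acquires it on wake-up.
            let (guard, _) = self.freed.wait_timeout(used, remaining).unwrap();
            used = guard;
        }
        *used += bytes;
        true
    }

    fn release(&self, bytes: usize) {
        let mut used = self.used.lock().unwrap();
        *used = used.saturating_sub(bytes);
        self.freed.notify_all();
    }
}
```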
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

