EmilyMatt opened a new issue, #15126:
URL: https://github.com/apache/datafusion/issues/15126

   ### Is your feature request related to a problem or challenge?
   
   To have a chance at proper memory allocation in memory-constrained 
environments, there has to be a way to differentiate between the different 
consumers. Currently, the default eq function simply compares the name and the 
spillable field, so for 3 consecutive operators of the same type (which run 
on the same partition) it returns true.
   This leaves us able to track memory only at a rudimentary level, 
such as counting the operators in the register() and unregister() functions.
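
To illustrate the problem, here is a minimal sketch. `NamedConsumer` and `same_type_operators` are hypothetical stand-ins, not DataFusion types, and the operator name is only illustrative; the sketch assumes equality looks at nothing but the name and the spillable flag, as described above.

```rust
/// Hypothetical stand-in for a memory consumer (not the DataFusion type).
#[derive(Debug)]
pub struct NamedConsumer {
    pub name: String,
    pub can_spill: bool,
}

/// Equality only looks at the name and the spillable flag.
impl PartialEq for NamedConsumer {
    fn eq(&self, other: &Self) -> bool {
        self.name == other.name && self.can_spill == other.can_spill
    }
}

/// Builds `n` consumers the way `n` consecutive operators of the same type,
/// running on the same partition, would: same name, same flag.
pub fn same_type_operators(n: usize) -> Vec<NamedConsumer> {
    (0..n)
        .map(|_| NamedConsumer {
            name: "ExternalSorter[0]".to_string(), // illustrative name
            can_spill: true,
        })
        .collect()
}
```

Every pair of consumers returned by `same_type_operators(3)` compares equal, so a pool that receives one of them cannot tell which operator is asking.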
   
   And even once that is done, a reservation can call 
new_empty(), pass itself to various merge streams after spilling, and there is 
still no way to differentiate between all the reservations other than 
somehow tracking each one's usage, which feels dubious at best.
   
   ### Describe the solution you'd like
   
   Ideally, an `id: usize` field would be added to both MemoryConsumer and 
MemoryReservation to identify them, along with a corresponding `id()` function 
to retrieve it.
   The MemoryConsumer's id is a global id, and can be generated using:
   ```
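   use std::sync::atomic::{self, AtomicU64};

   // Returns a process-wide unique id; Relaxed is enough for a plain counter.
   fn new_consumer_id() -> u64 {
       static ID: AtomicU64 = AtomicU64::new(0);
       ID.fetch_add(1, atomic::Ordering::Relaxed)
   }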
   ```
   This ensures uniqueness for each created consumer.
   
   As for the memory reservation, its id only needs to be unique within its 
consumer, so I suggest something like a current_reservation field on the 
consumer:
   ```
   current_reservation: Arc::new(AtomicU64::new(0)),
   ```
   
   and a method to get a new reservation id, which is called in the register, 
split, and new_empty functions:
   ```
   pub fn new_reservation_id(&self) -> u64 {
       self.current_reservation.fetch_add(1, atomic::Ordering::Relaxed)
   }
   ```
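
Putting the pieces together, a rough sketch of how the two ids could interact. `Consumer`, `Reservation`, and `key()` are hypothetical simplifications, not the actual DataFusion structs, and the sketch assumes a reservation keeps a handle to its consumer's counter so that `new_empty()` can draw a fresh id.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Process-wide counter for consumer ids, as proposed above.
fn new_consumer_id() -> u64 {
    static ID: AtomicU64 = AtomicU64::new(0);
    ID.fetch_add(1, Ordering::Relaxed)
}

/// Hypothetical stand-in for a memory consumer.
pub struct Consumer {
    name: String,
    id: u64,
    current_reservation: Arc<AtomicU64>,
}

impl Consumer {
    pub fn new(name: impl Into<String>) -> Self {
        Self {
            name: name.into(),
            id: new_consumer_id(),
            current_reservation: Arc::new(AtomicU64::new(0)),
        }
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn id(&self) -> u64 {
        self.id
    }

    /// Next reservation id, unique only within this consumer.
    pub fn new_reservation_id(&self) -> u64 {
        self.current_reservation.fetch_add(1, Ordering::Relaxed)
    }

    /// Creates a reservation that shares this consumer's id counter.
    pub fn register(&self) -> Reservation {
        Reservation {
            consumer_id: self.id,
            id: self.new_reservation_id(),
            counter: Arc::clone(&self.current_reservation),
        }
    }
}

/// Hypothetical stand-in for a memory reservation.
pub struct Reservation {
    consumer_id: u64,
    id: u64,
    counter: Arc<AtomicU64>,
}

impl Reservation {
    /// A new reservation for the same consumer, with its own id.
    pub fn new_empty(&self) -> Reservation {
        Reservation {
            consumer_id: self.consumer_id,
            id: self.counter.fetch_add(1, Ordering::Relaxed),
            counter: Arc::clone(&self.counter),
        }
    }

    /// (consumer id, reservation id): enough for a pool to key its bookkeeping on.
    pub fn key(&self) -> (u64, u64) {
        (self.consumer_id, self.id)
    }
}
```

With this shape, two consumers that share a name still get different `id()` values, and a reservation produced by `new_empty()` shares its parent's consumer id but has a distinct reservation id, so a pool could use `key()` as a map key instead of guessing from sizes.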
   
   ### Describe alternatives you've considered
   
   I've tried creating a memory pool similar to the fairspill one, which counts 
the number of spillable and nonspillable consumers.
   I've also tried maintaining a map of the current reservations: on each 
request I call size() on the requesting reservation and try to match it 
with one I already track, inserting it if it is new.
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

