EmilyMatt opened a new issue, #15126:
URL: https://github.com/apache/datafusion/issues/15126
### Is your feature request related to a problem or challenge?
To have a chance at proper memory allocation in memory-constrained
environments, there has to be a way to differentiate between the different
consumers. Currently the default eq function simply compares the name and the
spillable field, which for 3 consecutive operators of the same type (running
on the same partition) will return true.
This leaves us able to track memory only at a rudimentary level, such as
counting the operators in the register() and unregister() functions.
Even with that in place, a reservation can call new_empty() and pass itself
to various merge streams after spilling, and there is still no way to
differentiate between the reservations other than somehow tracking each
one's usage, which feels dubious at best.
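To make the collision concrete, here is a minimal sketch. `ConsumerKey` is a hypothetical, simplified stand-in rather than DataFusion's actual `MemoryConsumer`, and the name `ExternalSorter[0]` is only illustrative; the sketch assumes equality and hashing cover just the name and the spill flag, as described above.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a memory consumer.
/// Equality and hashing cover only `name` and `can_spill`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ConsumerKey {
    pub name: String,
    pub can_spill: bool,
}

/// Builds `n` consumers the way `n` consecutive operators of the same type
/// on one partition would look: same name, same spill flag.
pub fn same_type_operators(n: usize) -> Vec<ConsumerKey> {
    (0..n)
        .map(|_| ConsumerKey {
            name: "ExternalSorter[0]".to_string(), // illustrative name only
            can_spill: true,
        })
        .collect()
}

/// Counts how many consumers can be told apart using equality alone.
pub fn distinguishable(consumers: &[ConsumerKey]) -> usize {
    consumers.iter().collect::<HashSet<_>>().len()
}
```

Under these assumptions, `distinguishable(&same_type_operators(3))` collapses the three operators into a single entry, so a pool keyed on this equality cannot attribute memory to any one of them.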
### Describe the solution you'd like
Ideally, an `id: usize` field would be added to both MemoryConsumer and
MemoryReservation to identify them, along with a corresponding `id()` function
to retrieve it.
The MemoryConsumer's id is a global id, and can be generated using:
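```
use std::sync::atomic::{self, AtomicU64};

// Returns a process-wide unique id for each new consumer.
fn new_consumer_id() -> u64 {
    static ID: AtomicU64 = AtomicU64::new(0);
    ID.fetch_add(1, atomic::Ordering::Relaxed)
}
```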
This ensures uniqueness for each created consumer.
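The uniqueness claim also holds under concurrency, because `fetch_add` is an atomic read-modify-write even with `Relaxed` ordering. A rough sketch of how one might check that; `collect_ids` and `all_unique` are hypothetical helpers written for this illustration, not part of DataFusion:

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Same generator as proposed above.
fn new_consumer_id() -> u64 {
    static ID: AtomicU64 = AtomicU64::new(0);
    ID.fetch_add(1, Ordering::Relaxed)
}

/// Draws `per_thread` ids on each of `threads` threads and returns all of them.
pub fn collect_ids(threads: usize, per_thread: usize) -> Vec<u64> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                (0..per_thread).map(|_| new_consumer_id()).collect::<Vec<u64>>()
            })
        })
        .collect();
    handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

/// Returns true if no id appears twice.
pub fn all_unique(ids: &[u64]) -> bool {
    let seen: HashSet<u64> = ids.iter().copied().collect();
    seen.len() == ids.len()
}
```

`Relaxed` is enough here because only the counter value matters; no other memory needs to be synchronized through it.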
As for the memory reservation, its id only needs to be unique per consumer,
so I suggest something like a current_reservation field on the
Consumer:
```
current_reservation: Arc::new(AtomicU64::new(0)),
```
and a method to get a new reservation id, which is called in the register,
split, and new_empty functions:
```
// Returns the next reservation id, unique within this consumer.
pub fn new_reservation_id(&self) -> u64 {
    self.current_reservation.fetch_add(1, atomic::Ordering::Relaxed)
}
```
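Putting the pieces together, the proposal might look roughly like the sketch below. `Consumer` and `Reservation` are hypothetical, heavily simplified stand-ins (no pool, no size tracking, no `split`), so this only shows how the two id scopes would interact and is not DataFusion's actual API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

fn new_consumer_id() -> u64 {
    static ID: AtomicU64 = AtomicU64::new(0);
    ID.fetch_add(1, Ordering::Relaxed)
}

/// Hypothetical, simplified consumer carrying a global id and a
/// per-consumer reservation counter.
#[derive(Debug)]
pub struct Consumer {
    name: String,
    id: u64,
    current_reservation: Arc<AtomicU64>,
}

impl Consumer {
    pub fn new(name: impl Into<String>) -> Self {
        Self {
            name: name.into(),
            id: new_consumer_id(),
            current_reservation: Arc::new(AtomicU64::new(0)),
        }
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn id(&self) -> u64 {
        self.id
    }

    /// Returns the next reservation id, unique within this consumer.
    pub fn new_reservation_id(&self) -> u64 {
        self.current_reservation.fetch_add(1, Ordering::Relaxed)
    }

    /// Stand-in for `register`: creates the consumer's first reservation.
    pub fn register(self) -> Reservation {
        let id = self.new_reservation_id();
        Reservation { consumer: Arc::new(self), id }
    }
}

/// Hypothetical, simplified reservation identified by (consumer id, own id).
#[derive(Debug)]
pub struct Reservation {
    consumer: Arc<Consumer>,
    id: u64,
}

impl Reservation {
    pub fn id(&self) -> u64 {
        self.id
    }

    pub fn consumer_id(&self) -> u64 {
        self.consumer.id()
    }

    /// Stand-in for `new_empty`: same consumer, fresh reservation id.
    pub fn new_empty(&self) -> Reservation {
        Reservation {
            consumer: Arc::clone(&self.consumer),
            id: self.consumer.new_reservation_id(),
        }
    }
}
```

With this shape, two consumers that share a name still get different consumer ids, and every reservation derived through `new_empty()` keeps its consumer's id while receiving its own reservation id, so the pair identifies a reservation without tracking its usage.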
### Describe alternatives you've considered
I've tried creating a memory pool similar to FairSpillPool, which counts
the number of spillable and non-spillable consumers.
I've also tried maintaining a map of the current reservations by calling their
size() function on each request and trying to match the requesting reservation
with one I already have, inserting it if it is new.
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]