milenkovicm commented on issue #17334:
URL: https://github.com/apache/datafusion/issues/17334#issuecomment-3393169247

   > Yes, when a non-spillable operator runs out of memory, it’s difficult to 
trigger spilling in another spillable operator to reclaim memory which seems to 
be a limitation currently.
   
   We need collaborative spilling like in Spark, but I don't think the current 
API can support it.
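   
   A minimal, language-agnostic sketch of what collaborative spilling could look like: when a request does not fit, the pool asks *other* spillable consumers to spill before rejecting it. `Consumer`, `CooperativePool`, and `try_grow` here are hypothetical names for illustration only; this is neither DataFusion's current API nor Spark's implementation.

```python
class Consumer:
    """Hypothetical memory consumer; `spillable` says whether it can free memory by spilling."""

    def __init__(self, name, spillable):
        self.name = name
        self.spillable = spillable
        self.reserved = 0  # bytes currently held by this consumer

    def spill(self):
        """Pretend to write buffered state to disk; return the number of bytes freed."""
        freed = self.reserved if self.spillable else 0
        self.reserved -= freed
        return freed


class CooperativePool:
    """Hypothetical pool that reclaims memory from other consumers on demand."""

    def __init__(self, limit):
        self.limit = limit
        self.consumers = []

    def register(self, consumer):
        self.consumers.append(consumer)

    def used(self):
        return sum(c.reserved for c in self.consumers)

    def try_grow(self, requester, nbytes):
        # Candidate victims: every other spillable consumer, largest first.
        victims = sorted(
            (c for c in self.consumers if c is not requester and c.spillable),
            key=lambda c: c.reserved,
            reverse=True,
        )
        for victim in victims:
            if self.used() + nbytes <= self.limit:
                break  # enough room already, stop spilling
            victim.spill()
        if self.used() + nbytes > self.limit:
            return False  # nothing left to reclaim; the request fails
        requester.reserved += nbytes
        return True
```

   In this sketch a non-spillable join that needs memory forces a registered sort to spill instead of failing outright; the request fails only once no spillable consumer has anything left to give back.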
   
   > Have you ever observed any cases where a non-spillable operator showed a 
memory usage spike (for example, due to skewness or similar factors)?
   
   I can't really say; we only track the maximum memory used by non-spillable 
operators.
   
   > I wonder what would be the solution for these frequent failures on 
non-spillable operators - especially when other concurrent operators are 
spillable. If the memory usage of non-spillable operators can be roughly 
estimated before execution, do you think it would make sense to bypass or 
pre-reserve memory for them, instead of continuously growing the shared memory 
reservation along the non-spillable path?
   
   Tuning spillable and unspillable memory is, for me, like one equation with 
two unknowns: very hard to get right, because there is a linear relation 
between the two numbers. So we need to either treat one variable as a constant 
or add an additional equation. Adding an additional pool, so that there is one 
for spillable and another for non-spillable memory, as I believe @2010YOUY01 
mentioned as well, would break the relation between spillable and unspillable 
memory and simplify things a bit: one variable can be approximated by a 
constant, leaving a relation in a single variable.
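   
   A rough sketch of the two-pool idea, under stated assumptions: the spillable pool has a hard limit (a failed grow is the signal to spill), while the unspillable pool is optionally unbounded and only records its peak. `Pool`, `SplitPool`, and the `can_spill` flag are hypothetical names, not an existing DataFusion interface.

```python
class Pool:
    """Hypothetical byte-counting pool; `limit=None` means unbounded."""

    def __init__(self, limit=None):
        self.limit = limit
        self.used = 0
        self.peak = 0  # highest value `used` has reached

    def try_grow(self, nbytes):
        if self.limit is not None and self.used + nbytes > self.limit:
            return False
        self.used += nbytes
        self.peak = max(self.peak, self.used)
        return True

    def shrink(self, nbytes):
        self.used = max(0, self.used - nbytes)


class SplitPool:
    """Routes reservations to separate spillable and unspillable pools."""

    def __init__(self, spillable_limit, unspillable_limit=None):
        # The spillable pool is always bounded so that a failed grow triggers a spill.
        self.spillable = Pool(spillable_limit)
        # The unspillable pool may be left unbounded, e.g. when cgroups enforce the cap.
        self.unspillable = Pool(unspillable_limit)

    def try_grow(self, nbytes, can_spill):
        pool = self.spillable if can_spill else self.unspillable
        return pool.try_grow(nbytes)
```

   With the pools separated, growth on the unspillable side no longer eats into the spillable budget, and the recorded `peak` of the unspillable pool is the kind of number a few test runs could supply if a limit is wanted later.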
   
   The spillable pool is limited in order to trigger spilling; the unspillable 
pool may or may not be. If OS-level memory enforcement is used (cgroups), then 
an unlimited unspillable pool may make sense. If we want to limit unspillable 
memory, a few empirical tests can give us a rough estimate.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

