2010YOUY01 commented on PR #23233:
URL: https://github.com/apache/datafusion/pull/23233#issuecomment-4862144163

   > The only thing I found (with codex) is that 
`PartialReduceHashAggregateStream` is also used regardless of the memory pool 
sizes even though it doesn't implement memory reservations / OOM behavior yet
   > 
   > So I suggest we disable this special stream when there is a memory limit 
-- similarly to Partial hash stream guard:
   
   Thanks for the review. I agree we should not use the new path if there is 
the memory limit.
   
   One thing I noticed is that the original PR added spilling and streaming 
merge support for PartialReduce aggregation, but I don’t see any tests for the 
PartialReduce + memory limit case, or much discussion about the expected OOM 
behavior.
   
   * https://github.com/apache/datafusion/pull/20019
   
   For this use case, I think early emit is probably both more efficient and 
simpler than supporting spilling / streaming merge.
   
   Should we make this path explicitly error on OOM instead? That would keep 
the behavior more constrained for now. Since the existing feature does not seem 
to have test coverage, relying on the current implementation feels a bit risky, 
and we can reopen the discussion later if users depend on this behavior. 🤔
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to