pantShrey commented on PR #21882:
URL: https://github.com/apache/datafusion/pull/21882#issuecomment-4432242479

   > > I am happy to defer to your judgment if you feel the tech debt must be 
addressed first!
   > 
   > How about we try it in parallel?
   
   @alamb Sure, I have already started working on that locally while waiting 
for the response.
   
   Also, I am still stuck on the test 
`repartition::test::test_preserve_order_with_spilling`.
   
   If I keep the memory limit tight, the test panics immediately because the 
StreamingMerge operators and producer channels require a baseline of 
unspillable memory just to initialize their internal buffers. However, if I 
increase the limit to accommodate that initialization overhead and scale up the 
data size, the operator either fills new internal buffers (triggering OOM 
again) or processes everything in RAM without ever triggering a spill. I have 
tried shrinking the initial memory footprint by overriding the `SessionConfig` 
batch size, but the test still runs into the same problem.
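
   To make the dilemma concrete, here is a rough toy model of it. Everything 
in it (`ToyPool`, `Outcome`, `run`, and all the byte counts) is hypothetical 
and is not DataFusion's memory pool API. It only illustrates that a spill can 
happen solely when the limit is at least the unspillable baseline but smaller 
than the baseline plus the buffered data. It also assumes the baseline stays 
fixed, which does not seem to hold in the real test, where the baseline 
appears to grow along with the data.

```rust
// Toy model only: these types are hypothetical, NOT DataFusion's MemoryPool API.

/// A fixed-size memory budget shared by all operators.
struct ToyPool {
    limit: usize,
    used: usize,
}

impl ToyPool {
    fn new(limit: usize) -> Self {
        Self { limit, used: 0 }
    }

    /// Try to reserve `bytes`; fail if that would exceed the limit.
    fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        if self.used + bytes > self.limit {
            return Err(format!(
                "cannot reserve {} bytes: {} of {} already in use",
                bytes, self.used, self.limit
            ));
        }
        self.used += bytes;
        Ok(())
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// The unspillable baseline did not fit (the "immediate panic" case).
    FailedInit,
    /// Every batch fit in memory, so nothing was spilled.
    NoSpill,
    /// Number of batches that had to be written to disk.
    Spilled(usize),
}

/// Reserve an unspillable baseline, then try to buffer each batch in memory.
/// A batch whose reservation fails is counted as spilled instead.
fn run(limit: usize, baseline: usize, batch_bytes: usize, num_batches: usize) -> Outcome {
    let mut pool = ToyPool::new(limit);
    if pool.try_grow(baseline).is_err() {
        return Outcome::FailedInit;
    }
    let mut spills = 0;
    for _ in 0..num_batches {
        if pool.try_grow(batch_bytes).is_err() {
            spills += 1;
        }
    }
    if spills == 0 {
        Outcome::NoSpill
    } else {
        Outcome::Spilled(spills)
    }
}
```

   In this model, `run(100, 150, 100, 5)` fails during initialization, 
`run(1000, 150, 100, 5)` never spills, and only a limit inside the window, 
such as `run(400, 150, 100, 5)`, produces spills. If the baseline itself 
scales with the data or batch size, that window can shrink or disappear, 
which would match what I am seeing.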
   
   I would really appreciate any guidance on this. Am I missing something 
obvious here?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

