pantShrey commented on PR #21882: URL: https://github.com/apache/datafusion/pull/21882#issuecomment-4432242479
> > I am happy to defer to your judgment if you feel the tech debt must be addressed first!
>
> How about we try it in parallel?

@alamb Sure, I have already started working on that locally while waiting for the response.

I am also still stuck on the test `repartition::test::test_preserve_order_with_spilling`. If I keep the memory limit tight, the test panics immediately, because the StreamingMerge operators and producer channels require a baseline of unspillable memory just to initialize their internal buffers. However, if I increase the limit to accommodate that initialization overhead and scale up the data size, the operator either fills new internal buffers (triggering OOM again) or processes everything in RAM without ever triggering a spill.

I have tried shrinking the initial memory footprint by overriding the `SessionConfig` batch size, but it still runs into the same problem.

I would really appreciate any guidance on this. Am I missing something obvious here?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
