adriangb opened a new pull request, #18014:
URL: https://github.com/apache/datafusion/pull/18014

   Addresses 
https://github.com/apache/datafusion/issues/17334#issuecomment-3237651689
   
   I ran into this using `datafusion-distributed`, which I think makes 
partition execution time skew even more likely to happen. As per that issue, 
it can also happen with non-distributed queries, e.g. if one partition's sort 
spills and the others don't. 
   
   Due to the nature of `RepartitionExec` I don't think we can bound the 
channels, since that could lead to deadlocks. So what I did was at least make 
queries that would previously have failed continue forward by spilling to 
disk. I did not account for memory usage when reading batches back from disk, 
since DataFusion generally does not account for "in-flight" batches.
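
   As a rough illustration of the idea only (this is not the code in this PR), 
the sketch below models an unbounded queue that falls back to a simulated disk 
when a memory budget is exhausted, instead of returning an error. Every name in 
it (`MemoryBudget`, `SpillingQueue`, `Queued`) is hypothetical, the "disk" is 
just an in-memory `Vec`, and batches are plain byte vectors rather than Arrow 
record batches.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a memory pool with a fixed byte budget.
struct MemoryBudget {
    limit: usize,
    used: usize,
}

impl MemoryBudget {
    /// Reserve `bytes`, or report that the budget would be exceeded.
    fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        if self.used + bytes > self.limit {
            Err(format!("cannot reserve {bytes} bytes"))
        } else {
            self.used += bytes;
            Ok(())
        }
    }

    /// Release `bytes` back to the budget.
    fn shrink(&mut self, bytes: usize) {
        self.used = self.used.saturating_sub(bytes);
    }
}

/// A queued batch is either held in memory or referenced by a spill slot.
enum Queued {
    InMemory(Vec<u8>),
    Spilled(usize), // index into the simulated disk
}

/// Unbounded FIFO queue: `push` never blocks and never fails.
struct SpillingQueue {
    budget: MemoryBudget,
    queue: VecDeque<Queued>,
    disk: Vec<Vec<u8>>, // simulated spill files (assumption: real code writes to disk)
}

impl SpillingQueue {
    fn new(limit: usize) -> Self {
        Self {
            budget: MemoryBudget { limit, used: 0 },
            queue: VecDeque::new(),
            disk: Vec::new(),
        }
    }

    /// Keep the batch in memory if the budget allows, otherwise spill it.
    fn push(&mut self, batch: Vec<u8>) {
        match self.budget.try_grow(batch.len()) {
            Ok(()) => self.queue.push_back(Queued::InMemory(batch)),
            Err(_) => {
                self.disk.push(batch);
                self.queue.push_back(Queued::Spilled(self.disk.len() - 1));
            }
        }
    }

    /// Pop in FIFO order. Batches read back from "disk" are not charged to
    /// the budget, mirroring the note above about in-flight batches.
    fn pop(&mut self) -> Option<Vec<u8>> {
        match self.queue.pop_front()? {
            Queued::InMemory(batch) => {
                self.budget.shrink(batch.len());
                Some(batch)
            }
            Queued::Spilled(slot) => Some(std::mem::take(&mut self.disk[slot])),
        }
    }

    /// Number of batches currently waiting in spilled form.
    fn spilled_count(&self) -> usize {
        self.queue
            .iter()
            .filter(|q| matches!(q, Queued::Spilled(_)))
            .count()
    }
}
```

   The point of the sketch is that the producer side never waits on the 
consumer, so the deadlock concern with bounded channels does not arise, while 
a slow consumer costs disk space rather than a failed query.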


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

