Dandandan commented on issue #9941: URL: https://github.com/apache/datafusion/issues/9941#issuecomment-2425892039
No worries, I'm just trying to understand the benefit of implementing this. Thinking about it more, I see two (I think smaller) improvements from implementing this compared to what we have now:

* The right side can start executing as soon as data on the left side is available, instead of waiting for *all partitions* to be loaded. This can help if the data is not balanced, and it might utilize resources a bit better, e.g. if one side is IO bound and the other is more CPU bound.
* There is a bit more parallelism available, which will help if the inner plans cannot be parallelized well (i.e. `1 < partitions < target_partitions`).
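To make the first point concrete, here is a toy model in Python. It is not DataFusion code: the function name `probe_start_times` and the load times are made up for illustration, and it assumes each left-side partition finishes loading at a known time. It compares when the right side may start under a "wait for all partitions" barrier versus starting each partition as soon as its own left-side data is ready.

```python
def probe_start_times(build_finish, per_partition):
    """Return the time at which the right side of each partition may start.

    build_finish[i] is the (hypothetical) time at which the left side of
    partition i has finished loading.
    """
    if per_partition:
        # Each partition starts as soon as its own left-side data is ready.
        return list(build_finish)
    # Barrier: every partition waits for the slowest left-side partition.
    barrier = max(build_finish)
    return [barrier] * len(build_finish)


# Unbalanced data: partition 2 takes much longer to load than the others.
build_finish = [2, 3, 10, 4]

waiting = probe_start_times(build_finish, per_partition=False)
eager = probe_start_times(build_finish, per_partition=True)

# Total time the right side spends idle only because of the barrier.
idle_saved = sum(w - e for w, e in zip(waiting, eager))

print("wait for all partitions:", waiting)
print("start per partition:    ", eager)
print("idle time avoided:      ", idle_saved)
```

In this model the gain comes entirely from the imbalance: if every partition finished loading at the same time, both strategies would start the right side at the same moment and nothing would be saved.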
