comphead commented on issue #23194: URL: https://github.com/apache/datafusion/issues/23194#issuecomment-4858283179
Thanks @avantgardnerio, what is the scope of this AQE work? The discussion currently covers both DataFusion and its distributed variants. My feeling is that supporting AQE in the single-machine DataFusion engine and supporting it in the distributed variants are very different amounts of work. For the distributed case, stats synchronization is likely to be more complicated: it would require some coordinator that knows about all hash sizes, allocations, stats, etc. Also, to insert pipeline breakers, the distributed variant would probably need to support some sort of stages, like Spark. Is the goal to come up with a universal trait or approach that would shape AQE for both single-machine and distributed execution?
