findepi commented on issue #12720: URL: https://github.com/apache/datafusion/issues/12720#issuecomment-2668289411
> As a general rule we don't support operations on heterogenous types to avoid the combinatorial explosion of codegen that would result, and the corresponding impact on build times and binary size.

I guess we're in agreement that this pertains only to the lowest-level operations exposed by arrow.
Exploding codegen is not the only way to support runtime-adaptive data representation, but this runtime adaptivity needs to end somewhere. We can decide where it is terminated. If it's terminated inside arrow kernels, we should expect binary code bloat.

> Where/when this coercion takes is a question for DF, but IMO I would expect the physical plan to be in terms of the physical arrow types, with the logical plan potentially in terms of some higher level logical datatype

This is definitely an option. This is what is intended by https://github.com/apache/datafusion/issues/12622 (cc @notfilippo, @tobixdev).
When creating this issue as a separate one, I intended to go further and have adaptivity at _runtime_. Often, data flowing from two different branches of UNION ALL doesn't need to be unified at all. Maybe it is a premature idea, given that we're not done with https://github.com/apache/datafusion/issues/12622 yet.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org