findepi commented on issue #12720:
URL: https://github.com/apache/datafusion/issues/12720#issuecomment-2668289411

   
   > As a general rule we don't support operations on heterogenous types to 
avoid the combinatorial explosion of codegen that would result, and the 
corresponding impact on build times and binary size.
   
   I guess we're in agreement that this pertains only to the lowest-level 
operations exposed by arrow.
   Exploding codegen is not the only way to support runtime-adaptive data 
representation, but this runtime-adaptivity needs to end somewhere. We can 
decide where it is terminated. If it's terminated inside arrow kernels, we 
should expect binary code bloat.
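
As a rough illustration of terminating adaptivity above the kernels rather than inside them, here is a minimal sketch. It uses only hypothetical stand-in types (`Column`, `to_plain`, `eq_kernel`, `eq_columns`); none of these are arrow or DataFusion APIs, and a real implementation would likely avoid the eager copies shown here.

```rust
// Hypothetical stand-ins for two physical encodings of one logical string column.
// These are NOT arrow types; they only illustrate the idea.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    Plain(Vec<String>),
    Dictionary { keys: Vec<usize>, values: Vec<String> },
}

impl Column {
    // Adaptivity is terminated here: every encoding is lowered to one canonical form.
    fn to_plain(&self) -> Vec<String> {
        match self {
            Column::Plain(v) => v.clone(),
            Column::Dictionary { keys, values } => {
                keys.iter().map(|&k| values[k].clone()).collect()
            }
        }
    }
}

// A single monomorphic "kernel": it only knows the canonical form,
// so no per-encoding-pair code is generated for it.
fn eq_kernel(left: &[String], right: &[String]) -> Vec<bool> {
    left.iter().zip(right).map(|(l, r)| l == r).collect()
}

// Operator-level wrapper that accepts any mix of encodings.
fn eq_columns(left: &Column, right: &Column) -> Vec<bool> {
    eq_kernel(&left.to_plain(), &right.to_plain())
}

fn main() {
    let a = Column::Plain(vec!["x".to_string(), "y".to_string()]);
    let b = Column::Dictionary {
        keys: vec![0, 0],
        values: vec!["x".to_string()],
    };
    println!("{:?}", eq_columns(&a, &b));
}
```

With N encodings, this needs N lowering paths and one kernel, whereas handling every pair inside the kernel would need on the order of N×N specializations.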
   
   > Where/when this coercion takes place is a question for DF, but IMO I would 
expect the physical plan to be in terms of the physical arrow types, with the 
logical plan potentially in terms of some higher level logical datatype
   
   This is definitely an option. This is what is intended by 
https://github.com/apache/datafusion/issues/12622 (cc @notfilippo , @tobixdev).
   When creating this issue as a separate one, I intended to go further and 
have adaptivity at _runtime_.
   Often, data flowing from two different branches of UNION ALL doesn't need to 
be unified at all. 
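
To make the UNION ALL point concrete, the following is a small sketch under stated assumptions: `Batch`, `union_all`, and `total_len` are hypothetical names invented for illustration, not DataFusion operators. The union simply concatenates the batch streams of its branches, and a downstream consumer adapts to each batch's encoding as it arrives.

```rust
// Hypothetical batch type with two encodings; not an arrow RecordBatch.
#[derive(Debug, Clone, PartialEq)]
enum Batch {
    Plain(Vec<String>),
    Dictionary { keys: Vec<usize>, values: Vec<String> },
}

// UNION ALL as pure concatenation of the two branches' batch streams.
// No batch is converted to a common encoding.
fn union_all(left: Vec<Batch>, right: Vec<Batch>) -> Vec<Batch> {
    left.into_iter().chain(right).collect()
}

// A downstream consumer that handles each encoding directly, batch by batch.
// It sums the byte lengths of all strings.
fn total_len(batches: &[Batch]) -> usize {
    batches
        .iter()
        .map(|b| match b {
            Batch::Plain(v) => v.iter().map(|s| s.len()).sum::<usize>(),
            Batch::Dictionary { keys, values } => {
                keys.iter().map(|&k| values[k].len()).sum::<usize>()
            }
        })
        .sum()
}

fn main() {
    let left = vec![Batch::Plain(vec!["ab".to_string(), "c".to_string()])];
    let right = vec![Batch::Dictionary {
        keys: vec![0, 0, 1],
        values: vec!["xy".to_string(), "z".to_string()],
    }];
    let unioned = union_all(left, right);
    println!("{} batches, total length {}", unioned.len(), total_len(&unioned));
}
```

The cost of this design is that every consumer must be prepared for any encoding, which is the runtime adaptivity discussed above.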
   
   Maybe it is a premature idea, given that we're not done with 
https://github.com/apache/datafusion/issues/12622 yet.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

