adriangb opened a new issue, #17114: URL: https://github.com/apache/datafusion/issues/17114
While working on https://github.com/apache/datafusion/pull/16589 we realized that there are now two paths of casting / adaptation logic:

1. `SchemaAdapter`, which supports nested structs as of https://github.com/apache/datafusion/pull/16371
2. The `Cast` expr (i.e. `select 1::text` in SQL, or implicit casts), which uses the arrow cast kernel and does _not_ support nested structs and the like

It would be good to unify these. This very point was discussed in https://github.com/apache/arrow-rs/issues/7176, and one idea that came up was for arrow to develop some sort of `SchemaAdapter` of its own.

One important performance consideration here, and maybe something worth a broader discussion, is that an advantage of `SchemaAdapter` is that it can pre-compute the work to be done and then avoid any sort of introspection in the hot path. This is not possible with a `PhysicalExpr`.

Thus I would like to propose the following rough course of action:

1. Unify the code paths. This can be something as naive as dynamically building a `SchemaAdapter` each time a `Cast` `PhysicalExpr` gets called, or it could mean refactoring the code so that it is shared.
2. Think about some sort of `PhysicalExpr::optimize(inputs)` that can, in this case, pre-compute the needed casts and build efficient data structures to apply them in a loop. I think this could benefit a lot of other expressions as well that need to do prep work for each execution.
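To make the "pre-compute, then apply in a loop" idea concrete, here is a rough, self-contained sketch of the plan/apply split. Everything in it is hypothetical: `DataType`, `Value`, `CastPlan`, `plan_cast` and `apply` are toy stand-ins written for illustration, not the DataFusion or arrow-rs API. Filling fields that are missing from the source with nulls is an assumption made only for this example.

```rust
// Toy model only: these types stand in for Arrow data types and values.
// They are NOT the DataFusion or arrow-rs API.
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Utf8,
    Struct(Vec<(String, DataType)>),
}

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Null,
    Int32(i32),
    Int64(i64),
    Utf8(String),
    Struct(Vec<Value>),
}

/// Precomputed description of how to turn a source value into a target value.
#[derive(Debug, PartialEq)]
enum CastPlan {
    Identity,
    Int32ToInt64,
    IntToUtf8,
    /// One entry per target field: the source field index (None means the
    /// field is missing and gets filled with null) and the plan for it.
    Struct(Vec<(Option<usize>, CastPlan)>),
}

/// Plan phase: all type and field-name matching happens here, once.
fn plan_cast(from: &DataType, to: &DataType) -> Result<CastPlan, String> {
    match (from, to) {
        (a, b) if a == b => Ok(CastPlan::Identity),
        (DataType::Int32, DataType::Int64) => Ok(CastPlan::Int32ToInt64),
        (DataType::Int32 | DataType::Int64, DataType::Utf8) => Ok(CastPlan::IntToUtf8),
        (DataType::Struct(src), DataType::Struct(dst)) => {
            let mut fields = Vec::with_capacity(dst.len());
            for (name, dst_type) in dst {
                match src.iter().position(|(n, _)| n == name) {
                    Some(i) => fields.push((Some(i), plan_cast(&src[i].1, dst_type)?)),
                    None => fields.push((None, CastPlan::Identity)),
                }
            }
            Ok(CastPlan::Struct(fields))
        }
        _ => Err(format!("unsupported cast: {from:?} -> {to:?}")),
    }
}

/// Execution phase: follows the plan, with no schema lookups or name comparisons.
fn apply(plan: &CastPlan, value: &Value) -> Value {
    match (plan, value) {
        (_, Value::Null) => Value::Null,
        (CastPlan::Identity, v) => v.clone(),
        (CastPlan::Int32ToInt64, Value::Int32(i)) => Value::Int64(i64::from(*i)),
        (CastPlan::IntToUtf8, Value::Int32(i)) => Value::Utf8(i.to_string()),
        (CastPlan::IntToUtf8, Value::Int64(i)) => Value::Utf8(i.to_string()),
        (CastPlan::Struct(fields), Value::Struct(children)) => Value::Struct(
            fields
                .iter()
                .map(|(src, p)| match src {
                    Some(i) => apply(p, &children[*i]),
                    None => Value::Null,
                })
                .collect(),
        ),
        (p, v) => panic!("plan {p:?} does not match value {v:?}"),
    }
}
```

The naive unification in step 1 would correspond to calling `plan_cast` on every evaluation and applying the result immediately. Something like the proposed `PhysicalExpr::optimize(inputs)` would correspond to calling `plan_cast` once and reusing the resulting plan for every batch.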