wiedld commented on code in PR #12142: URL: https://github.com/apache/datafusion/pull/12142#discussion_r1730641937
########## datafusion/expr/src/logical_plan/builder.rs: ########## @@ -1331,7 +1331,17 @@ pub fn validate_unique_names<'a>( }) } -/// Union two logical plans. +/// Union two [`LogicalPlan`]s. +/// +/// Constructs the UNION plan, but does not perform type-coercion. Therefore the +/// subtree expressions will not be properly typed until the optimizer pass. Review Comment: I think maybe this depends on how users construct their own logical plans? And if they are already ensuring type correctness in their own code? Our logical plan construction relied upon the [previous behavior](https://github.com/influxdata/arrow-datafusion/commit/afa23abc46059e100061295b619b7b66fbc39625) of the `union()` api, which was immediate type coercion. IMO - this was not an explicit api contract, rather an implementation detail. Regardless, we use the typed-coerced union schema & expressions when adding other logical nodes (e.g. limit, sort, group by). Once the behavior of the api changed, we started building bad plans -- which produced incorrect results and in one case errored. I filed this ticket as a doc enhancement, not a bug, since I didn't think(?) type coercion was part of the api contract. My hope was to (a) make it clear when/how unions can be type coerced, and (b) make public the APIs to do so. In other words, make it easier for users to ensure type correctness without the optimizer pass. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org