findepi commented on PR #13117: URL: https://github.com/apache/datafusion/pull/13117#issuecomment-2439412500
This is good check to have. 👍 Obviously having same column count doesn't guarantee compatibility. The types need to be coercible, so normally the column count check should go together with the type check. Reading rationale for https://github.com/apache/datafusion/pull/11961 it seems that we have two+ phases: 1. "loose" Logical Plans where types may not match - this is apparently the phase what plan builder operates at, because it directly supports dataframe frontend, I believe - then we do "type coercion optimizer" (which is not really an optimizer) 2. "strict" Logical Plans where types are expected to match The consequence of this is that - it's unclear what guarantees are provided at which phase, because phases do not stand out in the code - we can still "type coercion optimizer" and other coercion related functions (like expr.get_type) on strict Logical Plans which cannot benefit from them I would want to see these addressed, that's the idea of https://github.com/apache/datafusion/issues/12723. Maybe we should have a form of validator of logical plan well-formedness (union branches match, the types match _exactly_) and the check would go there. The downside of such approach (and of current state as well) is that, when you have code taking Logical Plan UNION, you can't really assume much (things may or may not have been checked), so the code needs to be very defensively written. Wonder whether we can leverage Rust's compiler type system to help us with this. cc @alamb @ozankabak @jonahgao -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
