alamb commented on issue #14408: URL: https://github.com/apache/datafusion/issues/14408#issuecomment-2639747760
> Not sure if this is the right file to be looking at, but it's where the error comes from (`Physical input schema should be the same as the one converted from logical input schema`). There are only a handful of PRs that have touched `/datafusion/core/src/physical_planner.rs` since `44`:
>
> 1. [chore: deprecate `ValuesExec` in favour of `MemoryExec` #14032](https://github.com/apache/datafusion/pull/14032)
> 2. [Chore: refactor DataSink traits to avoid duplication #14121](https://github.com/apache/datafusion/pull/14121)
> 3. [NestedLoopJoin Projection Pushdown #14120](https://github.com/apache/datafusion/pull/14120)
> 4. [feat: Use `SchemaRef` in `JoinFilter` #14182](https://github.com/apache/datafusion/pull/14182)
> 5. [Interface for physical plan invariant checking. #13986](https://github.com/apache/datafusion/pull/13986)

It seems like the issue may be from 5, which now adds additional checks to catch errors earlier rather than later (for example, the error will now happen during planning rather than when RecordBatches are created during execution).

So the question in my mind is whether the error is due to:

1. Some code in Sail (and we just happen now to catch the error earlier)
2. Some code in DataFusion

> I think this should be allowed, especially since we run further optimizations on the physical plans.

@findepi do you mean we should relax the check to ignore nullable / non-nullable annotations? I think that would probably be ok too.

@wiedld perhaps you have some comments here.

I think in general the split between what a physical optimizer is and is not allowed to do (the "invariants") is not very clear. https://github.com/apache/datafusion/pull/13986 took a first step at formalizing them, but maybe we have found other corner cases where it is not reasonable.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org