alamb commented on issue #14408:
URL: https://github.com/apache/datafusion/issues/14408#issuecomment-2639747760

   > Not sure if this is the right file to be looking at, but it's where the 
error comes from (`Physical input schema should be the same as the one 
converted from logical input schema`). There are only a handful of PRs that 
have touched `/datafusion/core/src/physical_planner.rs` since `44`:
   > 
   > 1. [chore: deprecate `ValuesExec` in favour of `MemoryExec` 
#14032](https://github.com/apache/datafusion/pull/14032)
   > 2. [Chore: refactor DataSink traits to avoid duplication 
#14121](https://github.com/apache/datafusion/pull/14121)
   > 3. [NestedLoopJoin Projection Pushdown 
#14120](https://github.com/apache/datafusion/pull/14120)
   > 4. [feat: Use `SchemaRef` in `JoinFilter` 
#14182](https://github.com/apache/datafusion/pull/14182)
   > 5. [Interface for physical plan invariant checking. 
#13986](https://github.com/apache/datafusion/pull/13986)
   
   It seems like the issue may be from 5, which now adds additional checks to 
catch errors earlier rather than later (for example, the error now happens 
during planning rather than when `RecordBatch`es are created during execution).
   
   So the question then in my mind is if the error is due to:
   1. Some code in Sail (and we just happen now to catch the error earlier)
   2. Some code in DataFusion
   
   > I think this should be allowed, especially since we run further 
optimizations on the physical plans.
   
   @findepi do you mean we should relax the check to ignore nullable / 
non-nullable annotations? I think that would probably be ok too.
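
To make the "relax the check" option concrete, here is a minimal sketch of the difference between a strict schema comparison and one that ignores nullability. It does not use DataFusion's or Arrow's actual types: `Field`, `schemas_equal_strict`, and `schemas_equal_ignoring_nullability` are hypothetical stand-ins written with the standard library only, and the real check in `physical_planner.rs` may be structured differently.

```rust
/// Hypothetical, simplified stand-in for a schema field.
/// This is NOT the real Arrow `Field` type; it only carries what the sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    nullable: bool,
}

impl Field {
    fn new(name: &str, data_type: &str, nullable: bool) -> Self {
        Field {
            name: name.to_string(),
            data_type: data_type.to_string(),
            nullable,
        }
    }
}

/// Strict check: every field must match exactly, including nullability.
fn schemas_equal_strict(a: &[Field], b: &[Field]) -> bool {
    a == b
}

/// Relaxed check (assumed behavior, for illustration): field count, names,
/// and data types must match, but the `nullable` flag is ignored.
fn schemas_equal_ignoring_nullability(a: &[Field], b: &[Field]) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b)
            .all(|(x, y)| x.name == y.name && x.data_type == y.data_type)
}
```

With a logical field `id: Int64 (nullable)` and a physical field `id: Int64 (non-nullable)`, the strict comparison rejects the pair while the relaxed one accepts it; a genuine type mismatch is still rejected by both.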
   
   @wiedld  perhaps you have some comments here
   
   I think in general the split between what a physical optimizer is allowed to 
do and what it is not (the "invariants") is not very clear. 
https://github.com/apache/datafusion/pull/13986 took a first step at 
formalizing them, but maybe we have found other corner cases where it is not 
reasonable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

