pepijnve commented on issue #17801: URL: https://github.com/apache/datafusion/issues/17801#issuecomment-3497492805
To summarise for any newcomer, the root cause of this issue is that

1. `IFNULL(x, y)` gets simplified to `CASE WHEN x IS NOT NULL THEN x ELSE y END`
2. `IFNULL(x, y)` reports itself as `nullable? false`
3. `CASE WHEN x IS NOT NULL THEN x ELSE y END` reports itself as `nullable? true`

After step 1, the logical schema of the query still has `nullable? false` for the column. When translating from logical to physical, the physical expression also reports `nullable? true` and the planner errors out due to the mismatch between logical and physical nullability.

There are a couple of ways to fix this:

1. Assume that scalar UDF simplification is not allowed to change the schema in any way. The implication is that the `CASE` expression in this example must also return `nullable? false`. Adapt `is_nullable` for `CASE` to make this work.
2. Allow scalar UDF simplification to change the logical schema. Adjust the code where the simplification is happening so that the schema does not get out of sync with the actual expressions.

Option 1 feels like the more correct approach, but is rather tricky to implement, particularly for the logical expression. This requires constant evaluation of the 'when' expressions of the case expression from code that's located in the `expr` crate. Const evaluation uses physical expression evaluation, but that's not accessible from `expr`. The alternative is to emulate evaluation as best as possible. An attempt to implement this can be found in https://github.com/apache/datafusion/pull/17813.

Option 2 is the easy quick fix, but doesn't feel correct. You wouldn't want an expression to change from not-nullable to nullable as part of optimisation. It might also invalidate assumptions that earlier optimisation passes made regarding nullability.
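To make the mismatch and the idea behind option 1 concrete, here is a minimal, self-contained sketch. It is a toy model only: the `Expr` enum and the `simplify`, `naive_nullable` and `refined_nullable` functions are hypothetical names made up for illustration, not DataFusion's actual types or API, and the `IFNULL` nullability rule used here is an assumption of the model. The refined rule pattern-matches `WHEN e IS NOT NULL THEN e` instead of doing real constant evaluation, so it only approximates what the linked PR attempts.

```rust
// Toy model only: hypothetical types, not DataFusion's `Expr` API.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    /// A column reference with its declared nullability.
    Column { name: String, nullable: bool },
    /// `expr IS NOT NULL`
    IsNotNull(Box<Expr>),
    /// `IFNULL(value, fallback)`
    IfNull(Box<Expr>, Box<Expr>),
    /// `CASE WHEN .. THEN .. ELSE .. END`
    Case { when_then: Vec<(Expr, Expr)>, else_expr: Box<Expr> },
}

fn col(name: &str, nullable: bool) -> Expr {
    Expr::Column { name: name.to_string(), nullable }
}

/// Rewrites `IFNULL(x, y)` to `CASE WHEN x IS NOT NULL THEN x ELSE y END`.
fn simplify(expr: &Expr) -> Expr {
    match expr {
        Expr::IfNull(x, y) => Expr::Case {
            when_then: vec![(Expr::IsNotNull(x.clone()), (**x).clone())],
            else_expr: y.clone(),
        },
        other => other.clone(),
    }
}

/// Naive rule: a CASE is nullable as soon as any branch is nullable.
fn naive_nullable(expr: &Expr) -> bool {
    match expr {
        Expr::Column { nullable, .. } => *nullable,
        Expr::IsNotNull(_) => false,
        // Model assumption: IFNULL yields NULL only if both arguments can be NULL.
        Expr::IfNull(x, y) => naive_nullable(x) && naive_nullable(y),
        Expr::Case { when_then, else_expr } => {
            when_then.iter().any(|(_, then)| naive_nullable(then))
                || naive_nullable(else_expr)
        }
    }
}

/// Refined rule: a THEN branch guarded by `WHEN e IS NOT NULL THEN e`
/// cannot produce NULL, so it is ignored when computing nullability.
fn refined_nullable(expr: &Expr) -> bool {
    match expr {
        Expr::Case { when_then, else_expr } => {
            let any_then_nullable = when_then.iter().any(|(when, then)| {
                let guarded =
                    matches!(when, Expr::IsNotNull(inner) if **inner == *then);
                !guarded && refined_nullable(then)
            });
            any_then_nullable || refined_nullable(else_expr)
        }
        Expr::IfNull(x, y) => refined_nullable(x) && refined_nullable(y),
        other => naive_nullable(other),
    }
}
```

With a nullable `x` and a non-nullable `y`, the naive rule says "not nullable" for `IFNULL(x, y)` but "nullable" for its simplified `CASE` form, which is the mismatch described above. The refined rule keeps both forms in agreement for this pattern.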
