tgujar opened a new issue, #10337:
URL: https://github.com/apache/datafusion/issues/10337
### Describe the bug
Currently Datafusion will inline all CTE, a non-deterministic expression can
be executed multiple times producing different results
### To Reproduce
Consider the following query which uses the `aggregate_test_100` data from
datafusion-examples. Here, column c11 is a Float64
```
WITH cte as (
SELECT sum(c4 * c11) as total
FROM aggregate_test_100
GROUP BY c1)
SELECT total
FROM cte
WHERE total = (select max(total) from cte)
```
The optimized plan generated will inline the CTE and thus execute it twice
```
Projection: cte.total
Inner Join: cte.total = __scalar_sq_1.MAX(cte.total)
SubqueryAlias: cte
Projection: SUM(aggregate_test_100.c4 * aggregate_test_100.c11) AS
total
Aggregate: groupBy=[[aggregate_test_100.c1]],
aggr=[[SUM(CAST(aggregate_test_100.c4 AS Float64) * aggregate_test_100.c11)]]
TableScan: aggregate_test_100 projection=[c1, c4, c11]
SubqueryAlias: __scalar_sq_1
Aggregate: groupBy=[[]], aggr=[[MAX(cte.total)]]
SubqueryAlias: cte
Projection: SUM(aggregate_test_100.c4 * aggregate_test_100.c11) AS
total
Aggregate: groupBy=[[aggregate_test_100.c1]],
aggr=[[SUM(CAST(aggregate_test_100.c4 AS Float64) * aggregate_test_100.c11)]]
TableScan: aggregate_test_100 projection=[c1, c4, c11]
```
### Expected behavior
Since summation here is dependent on ordering, I believe it is incorrect to
inline the CTE here and execute it more than once.
### Additional context
Related issue, which talks about possible advantages on not inlining CTE in
some cases: https://github.com/apache/datafusion/issues/8777
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]