suibianwanwank opened a new pull request, #4375: URL: https://github.com/apache/calcite/pull/4375
This PR combines Calcite's existing implementation with the paper [Improving Unnesting of Complex Queries](https://15799.courses.cs.cmu.edu/spring2025/papers/11-unnesting/neumann-btw2025.pdf) to fix the classic count bug that arises when decorrelating subqueries. For more information, see [CALCITE-7010].

Pseudo-code given in the paper:

```
fun unnest(groupby, info, accessing):
    static = groupby.groups is empty or groups.groupingsets contains ∅
    unnest(groupby.input, info, accessing)
    rewriteColumns(groupby, info)
    for c in info.outerRefs:
        add info.repr[c] to groupby.groups
    if static:
        replace groupby with info.D ⟕ groupby, joining on mapped info.outerRefs
```

This fix changes many existing test plans, and I have not yet updated all of them or evaluated the full impact. However, based on the sub-query.iq tests, the results are very promising. I'll keep this PR as a draft until all tests are fixed.

Any insight on this would be appreciated!
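
To make the count bug concrete, here is a minimal Python sketch. It is not Calcite code, and the toy tables and function names (`depts`, `emps`, `correlated`, `naive_decorrelated`, `fixed_decorrelated`) are hypothetical. It models a query like `SELECT d, (SELECT COUNT(*) FROM emps WHERE emps.dept = d) FROM depts`. The subquery is a scalar aggregate with no group keys, which corresponds to the `static` case in the pseudo-code. The sketch assumes the fix amounts to an outer join of the domain of outer references with the grouped result.

```python
from collections import Counter

# Hypothetical toy data: department ids and (employee, department id) pairs.
depts = [10, 20, 30]
emps = [("alice", 10), ("bob", 10), ("carol", 20)]


def correlated(depts, emps):
    """Reference semantics: evaluate the COUNT(*) subquery once per outer row."""
    return [(d, sum(1 for _, e_dept in emps if e_dept == d)) for d in depts]


def naive_decorrelated(depts, emps):
    """Group the inner side by the correlation key, then inner-join.

    Departments without employees produce no group, so their rows vanish
    instead of reporting a count of 0. This is the count bug.
    """
    counts = Counter(e_dept for _, e_dept in emps)
    return [(d, counts[d]) for d in depts if d in counts]


def fixed_decorrelated(depts, emps):
    """Outer-join the domain D of outer references with the grouped result.

    Keys in D without a matching group get COUNT's empty-input value, 0.
    """
    domain = set(depts)  # D: distinct values of the outer reference
    counts = Counter(e_dept for _, e_dept in emps)
    per_key = {d: counts.get(d, 0) for d in domain}  # D left-join groupby
    return [(d, per_key[d]) for d in depts]
```

Here `naive_decorrelated(depts, emps)` drops department 30, while `fixed_decorrelated(depts, emps)` returns the same rows as `correlated(depts, emps)`, including `(30, 0)`.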
