Lukas-Grasmann opened a new pull request, #36378: URL: https://github.com/apache/spark/pull/36378
### What changes were proposed in this pull request?

Provide a way to resolve aggregates in `Sort` nodes if the query also contains `HAVING` by:

* Preventing premature `Project` nodes before sorting
* Allowing aggregates in `Sort` nodes to be resolved even if there is a `Filter` node (introduced by `HAVING`) between the `Sort` and the `Aggregate`

### Why are the changes needed?

Aggregates in sorting/ordering nodes of the plan should resolve correctly even if the query contains `HAVING`.

### Does this PR introduce _any_ user-facing change?

Queries that contain aggregates, `HAVING`, and sorting/ordering should now resolve correctly and work as expected. Examples (see SPARK-39022):

```
SELECT hotel FROM test GROUP BY hotel HAVING sum(price) > 150 ORDER BY sum(price)
SELECT hotel, sum(price) FROM test GROUP BY hotel HAVING sum(price) > 150 ORDER BY sum(price)
```

### How was this patch tested?

Manual testing of the examples provided in SPARK-39022. Additional, similar unit tests were added to `AnalysisSuite`.

Run the unit tests:

```
$ build/sbt "catalyst/testOnly org.apache.spark.sql.catalyst.analysis.AnalysisSuite -- -z SPARK-39022"
```

Existing tests that were modified as a result of this change (see SPARK-39022):

```
$ build/sbt "sql/testOnly *TPCDSV2_7_PlanStabilitySuite*"
$ build/sbt "sql/testOnly *TPCDSV2_7_PlanStabilityWithStatsSuite*"
```
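
As a rough reference for what the two example queries are expected to return once they resolve, the sketch below runs them against Python's built-in `sqlite3` module rather than Spark. The rows in `ROWS`, the column types, and the helper `run` are hypothetical and chosen only for illustration; they are not the data from SPARK-39022, and SQLite's behavior is used here only as a stand-in for the standard SQL semantics of `HAVING` combined with `ORDER BY` on an aggregate.

```python
import sqlite3

# Hypothetical sample rows (not the data from SPARK-39022).
# Per-hotel sums: a = 300, b = 130, c = 220.
ROWS = [
    ("a", 200.0),
    ("a", 100.0),
    ("b", 90.0),
    ("b", 40.0),
    ("c", 220.0),
]

# The two example queries from the PR description, unchanged apart from layout.
Q1 = """
SELECT hotel FROM test
GROUP BY hotel HAVING sum(price) > 150
ORDER BY sum(price)
"""

Q2 = """
SELECT hotel, sum(price) FROM test
GROUP BY hotel HAVING sum(price) > 150
ORDER BY sum(price)
"""


def run(query):
    """Run a query against a fresh in-memory `test` table and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE test (hotel TEXT, price REAL)")
        conn.executemany("INSERT INTO test VALUES (?, ?)", ROWS)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hotel b is dropped by HAVING; the rest are ordered by their summed price.
    print(run(Q1))  # [('c',), ('a',)]
    print(run(Q2))  # [('c', 220.0), ('a', 300.0)]
```

Note that the first query orders by an aggregate that does not appear in the select list, which is the case that needs the aggregate to be resolved through the `Filter` node introduced by `HAVING`.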
