[
https://issues.apache.org/jira/browse/SPARK-57353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-57353:
-----------------------------------
Labels: pull-request-available (was: )
> [Analyzer++] GROUPING SETS/CUBE/ROLLUP with HAVING or ORDER BY crashes with
> SparkUnsupportedOperationException
> --------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-57353
> URL: https://issues.apache.org/jira/browse/SPARK-57353
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Anupam Yadav
> Priority: Major
> Labels: pull-request-available
>
> With `spark.sql.analyzer.singlePassResolver.enabled=true`, queries that combine
> GROUP BY CUBE/ROLLUP/GROUPING SETS with a HAVING or ORDER BY clause containing
> an aggregate function crash with:
> {noformat}
> org.apache.spark.SparkUnsupportedOperationException:
> [UNSUPPORTED_CALL.WITHOUT_SUGGESTION]
> Cannot call the method "dataType$" of the class
> "org.apache.spark.sql.catalyst.expressions.BaseGroupingSets".
> SQLSTATE: 0A000
> {noformat}
> The single-pass resolver path invokes `assertValidAggregation`, which calls
> `checkValidGroupingExprs` on sort/filter expressions. That function accesses
> `.dataType` on `BaseGroupingSets` expressions (Cube/Rollup/GroupingSets), but
> these expressions throw from their `dataType` method because they are meant
> to be expanded before type resolution.
> The legacy analyzer (the default) handles all three queries correctly.
> *Repro:*
> {code:sql}
> -- All three variants crash with singlePassResolver enabled:
> -- Variant 1: CUBE + ORDER BY
> SELECT a, b, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
> GROUP BY CUBE(a, b) ORDER BY SUM(b);
> -- Variant 2: ROLLUP + HAVING
> SELECT a, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
> GROUP BY ROLLUP(a, b) HAVING SUM(b) > 25;
> -- Variant 3: GROUPING SETS + ORDER BY
> SELECT a, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
> GROUP BY GROUPING SETS ((a, b), (a), ()) ORDER BY SUM(b);
> {code}
> *Root cause:* `ExprUtils.checkValidGroupingExprs` (ExprUtils.scala:211) calls
> `.dataType` on `BaseGroupingSets` expressions before they have been expanded
> in the single-pass resolver path.
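
The failure mode described above can be modeled outside Spark. The sketch below is a toy in Python, not Spark code: the class and function names only mirror the identifiers in the report, the map-type check is a hypothetical stand-in for whatever `checkValidGroupingExprs` really validates, and `expand` is an assumed helper that is not necessarily how the linked pull request fixes the issue. It shows only that a type check run on an unexpanded grouping-set placeholder throws, while the same check on the expanded columns passes.

```python
# Toy model of the failure mode. These classes only mimic the Spark names
# mentioned in the report; they are NOT Spark's real implementation.

class UnsupportedCall(Exception):
    """Stands in for SparkUnsupportedOperationException."""


class Attribute:
    """A resolved column reference with a concrete data type."""
    def __init__(self, name, data_type):
        self.name = name
        self.data_type = data_type


class BaseGroupingSets:
    """Placeholder for CUBE/ROLLUP/GROUPING SETS before expansion."""
    def __init__(self, grouping_sets):
        self.grouping_sets = grouping_sets  # list of lists of Attribute

    @property
    def data_type(self):
        # A placeholder has no type of its own; it must be expanded first.
        raise UnsupportedCall('Cannot call "dataType" on BaseGroupingSets')


def check_valid_grouping_exprs(exprs):
    """Hypothetical validity check: reads data_type on every expression."""
    for e in exprs:
        if e.data_type == "map":
            raise ValueError(f"cannot group by map-typed expression {e.name}")


def expand(exprs):
    """Assumed helper: replace placeholders by the distinct columns they cover."""
    out, seen = [], set()
    for e in exprs:
        if isinstance(e, BaseGroupingSets):
            members = [a for s in e.grouping_sets for a in s]
        else:
            members = [e]
        for a in members:
            if a.name not in seen:
                seen.add(a.name)
                out.append(a)
    return out


a, b = Attribute("a", "int"), Attribute("b", "int")
cube = BaseGroupingSets([[a, b], [a], [b], []])  # grouping sets of CUBE(a, b)

# Checking the unexpanded placeholder reproduces the crash in miniature.
try:
    check_valid_grouping_exprs([cube])
    crashed = False
except UnsupportedCall:
    crashed = True

# Checking after expansion only ever touches typed columns.
expanded_names = [e.name for e in expand([cube])]
check_valid_grouping_exprs(expand([cube]))
```

In this model the order of operations is the whole bug: the check itself is fine, it is just applied one step too early.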