xuanyuanking opened a new pull request #28397:
URL: https://github.com/apache/spark/pull/28397
### What changes were proposed in this pull request?
Add a new logical node, AggregateWithHaving, which the parser creates for
HAVING clauses. The analyzer resolves it to Filter(..., Aggregate(...)).
### Why are the changes needed?
The SQL parser in Spark creates Filter(..., Aggregate(...)) for a HAVING
query, and Spark has a special analyzer rule, ResolveAggregateFunctions, to
resolve the aggregate functions and grouping columns in the Filter operator.
This works for simple cases, but only in a fragile way, because it relies on
rule execution order:
1. Rule ResolveReferences hits the Aggregate operator and resolves
attributes inside aggregate functions, but the function itself is still
unresolved as it's an UnresolvedFunction. This prevents the Filter operator
from being resolved, because its child Aggregate operator is still unresolved.
2. Rule ResolveFunctions resolves UnresolvedFunction. This makes the Aggregate
operator resolved.
3. Rule ResolveAggregateFunctions resolves the Filter operator if its child
is a resolved Aggregate. This rule can correctly resolve the grouping columns.
The example query below contains a CAST, which must be resolved by rule
ResolveTimeZone, and that rule runs after ResolveAggregateFunctions. This
breaks step 3, because the Aggregate operator is still unresolved at that
time. The analyzer then starts the next round, in which the Filter operator is
resolved by ResolveReferences, which wrongly resolves the grouping columns.
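The ordering problem described above can be mimicked with a toy model. This is only a sketch in plain Python, not Spark's analyzer: the function `analyze`, the `pending` set, and the string labels are hypothetical names invented for illustration. The only things taken from the description are the rule names, their relative order, and the conditions under which each rule fires.

```python
def analyze(has_cast):
    """Toy fixed-point loop; returns how the HAVING Filter got resolved."""
    # Work still pending on the Aggregate node. In this model the node
    # counts as resolved once the set is empty.
    pending = {"attributes", "function"}
    if has_cast:
        pending.add("timezone")  # CAST needs ResolveTimeZone
    having = None  # not resolved yet

    def resolve_references():
        nonlocal having
        pending.discard("attributes")
        # Resolves the Filter against the Aggregate's output once the
        # child is resolved, so `b` binds to the output alias.
        if not pending and having is None:
            having = "output alias"

    def resolve_functions():
        pending.discard("function")

    def resolve_aggregate_functions():
        nonlocal having
        # Only fires when the child Aggregate is already resolved.
        if not pending and having is None:
            having = "grouping column"

    def resolve_time_zone():
        pending.discard("timezone")

    rules = [resolve_references, resolve_functions,
             resolve_aggregate_functions, resolve_time_zone]

    for _ in range(10):  # bounded number of analyzer rounds
        for rule in rules:
            rule()
        if not pending and having is not None:
            break
    return having


print(analyze(has_cast=False))  # grouping column
print(analyze(has_cast=True))   # output alias
```

Without the CAST, the Aggregate is fully resolved by the time ResolveAggregateFunctions runs in the first round. With the CAST, that rule is skipped in the first round, and ResolveReferences gets to the Filter first in the second round.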
See the demo below:
```
SELECT SUM(a) AS b, '2020-01-01' AS fake FROM VALUES (1, 10), (2, 20) AS
T(a, b) GROUP BY b HAVING b > 10
```
The query's result is
```
+---+----------+
|  b|      fake|
+---+----------+
|  2|2020-01-01|
+---+----------+
```
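But if we add a CAST, the query returns an empty result, apparently because b
in HAVING is then bound to the alias SUM(a) AS b (values 1 and 2) rather than
to the grouping column T.b.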
```
SELECT SUM(a) AS b, CAST('2020-01-01' AS DATE) AS fake FROM VALUES (1, 10),
(2, 20) AS T(a, b) GROUP BY b HAVING b > 10
```
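To make the difference between the two results concrete, here is a small sketch in plain Python (no Spark involved) that evaluates the demo query under both readings of `b` in the HAVING clause. The function `run_query` and its flag are hypothetical names used only for this illustration, and the assumption that the wrong resolution binds `b` to the output alias is inferred from the empty result rather than stated explicitly in the description.

```python
from collections import defaultdict

# VALUES (1, 10), (2, 20) AS T(a, b)
rows = [(1, 10), (2, 20)]


def run_query(having_uses_grouping_column):
    """Evaluate SELECT SUM(a) AS b ... GROUP BY b HAVING b > 10."""
    # GROUP BY T.b, computing SUM(a) per group.
    sums = defaultdict(int)
    for a, b in rows:
        sums[b] += a

    result = []
    for group_b, sum_a in sums.items():
        # The two possible meanings of `b` in the HAVING clause:
        # the grouping column T.b, or the alias SUM(a) AS b.
        having_b = group_b if having_uses_grouping_column else sum_a
        if having_b > 10:
            result.append(sum_a)  # the selected column is SUM(a) AS b
    return result


print(run_query(True))   # [2]  matches the result without CAST
print(run_query(False))  # []   matches the empty result with CAST
```

The first call reproduces the single row with b = 2, and the second reproduces the empty result, since neither sum (1 or 2) exceeds 10.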
### Does this PR introduce any user-facing change?
Yes. This fixes a bug where a HAVING query returns wrong results when its
aggregate expressions contain a CAST.
### How was this patch tested?
Added new unit tests.