Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/22696#discussion_r224457317
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala
---
@@ -108,7 +108,7 @@ class PlanParserSuite extends AnalysisTest {
assertEqual("select a, b from db.c where x < 1", table("db", "c").where('x < 1).select('a, 'b))
assertEqual(
"select a, b from db.c having x < 1",
- table("db", "c").select('a, 'b).where('x < 1))
+ table("db", "c").groupBy()('a, 'b).where('x < 1))
--- End diff --
Is this query legal? Can we run such a query in a test?
I read the articles
[here](https://blog.jooq.org/2014/12/04/do-you-really-understand-sqls-group-by-and-having-clauses/)
and
[here](https://stackoverflow.com/questions/5496786/having-clause-in-postgresql/5496829#5496829).
One point caught my attention. Below is the Postgres documentation on `HAVING`
without `GROUP BY`:
> The presence of HAVING turns a query into a grouped query even if there
is no GROUP BY clause. This is the same as what happens when the query contains
aggregate functions but no GROUP BY clause. All the selected rows are
considered to form a single group, and **the SELECT list and HAVING clause can
only reference table columns from within aggregate functions**. Such a query
will emit a single row if the HAVING condition is true, zero rows if it is not
true.
Please see the bold text. It seems to me that in this query we can't use
`x < 1` as the condition in `HAVING`, because `x` is not inside an aggregate
function. The same applies to `a` and `b` in the `SELECT` list.
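
To make the quoted rule concrete, here is a minimal sketch of the check it implies for a `HAVING` query without `GROUP BY`. It uses a toy expression model: `Expr`, `Col`, `Lit`, `Agg`, `Lt` and `HavingRule` are hypothetical names made up for illustration, not Catalyst classes, and this is not how Spark's analyzer implements the check.

```scala
// Toy expression model (hypothetical, NOT Catalyst's expression classes).
sealed trait Expr
case class Col(name: String) extends Expr           // bare column reference
case class Lit(value: Int) extends Expr             // integer literal
case class Agg(fn: String, arg: Expr) extends Expr  // aggregate call, e.g. max(a)
case class Lt(left: Expr, right: Expr) extends Expr // left < right

object HavingRule {
  // Columns referenced outside of any aggregate function.
  def bareColumns(e: Expr): Set[String] = e match {
    case Col(n)    => Set(n)
    case Lit(_)    => Set.empty
    case Agg(_, _) => Set.empty // anything under an aggregate is allowed
    case Lt(l, r)  => bareColumns(l) ++ bareColumns(r)
  }

  // For HAVING without GROUP BY, the quoted rule says the SELECT list and
  // the HAVING condition may reference columns only inside aggregates.
  // Returns the offending column names; an empty set means no violation.
  def violations(selectList: Seq[Expr], having: Expr): Set[String] =
    (selectList :+ having).flatMap(e => bareColumns(e)).toSet
}
```

Under this reading, `select a, b from db.c having x < 1` maps to `HavingRule.violations(Seq(Col("a"), Col("b")), Lt(Col("x"), Lit(1)))`, which reports `a`, `b` and `x`, whereas something like `select max(a) from db.c having min(x) < 1` reports nothing.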
---