Yuming Wang created SPARK-28386: ----------------------------------- Summary: Cannot resolve ORDER BY columns with GROUP BY and HAVING Key: SPARK-28386 URL: https://issues.apache.org/jira/browse/SPARK-28386 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.0.0 Reporter: Yuming Wang
How to reproduce: {code:sql} CREATE TABLE test_having (a int, b int, c string, d string) USING parquet; INSERT INTO test_having VALUES (0, 1, 'XXXX', 'A'); INSERT INTO test_having VALUES (1, 2, 'AAAA', 'b'); INSERT INTO test_having VALUES (2, 2, 'AAAA', 'c'); INSERT INTO test_having VALUES (3, 3, 'BBBB', 'D'); INSERT INTO test_having VALUES (4, 3, 'BBBB', 'e'); INSERT INTO test_having VALUES (5, 3, 'bbbb', 'F'); INSERT INTO test_having VALUES (6, 4, 'cccc', 'g'); INSERT INTO test_having VALUES (7, 4, 'cccc', 'h'); INSERT INTO test_having VALUES (8, 4, 'CCCC', 'I'); INSERT INTO test_having VALUES (9, 4, 'CCCC', 'j'); SELECT lower(c), count(c) FROM test_having GROUP BY lower(c) HAVING count(*) > 2 ORDER BY lower(c); {code} {noformat} spark-sql> SELECT lower(c), count(c) FROM test_having > GROUP BY lower(c) HAVING count(*) > 2 > ORDER BY lower(c); Error in query: cannot resolve '`c`' given input columns: [lower(c), count(c)]; line 3 pos 19; 'Sort ['lower('c) ASC NULLS FIRST], true +- Project [lower(c)#158, count(c)#159L] +- Filter (count(1)#161L > cast(2 as bigint)) +- Aggregate [lower(c#7)], [lower(c#7) AS lower(c)#158, count(c#7) AS count(c)#159L, count(1) AS count(1)#161L] +- SubqueryAlias test_having +- Relation[a#5,b#6,c#7,d#8] parquet {noformat} But it works when setting an alias: {noformat} spark-sql> SELECT lower(c) withAias, count(c) FROM test_having > GROUP BY lower(c) HAVING count(*) > 2 > ORDER BY withAias; bbbb 3 cccc 4 {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org