jiaan.geng created SPARK-30182:
----------------------------------

             Summary: Support nested aggregates
                 Key: SPARK-30182
                 URL: https://issues.apache.org/jira/browse/SPARK-30182
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: jiaan.geng
Spark SQL does not support a SQL statement with a nested aggregate, as below:

{code:java}
SELECT
  sum(salary),
  row_number() OVER (ORDER BY depname),
  sum(sum(salary) FILTER (WHERE enroll_date > '2007-01-01'))
    FILTER (WHERE depname <> 'sales')
    OVER (ORDER BY depname DESC) AS "filtered_sum",
  depname
FROM empsalary
GROUP BY depname;
{code}

Spark throws an exception as follows:

{code:java}
org.apache.spark.sql.AnalysisException
It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.
{code}

PostgreSQL, however, supports this syntax:

{code:java}
SELECT
  sum(salary),
  row_number() OVER (ORDER BY depname),
  sum(sum(salary) FILTER (WHERE enroll_date > '2007-01-01'))
    FILTER (WHERE depname <> 'sales')
    OVER (ORDER BY depname DESC) AS "filtered_sum",
  depname
FROM empsalary
GROUP BY depname;

  sum  | row_number | filtered_sum |  depname
-------+------------+--------------+-----------
 25100 |          1 |        22600 | develop
  7400 |          2 |         3500 | personnel
 14600 |          3 |              | sales
(3 rows)
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
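Until nested aggregates are supported, the rewrite hinted at by the exception message (move the inner aggregate into a sub-query, then apply the window function over the pre-aggregated rows) produces the same result. Below is a minimal sketch using sqlite3 with hypothetical sample data; CASE WHEN stands in for the FILTER clause for portability, and the table contents are invented for illustration, not taken from the issue:

```python
import sqlite3

# Hypothetical sample data; the real empsalary contents are not in the issue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE empsalary (depname TEXT, salary INTEGER, enroll_date TEXT);
INSERT INTO empsalary VALUES
  ('develop',   5000, '2006-10-01'),
  ('develop',   6000, '2007-08-01'),
  ('sales',     4800, '2007-08-08'),
  ('personnel', 3500, '2007-12-10');
""")

rows = conn.execute("""
SELECT sum_salary,
       -- window aggregate over the pre-aggregated (GROUP BY) rows
       SUM(CASE WHEN depname <> 'sales' THEN filtered_sum END)
         OVER (ORDER BY depname DESC) AS filtered_sum,
       depname
FROM (
  -- inner aggregate moved into a sub-query, per the exception's hint
  SELECT depname,
         SUM(salary) AS sum_salary,
         SUM(CASE WHEN enroll_date > '2007-01-01' THEN salary END) AS filtered_sum
  FROM empsalary
  GROUP BY depname
) t
ORDER BY depname
""").fetchall()

for r in rows:
    print(r)
```

Because the sub-query collapses empsalary to one row per depname first, the outer SUM(...) OVER (...) no longer nests one aggregate inside another, which is exactly the restructuring Spark's analyzer asks for.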