dilipbiswal opened a new pull request #24209: [SPARK-27255] Aggregate functions should not be allowed in WHERE URL: https://github.com/apache/spark/pull/24209 ## What changes were proposed in this pull request? In the PR, we raise an AnalysisError when we detect the presense of aggregate expressions in where clause. Here is the problem description from the JIRA. Aggregate functions should not be allowed in WHERE clause. But Spark SQL throws an exception when generating codes. It is supposed to throw an exception during parsing or analyzing. Here is an example: ``` val df = spark.sql("select * from t where sum(ta) > 0") df.explain(true) df.show() ``` Resulting exception: ``` Exception in thread "main" java.lang.UnsupportedOperationException: Cannot generate code for expression: sum(cast(input[0, int, false] as bigint)) at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode(Expression.scala:291) at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode$(Expression.scala:290) at org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression.doGenCode(interfaces.scala:87) at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:138) at scala.Option.getOrElse(Option.scala:138) ``` Checked the behaviour of other database and all of them return an exception: **Postgress** ``` select * from foo where max(c1) > 0; Error ERROR: aggregate functions are not allowed in WHERE Position: 25 ``` **DB2** ``` db2 => select * from foo where max(c1) > 0; SQL0120N Invalid use of an aggregate function or OLAP function. ``` **Oracle** ``` select * from foo where max(c1) > 0; ORA-00934: group function is not allowed here ``` **MySql** ``` select * from foo where max(c1) > 0; Invalid use of group function ``` ## How was this patch tested? Added tests in AnalysisErrorSuite and group-by.sql
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
