Linhong Liu created SPARK-32816: ----------------------------------- Summary: Planner error when aggregating multiple distinct DECIMAL columns Key: SPARK-32816 URL: https://issues.apache.org/jira/browse/SPARK-32816 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.0.0 Reporter: Linhong Liu
Running different DISTINCT decimal aggregations causes a query planner error: {code:java} java.lang.RuntimeException: You hit a query analyzer bug. Please report your query to Spark user mailing list. at scala.sys.package$.error(package.scala:30) at org.apache.spark.sql.execution.SparkStrategies$Aggregation$.apply(SparkStrategies.scala:473) at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:67) at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489) at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:97) at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:74) at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:82) at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:162) at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:162) at scala.collection.Iterator.foreach(Iterator.scala:941) at scala.collection.Iterator.foreach$(Iterator.scala:941) {code} example failing query {code:java} import org.apache.spark.util.Utils // Changing decimal(9, 0) to decimal(8, 0) fixes the problem. Root cause seems to have to do with // UnscaledValue being used in one of the expressions but not the other. val df = spark.range(0, 50000, 1, 1).selectExpr( "id", "cast(id as decimal(9, 0)) as ss_ext_list_price") val cacheDir = Utils.createTempDir().getCanonicalPath df.write.parquet(cacheDir) spark.read.parquet(cacheDir).createOrReplaceTempView("test_table") spark.sql(""" select avg(distinct ss_ext_list_price), sum(distinct ss_ext_list_price) from test_table""").explain {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org