Bruce Robbins created SPARK-38528:
-------------------------------------

             Summary: NullPointerException when selecting a generator in a Stream of aggregate expressions
                 Key: SPARK-38528
                 URL: https://issues.apache.org/jira/browse/SPARK-38528
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.1, 3.1.3, 3.3.0
            Reporter: Bruce Robbins

Assume this dataframe:
{noformat}
val df = Seq(1, 2, 3).toDF("v")
{noformat}
This works:
{noformat}
df.select(Seq(explode(array(min($"v"), max($"v"))), sum($"v")): _*).collect
{noformat}
However, this doesn't:
{noformat}
df.select(Stream(explode(array(min($"v"), max($"v"))), sum($"v")): _*).collect
{noformat}
It throws this error:
{noformat}
java.lang.NullPointerException
  at org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates$.$anonfun$containsAggregates$1(Analyzer.scala:2516)
  at scala.collection.immutable.List.flatMap(List.scala:366)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates$.containsAggregates(Analyzer.scala:2515)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates$$anonfun$apply$31.applyOrElse(Analyzer.scala:2509)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates$$anonfun$apply$31.applyOrElse(Analyzer.scala:2508)
{noformat}
The only difference between the two queries is that the first one uses {{Seq}} to specify the varargs, whereas the second one uses {{Stream}}.
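
For background on how {{Seq}} and {{Stream}} differ, here is a minimal sketch in plain Scala (no Spark). It is not a confirmed root-cause analysis of the stack trace above. It only shows that {{map}} over a {{Stream}} evaluates the head eagerly and defers the remaining elements, even when the value is typed as {{Seq}}. The object {{StreamLazinessSketch}} and the helper {{visitedCounts}} are hypothetical names introduced for illustration, and the sketch assumes Scala 2.12 (on 2.13, {{Stream}} still compiles but is deprecated).

```scala
object StreamLazinessSketch {
  // Hypothetical helper: maps over `xs` and returns how many elements the
  // mapping function had visited (a) right after `map` returned and
  // (b) after the result was forced into a strict List.
  def visitedCounts(xs: Seq[String]): (Int, Int) = {
    var visited = 0
    val mapped = xs.map { s => visited += 1; s.toUpperCase }
    val afterMap = visited
    val forced = mapped.toList // evaluates any elements still deferred
    (afterMap, visited + (forced.length - forced.length))
  }

  def main(args: Array[String]): Unit = {
    // Seq(...) builds a strict List: every element is mapped immediately.
    println(visitedCounts(Seq("explode", "sum", "max")))
    // Stream(...) maps only the head until something forces the tail.
    println(visitedCounts(Stream("explode", "sum", "max")))
  }
}
```

If laziness is what distinguishes the two queries, forcing the {{Stream}} before passing it as varargs (for example {{Stream(...).toList: _*}}) hands {{select}} a strict {{List}}, which is the same kind of collection the working {{Seq(...)}} query uses.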