Sandor Murakozi created SPARK-22804:
---------------------------------------
Summary: Using a window function inside of an aggregation causes StackOverflowError
Key: SPARK-22804
URL: https://issues.apache.org/jira/browse/SPARK-22804
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.2.0
Reporter: Sandor Murakozi

Nesting a window function inside an aggregate function crashes the analyzer:

{code:scala}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, min}  // in scope by default in spark-shell

val df = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")
df.select(min(avg('value).over(Window.partitionBy('key)))).show
{code}

produces

{code}
java.lang.StackOverflowError
  at org.apache.spark.sql.catalyst.trees.TreeNode.find(TreeNode.scala:106)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$find$1$$anonfun$apply$1.apply(TreeNode.scala:109)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$find$1$$anonfun$apply$1.apply(TreeNode.scala:109)
  at scala.Option.orElse(Option.scala:289)
  ...
  at org.apache.spark.sql.catalyst.trees.TreeNode.find(TreeNode.scala:109)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions$.org$apache$spark$sql$catalyst$analysis$Analyzer$ExtractWindowExpressions$$hasWindowFunction(Analyzer.scala:1853)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions$$anonfun$65.apply(Analyzer.scala:1877)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions$$anonfun$65.apply(Analyzer.scala:1877)
  at scala.collection.TraversableLike$$anonfun$partition$1.apply(TraversableLike.scala:314)
  at scala.collection.TraversableLike$$anonfun$partition$1.apply(TraversableLike.scala:314)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
  at scala.collection.TraversableLike$class.partition(TraversableLike.scala:314)
  at scala.collection.AbstractTraversable.partition(Traversable.scala:104)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions$.org$apache$spark$sql$catalyst$analysis$Analyzer$ExtractWindowExpressions$$extract(Analyzer.scala:1877)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions$$anonfun$apply$27.applyOrElse(Analyzer.scala:2060)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions$$anonfun$apply$27.applyOrElse(Analyzer.scala:2021)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
  ...
{code}

Using an empty window specification:

{code:scala}
df.select(min(avg('value).over())).show
{code}

produces a similar error.

By contrast, nesting the two aggregates without a window is rejected with a proper AnalysisException, and applying {{over()}} outside the nested aggregates runs successfully:

{code:scala}
df.select(min(avg('value))).show

org.apache.spark.sql.AnalysisException: It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.;;
...

df.select(min(avg('value)).over()).show

+---------------------------------------+
|min(avg(value)) OVER (UnspecifiedFrame)|
+---------------------------------------+
|                                    2.0|
+---------------------------------------+
{code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
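A possible workaround sketch, not part of the original report and assuming the intent is the minimum of the per-key window averages: materialize the window aggregate in its own projection first, then apply the outer aggregate, so the analyzer never sees a window function nested inside an aggregate. The column name {{key_avg}} is illustrative:

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, min}

val spark = SparkSession.builder.master("local[*]").appName("spark-22804-workaround").getOrCreate()
import spark.implicits._

val df = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")

// Step 1: compute the window average into a named column.
val withAvg = df.select('key, avg('value).over(Window.partitionBy('key)).as("key_avg"))
// Step 2: aggregate over the materialized column.
withAvg.agg(min('key_avg)).show  // min of per-key averages: min(1.5, 3.0) = 1.5
{code}

Splitting the query this way is equivalent to the sub-query rewrite the AnalysisException above suggests for plain nested aggregates.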