[ 
https://issues.apache.org/jira/browse/SPARK-9950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Armbrust updated SPARK-9950:
------------------------------------
    Assignee: Wenchen Fan

> Wrong Analysis Error for grouping/aggregating on struct fields
> --------------------------------------------------------------
>
>                 Key: SPARK-9950
>                 URL: https://issues.apache.org/jira/browse/SPARK-9950
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Michael Armbrust
>            Assignee: Wenchen Fan
>            Priority: Blocker
>
> Spark 1.4:
> {code}
> import org.apache.spark.sql.functions._
> val df = Seq(("x", (1,1)), ("y", (2, 2))).toDF("a", "b")
> df.groupBy("b._1").agg(sum("b._2")).collect()
> df: org.apache.spark.sql.DataFrame = [a: string, b: struct<_1:int,_2:int>]
> res0: Array[org.apache.spark.sql.Row] = Array([1,1], [2,2])
> {code}
> Spark 1.5
> {code}
> org.apache.spark.sql.AnalysisException: expression 'b' is neither present in 
> the group by, nor is it an aggregate function. Add to group by or wrap in 
> first() if you don't care which value you get.;
>       at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37)
>       at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
>       at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:110)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to