[jira] [Resolved] (SPARK-4564) SchemaRDD.groupBy(groupingExprs)(aggregateExprs) doesn't return the groupingExprs as part of the output schema

Michael Armbrust (JIRA) Fri, 19 Dec 2014 13:05:38 -0800

     [ https://issues.apache.org/jira/browse/SPARK-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Michael Armbrust resolved SPARK-4564.
-------------------------------------
    Resolution: Won't Fix

I'm going to close this as Won't Fix unless there is a major objection.  Happy 
to accept PRs that clarify the documentation, though :)

> SchemaRDD.groupBy(groupingExprs)(aggregateExprs) doesn't return the 
> groupingExprs as part of the output schema
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4564
>                 URL: https://issues.apache.org/jira/browse/SPARK-4564
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.1.0
>         Environment: Mac OSX, local mode, but should hold true for all 
> environments
>            Reporter: Dean Wampler
>
> In the following example, I would expect the "grouped" schema to contain two 
> fields, the String name and the Long count, but it only contains the Long 
> count.
> {code}
> // Assumes val sc = new SparkContext(...), e.g., in Spark Shell
> import org.apache.spark.sql.{SQLContext, SchemaRDD}
> import org.apache.spark.sql.catalyst.expressions._
> val sqlc = new SQLContext(sc)
> import sqlc._
> case class Record(name: String, n: Int)
> val records = List(
>   Record("three",   1),
>   Record("three",   2),
>   Record("two",     3),
>   Record("three",   4),
>   Record("two",     5))
> val recs = sc.parallelize(records)
> recs.registerTempTable("records")
> val grouped = recs.select('name, 'n).groupBy('name)(Count('n) as 'count)
> grouped.printSchema
> // root
> //  |-- count: long (nullable = false)
> grouped foreach println
> // [2]
> // [3]
> {code}
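
The output above is consistent with groupBy projecting only the expressions
passed in its second (aggregate) argument list. A minimal sketch of a
workaround, assuming the Spark 1.1.x SchemaRDD DSL and a spark-shell `sc`, is to
repeat the grouping expression among the aggregate expressions. The name
`withKey` is introduced here purely for illustration, and the snippet is a
sketch under those assumptions rather than a confirmed recommendation from the
Spark maintainers.

```scala
// Assumes val sc = new SparkContext(...), e.g., in Spark Shell (Spark 1.1.x).
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.catalyst.expressions._

val sqlc = new SQLContext(sc)
import sqlc._  // brings in the RDD-to-SchemaRDD conversion and the Symbol DSL

case class Record(name: String, n: Int)
val recs = sc.parallelize(List(
  Record("three", 1),
  Record("three", 2),
  Record("two",   3),
  Record("three", 4),
  Record("two",   5)))

// Repeat 'name in the second argument list so that the grouping column
// is projected into the result alongside the count.
// `withKey` is a hypothetical name, not part of the original report.
val withKey = recs.select('name, 'n).groupBy('name)('name, Count('n) as 'count)

withKey.printSchema                 // should now list both name and count
withKey.collect().foreach(println)  // one row per distinct name
```

With this form the first argument list only decides how rows are grouped, and
the second decides which columns appear in the result, much like the GROUP BY
clause and the SELECT list in SQL.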


