[ 
https://issues.apache.org/jira/browse/SPARK-16474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela closed SPARK-16474.
-----------------------------
    Resolution: Not A Problem

It seems the right way to run a global aggregation with an Aggregator on a 
Dataset/DataFrame is through select(), or by providing an implicit 
conversion, rather than by calling agg() directly.
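
For reference, a minimal sketch of the select() route, assuming Spark 2.0.0. 
The names SumInts and GlobalAggViaSelect are made up for illustration, and 
Encoders.scalaInt stands in for the implicit encoder lookup used in the 
report. It is not necessarily the only way to do this, and the implicit 
conversion alternative is not shown.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical named aggregator; same logic as the anonymous one in the report.
object SumInts extends Aggregator[Int, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Int): Int = b + a
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(reduction: Int): Int = reduction
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}

val session = SparkSession.builder()
  .appName("GlobalAggViaSelect")
  .master("local[*]")
  .getOrCreate()
import session.implicits._

val ds1 = List(1, 2, 3).toDS

// select() accepts the TypedColumn produced by toColumn and keeps the
// result typed, so no grouping key is needed for a global aggregation.
val total: Dataset[Int] = ds1.select(SumInts.toColumn)
total.printSchema()

// Materialize the single aggregated row on the driver.
val result: Array[Int] = total.collect()
result.foreach(println)

session.stop()
```

Because select() takes a TypedColumn, the Aggregator receives the Dataset's 
elements as Int values rather than as generic rows, which is what the cast 
in the reported stack trace trips over.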

> Global Aggregation doesn't seem to work at all 
> -----------------------------------------------
>
>                 Key: SPARK-16474
>                 URL: https://issues.apache.org/jira/browse/SPARK-16474
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.6.2, 2.0.0
>            Reporter: Amit Sela
>
> Executing a global aggregation (not grouped by key) fails.
> Take the following code for example:
> {code}
>     // Imports needed by the snippet
>     import org.apache.spark.sql.{Encoder, SparkSession}
>     import org.apache.spark.sql.expressions.Aggregator
>
>     val session = SparkSession.builder()
>                               .appName("TestGlobalAggregator")
>                               .master("local[*]")
>                               .getOrCreate()
>     import session.implicits._
>     val ds1 = List(1, 2, 3).toDS
>     // Global (ungrouped) aggregation via agg(); this is the reported failing case
>     val ds2 = ds1.agg(
>       new Aggregator[Int, Int, Int] {
>         def zero: Int = 0
>         def reduce(b: Int, a: Int): Int = b + a
>         def merge(b1: Int, b2: Int): Int = b1 + b2
>         def finish(reduction: Int): Int = reduction
>         def bufferEncoder: Encoder[Int] = implicitly[Encoder[Int]]
>         def outputEncoder: Encoder[Int] = implicitly[Encoder[Int]]
>       }.toColumn)
>     ds2.printSchema
>     ds2.show
> {code}
> I would expect the result to be 6, but instead I get the following exception:
> {noformat}
> java.lang.ClassCastException: 
> org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast 
> to java.lang.Integer 
> at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
> .........
> {noformat}
> Trying the same code on DataFrames in 1.6.2 results in:
> {noformat}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: unresolved 
> operator 'Aggregate [(anon$1(),mode=Complete,isDistinct=false) AS 
> anon$1()#8]; 
> at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
> ..........
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
