Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/22944#discussion_r232573269
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
@@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
df.where($"city".contains(new java.lang.Character('A'))),
Seq(Row("Amsterdam")))
}
+
+ test("SPARK-25942: typed aggregation on primitive type") {
+ val ds = Seq(1, 2, 3).toDS()
+
+ val agg = ds.groupByKey(_ >= 2)
+ .agg(sum("value").as[Long], sum($"value" + 1).as[Long])
--- End diff --
Btw, the definitions of the `agg` functions in `KeyValueGroupedDataset` look
like:
```scala
def agg[U1, U2](col1: TypedColumn[V, U1], col2: TypedColumn[V, U2]): Dataset[(K, U1, U2)]
```
So the inputs to `agg` are `TypedColumn`s with `V` as the input type.
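
To make the types concrete, here is a rough sketch of how the call in the test lines up with that signature. It assumes a local `SparkSession` and a Spark build in which this aggregation resolves correctly; the `TypedAggSketch` object and its `aggregate` helper are hypothetical names used only for illustration and are not part of this PR.

```scala
import org.apache.spark.sql.{Dataset, SparkSession, TypedColumn}
import org.apache.spark.sql.functions.sum

// Hypothetical wrapper object, for illustration only.
object TypedAggSketch {

  def aggregate(spark: SparkSession): Dataset[(Boolean, Long, Long)] = {
    import spark.implicits._

    // Here V = Int; a Dataset of a primitive type has a single column named "value".
    val ds: Dataset[Int] = Seq(1, 2, 3).toDS()

    // Column.as[U] returns TypedColumn[Any, U]. TypedColumn is contravariant in
    // its input type, so these are accepted where TypedColumn[Int, Long] is expected.
    val total: TypedColumn[Any, Long] = sum("value").as[Long]
    val totalPlusOne: TypedColumn[Any, Long] = sum($"value" + 1).as[Long]

    // K = Boolean, U1 = U2 = Long, so the result is Dataset[(K, U1, U2)].
    ds.groupByKey(_ >= 2).agg(total, totalPlusOne)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is acceptable for this sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("typed-agg-sketch")
      .getOrCreate()
    try aggregate(spark).show() finally spark.stop()
  }
}
```

Note that the compiler only checks that each column's input type is compatible with `V`; whether `"value"` resolves against the right plan is decided later, at analysis time.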