Hi All, I am working on using Datasets in Spark 1.6.1 and eventually 2.0 when it's released.
I am running the aggregate code below on a Dataset whose rows have a uid field:

ds.groupBy(_.uid).count()
// res0: org.apache.spark.sql.Dataset[(String, Long)] = [_1: string, _2: bigint]

This works as expected; however, attempting to run a select on the result fails:

ds.groupBy(_.uid).count().select(_._1)
// error: missing parameter type for expanded function ((x$2) => x$2._1)
//        ds.groupBy(_.uid).count().select(_._1)

I have tried several variants, but nothing seems to work. Below is the equivalent DataFrame code, which works as expected:

df.groupBy("uid").count().select("uid")

Thanks!

--
Pedro Rodriguez
PhD Student in Distributed Machine Learning | CU Boulder
UC Berkeley AMPLab Alumni
ski.rodrig...@gmail.com | pedrorodriguez.io | 909-353-4423
Github: github.com/EntilZha | LinkedIn: https://www.linkedin.com/in/pedrorodriguezscience
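[Editor's note: below is a minimal, self-contained sketch of what a typed-API workaround might look like under Spark 1.6.1. The Event case class, its value field, the GroupByCountExample object name, and the local[*] master are illustrative assumptions, not part of the original post; the select/map alternatives are hedged suggestions based on the typed select expecting a TypedColumn rather than a Scala function.]

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical row type with a uid field, assumed for illustration only
case class Event(uid: String, value: Int)

object GroupByCountExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("GroupByCountExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val ds = Seq(Event("a", 1), Event("a", 2), Event("b", 3)).toDS()

    // groupBy with a function returns a GroupedDataset; count() then yields
    // a Dataset[(String, Long)] whose columns are named _1 and _2
    val counts = ds.groupBy(_.uid).count()

    // The typed select takes a TypedColumn, not a Scala lambda, so one option
    // is to reference the _1 column and give it an element type:
    val uids = counts.select($"_1".as[String])

    // Another option is to map over the tuples in the result:
    val uidsViaMap = counts.map(_._1)

    uids.show()
    uidsViaMap.show()
    sc.stop()
  }
}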