It is called groupByKey now. As with joinWith, the schema produced by relational joins and aggregations is different from what you would expect when working with objects, so when DataFrames and Datasets were combined for 2.0 we renamed these functions to make that distinction clearer.
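Roughly, it looks like this (a minimal sketch against a recent 2.0.0-SNAPSHOT shell; the printed schemas below are from memory, so treat them as approximate):

scala> val ds = Seq(("a", 1), ("a", 2), ("bbbb", 3)).toDS()
ds: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]

scala> // the typed function you used to pass to groupBy now goes to groupByKey
scala> ds.groupByKey(_._1).count()
res0: org.apache.spark.sql.Dataset[(String, Long)] = [value: string, count(1): bigint]

scala> // groupBy on a Dataset is now the relational version and takes Columns
scala> ds.groupBy($"_1").count()
res1: org.apache.spark.sql.DataFrame = [_1: string, count: bigint]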
On Sun, Apr 3, 2016 at 12:23 PM, Jacek Laskowski <[email protected]> wrote:
> Hi,
>
> (since this concerns 2.0.0-SNAPSHOT, it's more for dev than user)
>
> With today's master I'm getting the following:
>
> scala> ds
> res14: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]
>
> // WHY?!
> scala> ds.groupBy(_._1)
> <console>:26: error: missing parameter type for expanded function
> ((x$1) => x$1._1)
>        ds.groupBy(_._1)
>                   ^
>
> scala> ds.filter(_._1.size > 10)
> res23: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]
>
> It's even on Michael's slide in
> https://youtu.be/i7l3JQRx7Qw?t=7m38s from Spark Summit East?! Am I
> doing something wrong? Please guide.
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
