There is this pull request: https://github.com/apache/spark/pull/5713
We mean to merge it for 1.5. Maybe you can help review it too?

On Mon, Jul 27, 2015 at 11:23 AM, Vyacheslav Baranov <slavik.bara...@gmail.com> wrote:

> Hi all,
>
> Currently it's possible to convert an RDD of a case class to a DataFrame:
>
>     case class Person(name: String, age: Int)
>
>     val people: RDD[Person] = ...
>     val df = sqlContext.createDataFrame(people)
>
> but the backward conversion is not possible with the existing API, so code
> currently looks like this (example from the documentation):
>
>     teenagers.map(t => "Name: " + t.getAs[String]("name"))
>
> whereas it would be much more convenient to use an RDD of the case class:
>
>     teenagers.rdd[Person].map("Name: " + _.name)
>
> I've implemented a proof-of-concept library that converts a DataFrame to a
> typed RDD using the "Pimp my library" pattern. It adds some type safety
> (the conversion fails before running a distributed operation if some
> fields have incompatible types), and it's much more convenient when
> working with nested rows, for example:
>
>     case class Room(number: Int, visitors: Seq[Person])
>
>     roomsDf.explode[Seq[Row], Person]("visitors",
>       "visitor")(_.map(rowToPerson))
>
> Would the community be interested in having this functionality in core?
>
> Regards,
> Vyacheslav
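
For anyone following along, here is a minimal sketch of what such a "Pimp my library" enrichment could look like. The name toTypedRDD and the explicitly supplied Row => T converter are illustrative only, not the actual API of the PR or the proof-of-concept library, which derives the converter and performs the up-front field-type check:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{DataFrame, Row}

    import scala.reflect.ClassTag

    object TypedRddSyntax {
      // Hypothetical enrichment: adds a typed-RDD conversion to DataFrame.
      // Here the caller supplies the Row => T converter explicitly; a real
      // implementation would derive it (and validate field types before
      // any distributed operation runs).
      implicit class TypedDataFrame(df: DataFrame) {
        def toTypedRDD[T: ClassTag](convert: Row => T): RDD[T] =
          df.rdd.map(convert)
      }
    }

    case class Person(name: String, age: Int)

    def rowToPerson(r: Row): Person =
      Person(r.getAs[String]("name"), r.getAs[Int]("age"))

    // Usage (illustrative):
    //   import TypedRddSyntax._
    //   val teenagers: DataFrame = ...
    //   teenagers.toTypedRDD(rowToPerson).map("Name: " + _.name)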