Here's a talk I gave on the topic:

https://www.youtube.com/watch?v=i7l3JQRx7Qw
http://www.slideshare.net/SparkSummit/structuring-spark-dataframes-datasets-and-streaming-by-michael-armbrust

On Mon, Jun 13, 2016 at 4:01 AM, Arun Patel <arunp.bigd...@gmail.com> wrote:

> In Spark 2.0, DataFrames and Datasets are unified. DataFrame is simply an
> alias for a Dataset of type Row. I have a few questions.
>
> 1) What does this really mean to an application developer?
> 2) Why was this unification needed in Spark 2.0?
> 3) What changes can be observed in Spark 2.0 vs. Spark 1.6?
> 4) Will compile-time safety be there for DataFrames too?
> 5) Is the Python API supported for Datasets in 2.0?
>
> Thanks
> Arun