As far as I am aware, in newer Spark versions (2.0 and later) a DataFrame 
is the same as Dataset[Row].
In any case, performance depends on so many factors that I am not sure 
such a comparison makes sense.
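
To make the two points concrete, here is a minimal sketch, assuming Spark 2.x in local mode. The `Rec` case class and the object and method names are hypothetical, made up for this illustration. It shows that a DataFrame can be assigned to a Dataset[Row] without a cast, and it prints the query plans for a relational filter and a functional (lambda) filter so you can compare what Catalyst does with each on your own Spark version.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

object DataFrameVsDataset {
  // Hypothetical record type, introduced only for this illustration.
  final case class Rec(id: Int, label: String)

  // Small in-memory Dataset shared by both variants below.
  def sample(spark: SparkSession): Dataset[Rec] = {
    import spark.implicits._
    Seq(Rec(1, "a"), Rec(2, "b")).toDS()
  }

  // Relational filter: the predicate is a Column expression that Catalyst can inspect.
  def relational(spark: SparkSession): Dataset[Rec] = {
    import spark.implicits._
    sample(spark).filter($"id" > 1)
  }

  // Functional filter: the predicate is a Scala closure that Catalyst treats as a black box.
  def functional(spark: SparkSession): Dataset[Rec] =
    sample(spark).filter((r: Rec) => r.id > 1)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("df-vs-ds")
      .getOrCreate()

    // Compiles without a cast because DataFrame is a type alias for Dataset[Row].
    val df: DataFrame = sample(spark).toDF()
    val rows: Dataset[Row] = df
    rows.printSchema()

    // Print the logical and physical plans of both variants for comparison.
    relational(spark).explain(true)
    functional(spark).explain(true)

    spark.stop()
  }
}
```

Both variants return the same rows; any difference shows up only in the printed plans, and whether it matters for run time depends on the data and the workload.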

> On 8. Apr 2017, at 20:15, Shiyuan <gshy2...@gmail.com> wrote:
> 
> Hi Spark-users, 
>     I came across a few sources which mentioned that DataFrame can be more 
> efficient than Dataset. I can understand that this is true because Dataset 
> allows functional transformations which Catalyst cannot look into and hence 
> cannot optimize well. But can DataFrame be more efficient than Dataset even 
> if we only use relational transformations on the Dataset? If so, can anyone 
> explain why? Are there any benchmarks comparing Dataset vs. DataFrame? 
> Thank you!
> 
> Shiyuan 
