Re: SparkSQL vs Dataframe vs Dataset

2021-12-06 Thread yonghua
@spark" Sent: Monday, 6 December 2021 21:49 Subject: SparkSQL vs Dataframe vs Dataset Hi Users, Is there any use case when we need to use SQL vs Dataframe vs Dataset? Is there any recommended approach or any advantage/performance gain over others? Thanks Rajat

SparkSQL vs Dataframe vs Dataset

2021-12-06 Thread rajat kumar
Hi Users, Is there any use case when we need to use SQL vs Dataframe vs Dataset? Is there any recommended approach or any advantage/performance gain over others? Thanks Rajat

Re: Dataframe vs Dataset dilemma: either Row parsing or no filter push-down

2018-06-18 Thread Koert Kuipers
We use DataFrame and RDD. Dataset not only has issues with predicate pushdown, it also adds shuffles at times where it shouldn't. And there is some overhead from the encoders themselves, because under the hood it is still just Row objects. On Mon, Jun 18, 2018 at 5:00 PM, Valery Khamenya wrote:

Dataframe vs Dataset dilemma: either Row parsing or no filter push-down

2018-06-18 Thread Valery Khamenya
Hi Spark gurus, I was surprised to read here: https://stackoverflow.com/questions/50129411/why-is-predicate-pushdown-not-used-in-typed-dataset-api-vs-untyped-dataframe-ap that filters are not pushed down in typed Datasets and one should rather stick to Dataframes. But writing code for groupByKey
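A minimal sketch of how the behavior discussed in this thread could be checked, assuming Spark SQL is on the classpath and a local session is acceptable. The `Event` case class, the object name, and the temporary Parquet path are hypothetical and exist only for illustration. The sketch prints the physical plan for the same filter written once as a Column expression and once as a typed Scala lambda, so the `PushedFilters` entries of the two plans can be compared. What each plan shows depends on the Spark version, so no expected output is given here.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

object PushdownComparison {
  // Hypothetical record type, used only for this illustration.
  final case class Event(id: Long, kind: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pushdown-comparison")
      .getOrCreate()
    import spark.implicits._

    // Write a tiny Parquet file so there is a data source to push filters into.
    // The path is a throwaway temp location (an assumption of this sketch).
    val path = Files.createTempDirectory("events").resolve("data.parquet").toString
    Seq(Event(1L, "click"), Event(2L, "view")).toDS().write.parquet(path)

    val events = spark.read.parquet(path).as[Event]

    // Untyped Column expression: the optimizer can inspect the predicate.
    events.filter($"kind" === "click").explain()

    // Typed lambda: the predicate is an opaque Scala function to the optimizer.
    events.filter((e: Event) => e.kind == "click").explain()

    spark.stop()
  }
}
```

Comparing the two printed plans is the quickest way to see whether a given Spark version pushes the typed filter down to the Parquet scan.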

Re: Dataframe vs dataset

2018-05-01 Thread Michael Artz
> All DataFrames are Datasets. Not all Datasets are DataFrames. The “subset” relationship doesn’t apply here. A DataFrame is a specialized type of Dataset. > From: Michael Artz > Date: Saturday, April 28, 2018 at 9:24 AM > To: "user @spar

Re: Dataframe vs dataset

2018-05-01 Thread Lalwani, Jayesh
: Saturday, April 28, 2018 at 9:24 AM To: "user @spark" Subject: Dataframe vs dataset Hi, I use Spark every day and I have a good grip on the basics of Spark, so this question isn't for myself. But this came up and I wanted to see what other Spark users would say, and I don't want to infl

Re: Dataframe vs dataset

2018-04-28 Thread Michael Artz
OK, from the language you used, you are kind of saying that Dataset is a subset of DataFrame. I would disagree, because to me a DataFrame is just a Dataset of org.apache.spark.sql.Row. On Sat, Apr 28, 2018, 8:34 AM Marco Mistroni wrote: > IMHO, neither. I see Datasets as typed DataFrames and therefore ds
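The "DataFrame is just a Dataset of Row" point can be illustrated with a short sketch, assuming Spark SQL 2.x or later, where `DataFrame` is declared as a type alias for `Dataset[Row]` in the `org.apache.spark.sql` package. The `Person` case class and the object name are hypothetical, chosen only for this example.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

object DataFrameIsDatasetOfRow {
  // Hypothetical record type, used only for this illustration.
  final case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dataframe-is-dataset-of-row")
      .getOrCreate()
    import spark.implicits._

    val typed: Dataset[Person] = Seq(Person("Ann", 34), Person("Bob", 19)).toDS()

    // toDF() drops the static element type; the result is a Dataset[Row].
    val untyped: DataFrame = typed.toDF()

    // Compiles without a cast because DataFrame is an alias for Dataset[Row].
    val sameThing: Dataset[Row] = untyped

    // as[Person] re-attaches the static type through an implicit Encoder.
    val typedAgain: Dataset[Person] = sameThing.as[Person]

    typedAgain.show()
    spark.stop()
  }
}
```

Because the assignment to `Dataset[Row]` type-checks with no conversion, the sketch supports reading a DataFrame as one particular instantiation of Dataset rather than as a separate type.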

Re: Dataframe vs dataset

2018-04-28 Thread Marco Mistroni
IMHO, neither. I see Datasets as typed DataFrames, and therefore Datasets are enhanced DataFrames. Feel free to disagree. Kr On Sat, Apr 28, 2018, 2:24 PM Michael Artz wrote: > Hi, > > I use Spark every day and I have a good grip on the basics of Spark, so > this question isn't for myself. But this came up and I wanted

Dataframe vs dataset

2018-04-28 Thread Michael Artz
Hi, I use Spark every day and I have a good grip on the basics of Spark, so this question isn't for myself. But this came up and I wanted to see what other Spark users would say, and I don't want to influence your answer. And SO is weird about polls. The question is "Which one do you feel is accu