I get your point, haha, and I also think of it as DataFrame being a specific kind of Dataset.

Mike

On Tue, May 1, 2018, 7:27 AM Lalwani, Jayesh <jayesh.lalw...@capitalone.com> wrote:

> Neither.
>
> All women are humans. Not all humans are women. You wouldn’t say that a
> woman is a subset of a human.
>
> All DataFrames are Datasets. Not all Datasets are DataFrames. The “subset”
> relationship doesn’t apply here. A DataFrame is a specialized type of
> Dataset.
>
> From: Michael Artz <michaelea...@gmail.com>
> Date: Saturday, April 28, 2018 at 9:24 AM
> To: "user @spark" <user@spark.apache.org>
> Subject: Dataframe vs dataset
>
> Hi,
>
> I use Spark every day and I have a good grip on the basics of Spark, so
> this question isn't for myself. But this came up and I wanted to see what
> other Spark users would say, and I don't want to influence your answer. And
> SO is weird about polls. The question is:
>
> "Which one do you feel is accurate... Dataset is a subset of DataFrame,
> or DataFrame a subset of Dataset?"
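
For anyone who wants to see the "specialized type" relationship in code: in Spark's Scala API, DataFrame is declared as a type alias for Dataset[Row]. The sketch below is only an illustration of that idea and runs without Spark on the classpath. Row, Dataset, Person, size and firstColumn here are simplified, hypothetical stand-ins, not Spark's real classes or methods.

```scala
object DataFrameVsDataset {
  // Hypothetical stand-ins for illustration only; NOT Spark's real classes.
  final case class Row(values: Seq[Any])          // untyped record
  final case class Person(name: String, age: Int) // typed record

  final class Dataset[T](val rows: Seq[T]) {
    def count: Long = rows.size.toLong
  }

  // Mirrors the shape of the alias in Spark's Scala API:
  // a DataFrame is just a Dataset whose element type is Row.
  type DataFrame = Dataset[Row]

  // Accepts any Dataset, so a DataFrame can be passed without conversion.
  def size[T](ds: Dataset[T]): Long = ds.count

  // Accepts only Dataset[Row]; passing a Dataset[Person] would not compile.
  def firstColumn(df: DataFrame): Seq[Any] = df.rows.map(_.values.head)

  def main(args: Array[String]): Unit = {
    val df: DataFrame = new Dataset(Seq(Row(Seq("Ann", 31)), Row(Seq("Bob", 45))))
    val people: Dataset[Person] = new Dataset(Seq(Person("Ann", 31)))

    println(size(df))        // a DataFrame is accepted wherever a Dataset is
    println(size(people))    // so is any other Dataset
    println(firstColumn(df)) // but only a Dataset[Row] is accepted here
  }
}
```

Every DataFrame type-checks as a Dataset, while a Dataset[Person] is rejected wherever a DataFrame is required, which matches the "all DataFrames are Datasets, not all Datasets are DataFrames" point above.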