I'm clear on what a type alias is. My question is more that moving from e.g. Dataset[T] to Dataset[Row] involves throwing away information. Reading through code that uses the Dataframe alias, it's a little hard for me to know when that's intentional or not.
On Thu, Jun 16, 2016 at 2:50 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote: > There are different ways to view this. If its confusing to think that Source > API returning DataFrames, its equivalent to thinking that you are returning > a Dataset[Row], and DataFrame is just a shorthand. And > DataFrame/Datasetp[Row] is to Dataset[String] is what java Array[Object] is > to Array[String]. DataFrame is more general in a way, as every Dataset can > be boiled down to a DataFrame. So to keep the Source APIs general (and also > source-compatible with 1.x), they return DataFrame. > > On Thu, Jun 16, 2016 at 12:38 PM, Cody Koeninger <c...@koeninger.org> wrote: >> >> Is this really an internal / external distinction? >> >> For a concrete example, Source.getBatch seems to be a public >> interface, but returns DataFrame. >> >> On Thu, Jun 16, 2016 at 1:42 PM, Tathagata Das >> <tathagata.das1...@gmail.com> wrote: >> > DataFrame is a type alias of Dataset[Row], so externally it seems like >> > Dataset is the main type and DataFrame is a derivative type. >> > However, internally, since everything is processed as Rows, everything >> > uses >> > DataFrames, Type classes used in a Dataset is internally converted to >> > rows >> > for processing. . Therefore internally DataFrame is like "main" type >> > that is >> > used. >> > >> > On Thu, Jun 16, 2016 at 11:18 AM, Cody Koeninger <c...@koeninger.org> >> > wrote: >> >> >> >> Sorry, meant DataFrame vs Dataset >> >> >> >> On Thu, Jun 16, 2016 at 12:53 PM, Cody Koeninger <c...@koeninger.org> >> >> wrote: >> >> > Is there a principled reason why sql.streaming.* and >> >> > sql.execution.streaming.* are making extensive use of DataFrame >> >> > instead of Datasource? >> >> > >> >> > Or is that just a holdover from code written before the move / type >> >> > alias? >> >> >> >> --------------------------------------------------------------------- >> >> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org >> >> For additional commands, e-mail: dev-h...@spark.apache.org >> >> >> > > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org