should we change "def schema" to show the materialized schema?
On Wed, Jan 25, 2017 at 1:04 PM, Michael Armbrust
wrote:
> Encoders are just an object-based view on a Dataset. Until you actually
> materialize an object, they are not used and thus will not change the
> schema of the DataFrame.
>
Encoders are just an object-based view on a Dataset. Until you actually
materialize an object, they are not used and thus will not change the
schema of the DataFrame.
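A minimal sketch of that distinction, meant for a spark-shell session where
`spark.implicits._` is already in scope. It only compares two schemas and does
not materialize any objects. `Encoders.BINARY` is used here as a stand-in for
the implicit `Array[Byte]` encoder (an assumption for illustration), and the
behavior of `as[Array[Byte]]` on a string column may differ between Spark
versions.

```scala
// spark-shell sketch; `typed` is an illustrative name, not from the thread.
import org.apache.spark.sql.Encoders

val x = Seq("a", "b").toDF("x")
val typed = x.as[Array[Byte]]

// Schema of the Dataset's logical plan; this is what printSchema reports.
typed.schema

// Schema described by a binary encoder, independent of any Dataset.
Encoders.BINARY.schema
```

The two results can differ because `as` only attaches the encoder as a view; a
later operation such as `map` is what puts the encoder's schema into the plan.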
On Tue, Jan 24, 2017 at 8:28 AM, Koert Kuipers wrote:
> scala> val x = Seq("a", "b").toDF("x")
> x: org.apache.spark.sql.DataFr
Hi,
AFAIK `Dataset#printSchema` just prints the output schema of the Dataset's
logical plan.
The logical plans in your example are as follows:
---
scala> x.as[Array[Byte]].explain(true)
== Analyzed Logical Plan ==
x: string
Project [value#1 AS x#3]
+- LocalRelation [value#1]
scal
scala> val x = Seq("a", "b").toDF("x")
x: org.apache.spark.sql.DataFrame = [x: string]
scala> x.as[Array[Byte]].printSchema
root
|-- x: string (nullable = true)
scala> x.as[Array[Byte]].map(x => x).printSchema
root
|-- value: binary (nullable = true)
why does the first schema show string inste