Hi,

Seems so. It's equivalent to

Seq(MyProduct(new Timestamp(0), 10)).toDS.printSchema

(and now I'm wondering why I didn't pick this variant)

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Fri, Aug 5, 2016 at 11:29 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> Hi Jacek,
>
> Is this line correct?
>
> spark.createDataset(Seq(MyProduct(new Timestamp(0), 10))).printSchema
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed. The
> author will in no case be liable for any monetary damages arising from such
> loss, damage or destruction.
>
> On 5 August 2016 at 10:21, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi Michael,
>>
>> Since we're at it, could you please point at the code where the
>> optimization happens? I assume you're talking about Catalyst when
>> whole-stage code-generating the queries. Is this nullability (NULL value)
>> propagation perhaps? I'd appreciate it (hoping that would improve my
>> understanding of the low-level bits quite substantially). Thanks!
>>
>> Regards,
>> Jacek Laskowski
>>
>> On Fri, Aug 5, 2016 at 1:24 AM, Michael Armbrust <mich...@databricks.com> wrote:
>> > Nullable is an optimization for Spark SQL. It tells Spark not to even
>> > do an if check when accessing that field.
>> >
>> > In this case, your data is nullable, because Timestamp is an object in
>> > Java and you could put null there.
>> >
>> > On Thu, Aug 4, 2016 at 2:56 PM, luismattor <luismat...@gmail.com> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> Consider the following case:
>> >>
>> >> import java.sql.Timestamp
>> >> case class MyProduct(t: Timestamp, a: Float)
>> >> val rdd = sc.parallelize(List(MyProduct(new Timestamp(0), 10))).toDF()
>> >> rdd.printSchema()
>> >>
>> >> The output is:
>> >> root
>> >>  |-- t: timestamp (nullable = true)
>> >>  |-- a: float (nullable = false)
>> >>
>> >> How can I set the timestamp column to be NOT nullable?
>> >>
>> >> Regards,
>> >> Luis
>> >>
>> >> --
>> >> View this message in context:
>> >> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-nullable-field-when-create-DataFrame-using-case-class-tp27479.html
>> >> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
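
For reference, here is a minimal sketch of one way to get `nullable = false` on the timestamp column from the original question: rebuild the DataFrame from its rows with an explicit schema. Nothing in the thread confirms this approach, so treat it as an assumption. It also assumes Spark 2.x running in local mode (for example pasted into spark-shell, where `spark` already exists), and the names `df`, `strictSchema` and `strict` are purely illustrative. As explained above, nullability is an optimization hint, so I would not rely on Spark validating it: only declare a column non-nullable when the data is known to contain no nulls.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{FloatType, StructField, StructType, TimestampType}

// Same case class as in the original question.
case class MyProduct(t: Timestamp, a: Float)

// Assumption: a local session; in spark-shell `spark` is already provided.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("nullable-sketch")
  .getOrCreate()
import spark.implicits._

// Inferred schema: `t` is nullable because Timestamp is a reference type.
val df = Seq(MyProduct(new Timestamp(0), 10f)).toDF()
df.printSchema()

// Explicit schema that declares both columns as NOT nullable.
val strictSchema = StructType(Seq(
  StructField("t", TimestampType, nullable = false),
  StructField("a", FloatType, nullable = false)
))

// Re-create the DataFrame from the underlying rows with the explicit schema.
val strict = spark.createDataFrame(df.rdd, strictSchema)
strict.printSchema()
```

The second `printSchema` call should report both columns as non-nullable. If `t` might ever be null, keeping the inferred schema (or modelling the field as `Option[Timestamp]`) is the safer choice.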