I spent a few minutes poking around in the source code and found this:

"The data type representing None, used for the types that cannot be inferred."
https://github.com/apache/spark/blob/branch-2.1/python/pyspark/sql/types.py#L107-L113

Playing around a bit, this is the only use case that I could immediately come up with: you have some kind of placeholder field already in the data, but it's always null. If you let createDataFrame (and I bet other things like DataFrameReader would behave similarly) try to infer the schema directly, it will error out, since it can't infer a type for that field automatically. Doing something like the below will allow the data to be used. And, if memory serves, Hive also has a concept of a Null data type for these kinds of situations.

(The session assumes Row is already imported from pyspark.sql, and StructType, StructField, LongType and NullType from pyspark.sql.types.)

In [9]: df = spark.createDataFrame([Row(id=1, val=None), Row(id=2, val=None)], schema=StructType([StructField('id', LongType()), StructField('val', NullType())]))

In [10]: df.show()
+---+----+
| id| val|
+---+----+
|  1|null|
|  2|null|
+---+----+

In [11]: df.printSchema()
root
 |-- id: long (nullable = true)
 |-- val: null (nullable = true)

Nicholas Szandor Hakobian, Ph.D.
Staff Data Scientist
Rally Health
nicholas.hakob...@rallyhealth.com

On Sun, Feb 11, 2018 at 5:40 AM, Jean Georges Perrin <j...@jgp.net> wrote:

> What is the purpose of DataTypes.NullType, especially as you are building a
> schema? Has anyone used it or seen it as part of a schema auto-generation?
>
> (If I keep asking long enough, I may get an answer, no? :) )
>
> > On Feb 4, 2018, at 13:15, Jean Georges Perrin <j...@jgp.net> wrote:
> >
> > Any taker on this one? ;)
> >
> >> On Jan 29, 2018, at 16:05, Jean Georges Perrin <j...@jgp.net> wrote:
> >>
> >> Hi Sparkians,
> >>
> >> Can someone tell me what is the purpose of DataTypes.NullType,
> >> especially as you are building a schema?
> >>
> >> Thanks
> >>
> >> jg
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
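
P.S. To illustrate why an always-null field trips up inference, here is a toy model in plain Python. It is only a sketch of the idea, not Spark's actual code: the helpers infer_type, merge_types and infer_schema, and the type names as plain strings, are all made up for illustration. A None value tells you nothing about the field's type, so if every row has None there, inference has nothing to go on unless the type is declared up front.

```python
# Toy model only: plain Python, no Spark required. The helper names and the
# string type names are hypothetical and merely mimic the idea.

def infer_type(value):
    """Map a single Python value to a type name; None gives 'null'."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError("unsupported value: %r" % (value,))


def merge_types(a, b):
    """Combine two observations of one field; 'null' yields to anything."""
    if a == "null":
        return b
    if b == "null" or a == b:
        return a
    raise TypeError("cannot merge %s and %s" % (a, b))


def infer_schema(rows, declared=None):
    """Infer a {field: type} mapping from a list of dicts.

    Fields listed in `declared` skip inference, which is the role an
    explicitly declared null-typed field plays in a hand-written schema.
    """
    declared = dict(declared or {})
    schema = {}
    for row in rows:
        for name, value in row.items():
            if name in declared:
                schema[name] = declared[name]
                continue
            seen = infer_type(value)
            schema[name] = merge_types(schema[name], seen) if name in schema else seen
    # Any undeclared field that is still 'null' never showed a real value.
    unresolved = sorted(
        n for n, t in schema.items() if t == "null" and n not in declared
    )
    if unresolved:
        raise ValueError("cannot determine type of: " + ", ".join(unresolved))
    return schema


rows = [{"id": 1, "val": None}, {"id": 2, "val": None}]

try:
    infer_schema(rows)
except ValueError as err:
    print("inference failed:", err)

# Declaring the placeholder field up front lets the same rows through.
print(infer_schema(rows, declared={"val": "null"}))
```

The first call fails because val never shows a non-null value; the second succeeds because the field's type was declared rather than inferred, which is the same move as passing an explicit schema to createDataFrame.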