Yes. With LazySimpleSerDe (SequenceFile/TextFile), "\N" is used to represent NULL. (It is a table property: serialization.null.format.)
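As a rough illustration of the text case for a string column, here is a minimal Python sketch. It is not Hive's actual code: `parse_text_row` and `NULL_FORMAT` are hypothetical names, and the Ctrl-A field delimiter is an assumption about how the table was declared. It only shows why the empty string, "NULL" and "null" all fail an "is null" check while "\N" passes.

```python
# Hypothetical sketch, not Hive source code.
# NULL_FORMAT mirrors the default of serialization.null.format described above.
NULL_FORMAT = r"\N"  # a literal backslash followed by N


def parse_text_row(line, delimiter="\x01", null_format=NULL_FORMAT):
    """Split one text row into string fields, mapping the null marker to None.

    The Ctrl-A delimiter is an assumption; a table may declare a different one.
    """
    fields = line.rstrip("\n").split(delimiter)
    return [None if field == null_format else field for field in fields]


# The three values tried in the original question, plus the actual marker.
row = parse_text_row("\x01".join(["", "NULL", "null", r"\N"]))
print(row)  # ['', 'NULL', 'null', None]
```

Only the last field comes back as None; the other three are ordinary non-null strings.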
With ColumnarSerDe/RCFile, no NULL marker is stored: a null column is written as zero bytes (the column's byte length is zero). However, RCFile/ColumnarSerDe also uses the serialization.null.format property during serialization to determine whether a column is null. (This is unavoidable because the client can only pass a string to the SerDe and let the SerDe serialize it, so some special character is needed to represent NULL.)

On Mon, Aug 9, 2010 at 11:46 AM, Ning Zhang <[email protected]> wrote:

> How it is serialized/deserialized is determined by the specific SerDe. NULL is
> serialized as \N by LazySimpleSerDe (the default SerDe for text). RCFile
> (ColumnarSerDe) uses the same default parameters as LazySimpleSerDe.
> Unless I missed something, NULL serialization/deserialization is type
> independent (at least in LazySimpleSerDe).
>
> On Aug 9, 2010, at 9:42 AM, Pradeep Kamath wrote:
>
> Hi,
> What value does Hive expect in the data for a column to be treated as
> null? I tried some permutations on a text data based table but couldn’t
> figure out what the correct representation was. I tried empty string, the
> string NULL and the string null for a string column and in all three cases
> the “is null” operator returned false.
>
> A couple of related questions:
> - Does the representation of null depend on the type of the column – is it
> different for string vs. non-string columns?
> - Is the representation of null different for different storage formats –
> text vs. RCFile vs. SequenceFile – I am particularly interested in text and
> RCFile.
>
> Thanks in advance,
>
> Pradeep
>
