[ https://issues.apache.org/jira/browse/SPARK-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984909#comment-13984909 ]

Michael Armbrust commented on SPARK-1649:
-----------------------------------------

Oh, I see.  I forgot that we would also need this inside of ArrayType.  For 
MapType, it seems nullability only matters for the value, not the key, since I 
don't think we would allow null keys.

This is something we need to consider, but I'm going to change the title to 
something less prescriptive.  For now, could we just say that null values are 
not supported in arrays in Parquet files?

> Figure out Nullability semantics for Array elements and Map values
> ------------------------------------------------------------------
>
>                 Key: SPARK-1649
>                 URL: https://issues.apache.org/jira/browse/SPARK-1649
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.1.0
>            Reporter: Andre Schumacher
>            Priority: Critical
>
> For the underlying storage layer, recording in the data type itself whether 
> a column can be nullable would simplify things such as schema conversions 
> and predicate filter determination. The DataType type could then look like 
> this:
> abstract class DataType(nullable: Boolean = true)
> Concrete subclasses could then override the nullable val. Mostly this could 
> be left as the default, but when types are contained in nested types one 
> could optimize for, e.g., arrays whose elements are nullable versus those 
> that are not.
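A minimal Scala sketch of what the proposal and the comment above describe -- nullability carried on the type itself, with element nullability for arrays and value-side nullability for maps. All names here (IntType, ArrayType, MapType, containsNull, valueContainsNull) are illustrative, not necessarily Spark's actual API:

```scala
// Illustrative sketch only: nullability recorded on the data type itself.
abstract class DataType(val nullable: Boolean = true)

case object IntType extends DataType              // nullable by default
case object NonNullIntType extends DataType(false)

// For arrays, whether the *elements* may be null is tracked separately.
case class ArrayType(elementType: DataType, containsNull: Boolean = true)
  extends DataType

// For maps, only the value side carries nullability -- null keys disallowed.
case class MapType(keyType: DataType,
                   valueType: DataType,
                   valueContainsNull: Boolean = true) extends DataType
```

Under this sketch, `ArrayType(IntType, containsNull = false)` would describe an array column whose elements can never be null, which is exactly the case a Parquet writer could exploit.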



--
This message was sent by Atlassian JIRA
(v6.2#6252)
