Aleksander Eskilson created SPARK-17939:
-------------------------------------------
Summary: Spark-SQL Nullability: Optimizations vs. Enforcement Clarification
Key: SPARK-17939
URL: https://issues.apache.org/jira/browse/SPARK-17939
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.0.0
Reporter: Aleksander Eskilson
Priority: Critical
The notion of Nullability of StructFields in DataFrames and Datasets creates
some confusion. As has been pointed out previously [1], Nullability is a hint
to the Catalyst optimizer, and is not meant to be a type-level enforcement.
Allowing null fields can also help the reader successfully parse certain types
of more loosely-typed data, like JSON and CSV, where null values are common,
rather than just failing.
There's already been some movement to clarify the meaning of Nullable in the
API, but also some requests for a (perhaps completely separate) type-level
implementation of Nullable that can act as an enforcement contract.
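To make the distinction concrete, here is a minimal sketch in plain Python. It does not use or reproduce any Spark API: `Field`, `plan_null_filter`, and `enforce` are hypothetical names invented for illustration, and the "planner" is a toy. It only contrasts a nullable flag that is trusted as an optimizer hint with one that is checked as an enforcement contract.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field; not Spark's StructField."""
    name: str
    nullable: bool


def plan_null_filter(field: Field) -> Optional[str]:
    """Hint semantics: the planner trusts the flag and never inspects the data.

    For a field declared non-nullable, an IS NOT NULL filter is assumed to be
    redundant and is dropped from the plan.
    """
    if field.nullable:
        return f"{field.name} IS NOT NULL"
    return None  # filter pruned on the strength of the hint alone


def enforce(schema: List[Field], row: Dict[str, Any]) -> Dict[str, Any]:
    """Contract semantics: reject a row with a null in a non-nullable field."""
    for field in schema:
        if not field.nullable and row.get(field.name) is None:
            raise ValueError(f"null value in non-nullable field '{field.name}'")
    return row


schema = [Field("id", nullable=False), Field("comment", nullable=True)]
bad_row = {"id": None, "comment": "parsed from loosely-typed JSON"}

# Hint semantics: nothing stops the bad row. The planner simply assumes "id"
# is never null and prunes the check, while keeping it for "comment".
pruned = plan_null_filter(schema[0])
kept = plan_null_filter(schema[1])

# Contract semantics: the same row is rejected up front.
try:
    enforce(schema, bad_row)
    rejected = False
except ValueError:
    rejected = True
```

In this toy model the two readings of the same flag diverge exactly when the data violates the declaration: the hint reading silently plans around a null that is actually present, while the contract reading fails fast.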
This bug is logged here to discuss and clarify this issue.