Anil Dasari created SPARK-40507:
-----------------------------------
Summary: Spark creates optional columns in Hive tables for
fields that are not null
Key: SPARK-40507
URL: https://issues.apache.org/jira/browse/SPARK-40507
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.0
Reporter: Anil Dasari
DataFrame saveAsTable sets all columns as optional/nullable when creating the
table, here:
[https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L531]
(`outputColumns.toStructType.asNullable`)
As a result, the source Parquet schema and the Hive table schema do not match,
which is problematic when a large-DataFrame pipeline uses Hive as temporary
storage to reduce memory pressure.
Hive 3.x supports NOT NULL constraints on table columns. Please add support for
NOT NULL constraints on Spark SQL Hive tables.
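The nullability widening described above can be sketched with a small
stand-alone model (plain Python, not Spark's actual API; the StructField
and StructType names here only mirror Spark's classes, and `as_nullable`
is a hypothetical stand-in illustrating the effect of
`outputColumns.toStructType.asNullable`):

```python
from dataclasses import dataclass, replace
from typing import List

@dataclass(frozen=True)
class StructField:
    # Simplified stand-in for Spark's StructField: name, type, nullable flag.
    name: str
    data_type: str
    nullable: bool

@dataclass(frozen=True)
class StructType:
    fields: List[StructField]

    def as_nullable(self) -> "StructType":
        # Models the effect described in the report: every field is marked
        # nullable, dropping any NOT NULL information from the source schema.
        return StructType([replace(f, nullable=True) for f in self.fields])

# A source schema where `id` carries a NOT NULL constraint.
source = StructType([
    StructField("id", "bigint", nullable=False),
    StructField("name", "string", nullable=True),
])

# After the widening, every column is nullable, so the created Hive table
# no longer matches the source schema's NOT NULL constraint on `id`.
table_schema = source.as_nullable()
print([(f.name, f.nullable) for f in table_schema.fields])
```

This is only an illustration of the mismatch; the actual widening happens
inside Spark's DataSource planning at the line linked above.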
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]