[
https://issues.apache.org/jira/browse/SPARK-21769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiao Li updated SPARK-21769:
----------------------------
Summary: Add a table property for Hive-serde tables to make Spark always
respect schemas inferred by Spark SQL (was: Add a table property for
Hive-serde tables to control Spark always respecting schemas inferred by Spark
SQL)
> Add a table property for Hive-serde tables to make Spark always respect
> schemas inferred by Spark SQL
> -----------------------------------------------------------------------------------------------------
>
> Key: SPARK-21769
> URL: https://issues.apache.org/jira/browse/SPARK-21769
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Xiao Li
> Assignee: Xiao Li
>
> For Hive-serde tables, we always respect the schema stored in the Hive
> metastore, because the schema could be altered by other engines that share
> the same metastore. Thus, when the metastore schema and the Spark-inferred
> schema differ (ignoring nullability and case), we always trust the
> metastore-controlled schema. However, in some scenarios, the Hive metastore
> can INCORRECTLY overwrite the schema when the table's serde differs from the
> Hive metastore's built-in serde.
> The proposed solution is to introduce a table property for such scenarios.
> For a specific Hive-serde table, users can manually set this table property
> to ask Spark to always respect the Spark-inferred schema instead of trusting
> the metastore-controlled schema. By default, it is off.
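
The schema-selection rule described above can be sketched roughly as follows. This is an illustrative model only, not Spark's implementation: the property key `example.respectSparkSchema`, the `Field` type, and the function names are all hypothetical, since the issue does not name the actual property. It also assumes that when the two schemas match (ignoring nullability and case), the Spark-inferred schema is the one returned.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical property key; the issue text does not give the real name.
RESPECT_SPARK_SCHEMA_KEY = "example.respectSparkSchema"


@dataclass(frozen=True)
class Field:
    """Simplified stand-in for a column definition."""
    name: str
    data_type: str
    nullable: bool = True


def same_ignoring_case_and_nullability(a: List[Field], b: List[Field]) -> bool:
    """Compare two schemas by column order, lower-cased name, and type only."""
    if len(a) != len(b):
        return False
    return all(
        x.name.lower() == y.name.lower() and x.data_type == y.data_type
        for x, y in zip(a, b)
    )


def choose_schema(
    metastore_schema: List[Field],
    spark_schema: List[Field],
    table_properties: Dict[str, str],
) -> List[Field]:
    """Pick the schema to use for a Hive-serde table."""
    # Opt-in property: always respect the Spark-inferred schema. Off by default.
    if table_properties.get(RESPECT_SPARK_SCHEMA_KEY, "false").lower() == "true":
        return spark_schema
    # Assumption: schemas that match up to case and nullability are treated
    # as equivalent, and the Spark-inferred one is kept.
    if same_ignoring_case_and_nullability(metastore_schema, spark_schema):
        return spark_schema
    # Default behavior from the description: trust the metastore when they differ.
    return metastore_schema
```

With no property set, a mismatch resolves to the metastore schema; setting the hypothetical key to "true" for that one table flips the choice to the Spark-inferred schema.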