[
https://issues.apache.org/jira/browse/SPARK-33369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-33369:
------------------------------------
Assignee: Apache Spark (was: Gengliang Wang)
> Skip schema inference in DataframeWriter.save() if table provider supports
> external metadata
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-33369
> URL: https://issues.apache.org/jira/browse/SPARK-33369
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.1.0
> Reporter: Gengliang Wang
> Assignee: Apache Spark
> Priority: Major
>
> For all v2 data sources that are not FileDataSourceV2, Spark always
> infers the table schema/partitioning on DataFrameWriter.save().
> Inferring the table schema/partitioning can be expensive. However, there
> is currently no trait or flag to indicate that a V2 source can use the
> input DataFrame's schema on DataFrameWriter.save(). We can resolve the
> problem by adding a new expected behavior for the method
> TableProvider.supportsExternalMetadata():
> when TableProvider.supportsExternalMetadata() is true, Spark will use the
> input DataFrame's schema in DataFrameWriter.save() and skip
> schema/partitioning inference.
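
The proposed behavior can be illustrated with a minimal, self-contained sketch. It does not use Spark itself: SketchTableProvider, CountingProvider, and schemaForSave are hypothetical stand-ins invented for this illustration, and schemas are modeled as lists of column names rather than StructType. Only the method names inferSchema() and supportsExternalMetadata() mirror the real TableProvider interface. The actual change in Spark may be structured differently.

```java
import java.util.List;

// Hypothetical stand-in for org.apache.spark.sql.connector.catalog.TableProvider,
// reduced to the two methods that matter for this issue.
interface SketchTableProvider {
    // Potentially expensive: may need to contact the external system.
    List<String> inferSchema();

    // Mirrors the default of the real method: external metadata is not accepted.
    default boolean supportsExternalMetadata() {
        return false;
    }
}

// A source that counts how often schema inference actually runs.
class CountingProvider implements SketchTableProvider {
    private final boolean acceptsExternalMetadata;
    int inferenceCalls = 0;

    CountingProvider(boolean acceptsExternalMetadata) {
        this.acceptsExternalMetadata = acceptsExternalMetadata;
    }

    @Override
    public List<String> inferSchema() {
        inferenceCalls++;
        return List.of("id", "name");
    }

    @Override
    public boolean supportsExternalMetadata() {
        return acceptsExternalMetadata;
    }
}

class SaveSchemaSketch {
    // Assumed decision logic for save(): use the input DataFrame's schema when
    // the provider accepts external metadata, otherwise fall back to inference.
    static List<String> schemaForSave(SketchTableProvider provider,
                                      List<String> dataFrameSchema) {
        if (provider.supportsExternalMetadata()) {
            return dataFrameSchema;
        }
        return provider.inferSchema();
    }

    public static void main(String[] args) {
        List<String> dfSchema = List.of("id", "name", "ts");
        CountingProvider external = new CountingProvider(true);
        CountingProvider inferring = new CountingProvider(false);

        System.out.println(schemaForSave(external, dfSchema)
                + " inferenceCalls=" + external.inferenceCalls);
        System.out.println(schemaForSave(inferring, dfSchema)
                + " inferenceCalls=" + inferring.inferenceCalls);
    }
}
```

In this sketch the provider that returns true from supportsExternalMetadata() never has inferSchema() invoked, which is the cost saving the issue describes. The provider that keeps the default still pays for inference on every save.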