[
https://issues.apache.org/jira/browse/SPARK-50146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xinrong Meng updated SPARK-50146:
---------------------------------
Description:
The {{verifySchema}} parameter of createDataFrame controls whether the data
types of every row are verified against the schema.
h3. In Spark Classic
Currently it only takes effect for createDataFrame with
* regular Python instances
We propose to make it work with createDataFrame with
* {{pyarrow.Table}}
* {{pandas.DataFrame}} with Arrow optimization
* {{pandas.DataFrame}} without Arrow optimization
h3. In Spark Connect
Now it does not take effect.
We propose to make it work with all inputs.
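To illustrate what the flag toggles, here is a minimal pure-Python sketch of per-row type verification. It is not PySpark's implementation: {{verify_row}}, {{build_rows}}, and the list-of-tuples schema are hypothetical names used only for illustration, and the strict type comparison is an assumption of the sketch.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical mini-schema: ordered (column name, expected Python type) pairs.
Schema = List[Tuple[str, type]]


def verify_row(row: Sequence[Any], schema: Schema) -> None:
    """Raise if the row does not match the schema field by field."""
    if len(row) != len(schema):
        raise ValueError(
            f"row has {len(row)} fields but schema has {len(schema)}"
        )
    for value, (name, expected) in zip(row, schema):
        # Assumption of this sketch: None is accepted for every column.
        if value is None:
            continue
        # Exact type match, so a bool is not accepted for an int column.
        if type(value) is not expected:
            raise TypeError(
                f"field {name!r}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )


def build_rows(
    data: Sequence[Sequence[Any]],
    schema: Schema,
    verify_schema: bool = True,
) -> List[Dict[str, Any]]:
    """Turn tuples into dict rows, verifying each one only when asked to."""
    rows = []
    for row in data:
        if verify_schema:
            verify_row(row, schema)
        rows.append({name: value for value, (name, _) in zip(row, schema)})
    return rows
```

With verification on, a mismatched row fails at creation time. With it off, the row is accepted as-is and any mismatch surfaces later, if at all. That is the trade-off this issue proposes to make configurable for every input type.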
was: Currently, there is no way to disable schema validation when creating
DataFrames from Arrow tables, unlike other methods of creating DataFrames, such
as from Pandas series.
> Consolidate configurable schema verification of createDataFrame
> ---------------------------------------------------------------
>
> Key: SPARK-50146
> URL: https://issues.apache.org/jira/browse/SPARK-50146
> Project: Spark
> Issue Type: Umbrella
> Components: Connect, PySpark
> Affects Versions: 4.0.0
> Reporter: Xinrong Meng
> Priority: Major
> Labels: pull-request-available
>
> The {{verifySchema}} parameter of createDataFrame controls whether the data
> types of every row are verified against the schema.
> h3. In Spark Classic
> Currently it only takes effect for createDataFrame with
> * regular Python instances
> We propose to make it work with createDataFrame with
> * {{pyarrow.Table}}
> * {{pandas.DataFrame}} with Arrow optimization
> * {{pandas.DataFrame}} without Arrow optimization
> h3. In Spark Connect
> Now it does not take effect.
> We propose to make it work with all inputs.