xinrong-meng commented on code in PR #48677:
URL: https://github.com/apache/spark/pull/48677#discussion_r1839664577
##########
python/pyspark/sql/session.py:
##########
@@ -1391,13 +1392,18 @@ def createDataFrame( # type: ignore[misc]
if ``samplingRatio`` is ``None``. This option is effective only
when the input is
:class:`RDD`.
verifySchema : bool, optional
- verify data types of every row against schema. Enabled by default.
- When the input is :class:`pyarrow.Table` or when the input class is
- :class:`pandas.DataFrame` and `spark.sql.execution.arrow.pyspark.enabled` is enabled,
- this option is not effective. It follows Arrow type coercion. This option is not
- supported with Spark Connect.
+ verify data types of every row against schema.
+ If not provided, createDataFrame with
+ - pyarrow.Table, verifySchema=False
+ - pandas.DataFrame with Arrow optimization, verifySchema=False
+ - pandas.DataFrame without Arrow optimization, verifySchema=True
+ - regular Python instances, verifySchema=True
+ Arrow optimization is enabled/disabled via `spark.sql.execution.arrow.pyspark.enabled`.
.. versionadded:: 2.1.0
+ .. versionchanged:: 4.0.0
+ Adjusts default value to pyspark._NoValue.
Review Comment:
Makes sense! Removed.
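(For context, a minimal pure-Python sketch of the defaulting scheme the diff describes: a sentinel default lets `createDataFrame` distinguish "caller did not pass `verifySchema`" from an explicit `False`, and then pick a per-input-type default. This is an illustration only, not PySpark's actual implementation; the names `resolve_verify_schema`, `input_kind`, and `arrow_enabled` are hypothetical.)

```python
# Hypothetical sketch of sentinel-based defaulting for verifySchema.
# Not PySpark's real code; function and parameter names are invented.

_NoValue = object()  # sentinel meaning "caller did not pass verifySchema"


def resolve_verify_schema(input_kind, arrow_enabled, verify_schema=_NoValue):
    """Return the effective verifySchema flag.

    input_kind: one of "pyarrow", "pandas", "python"
    arrow_enabled: mirrors spark.sql.execution.arrow.pyspark.enabled
    """
    if verify_schema is not _NoValue:
        return verify_schema  # an explicit user choice always wins
    if input_kind == "pyarrow":
        return False  # pyarrow.Table: no row-by-row verification
    if input_kind == "pandas":
        return not arrow_enabled  # Arrow path relies on Arrow type coercion
    return True  # regular Python instances are verified by default


# Mirrors the defaults listed in the diff above:
assert resolve_verify_schema("pyarrow", arrow_enabled=True) is False
assert resolve_verify_schema("pandas", arrow_enabled=True) is False
assert resolve_verify_schema("pandas", arrow_enabled=False) is True
assert resolve_verify_schema("python", arrow_enabled=False) is True
# Explicit argument overrides the per-input default:
assert resolve_verify_schema("pandas", arrow_enabled=True, verify_schema=True) is True
```

The sentinel (rather than `verifySchema=True` as a plain default) is what makes the per-input-type defaults possible without changing behavior for callers who pass the flag explicitly.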
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]