BryanCutler commented on a change in pull request #29686:
URL: https://github.com/apache/spark/pull/29686#discussion_r485823564
##########
File path: python/docs/source/user_guide/arrow_pandas.rst
##########
@@ -386,8 +386,8 @@ working with timestamps in ``pandas_udf``\s to get the best performance, see
Recommended Pandas and PyArrow Versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-For usage with pyspark.sql, the supported versions of Pandas is 0.24.2 and PyArrow is 0.15.1. Higher
-versions may be used, however, compatibility and data correctness can not be guaranteed and should
+For usage with pyspark.sql, the minimum supported versions of Pandas is 0.23.2 and PyArrow is 1.0.0.
Review comment:
I changed the wording slightly so that it just states these are the minimum
supported versions. The GitHub checks currently test with Pandas 1.1.2 and
PyArrow 1.0.1, but those versions are not pinned, so the checks will keep
installing the latest releases. We might also want to think about bumping the
minimum version of Pandas to 1.0.0 or higher.
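
To make "minimum supported version" concrete, here is a minimal sketch of such
a check using only the standard library. The helper names `parse_version` and
`meets_minimum` and the `MINIMUM_VERSIONS` table are hypothetical, not PySpark
APIs. The version numbers are the ones from the diff and the comment above.

```python
import re

# Minimum versions as stated in the diff above (assumed here for illustration).
MINIMUM_VERSIONS = {"pandas": "0.23.2", "pyarrow": "1.0.0"}


def parse_version(version):
    """Turn '1.0.1' into (1, 0, 1); stop at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad to three components so that '1.0' compares equal to '1.0.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(package, installed_version):
    """Return True if installed_version is at least the documented minimum."""
    minimum = parse_version(MINIMUM_VERSIONS[package])
    return parse_version(installed_version) >= minimum


# Versions the GitHub checks were using at the time of the comment.
print(meets_minimum("pandas", "1.1.2"))
print(meets_minimum("pyarrow", "1.0.1"))
# The previously documented PyArrow version falls below the new minimum.
print(meets_minimum("pyarrow", "0.15.1"))
```

In a real environment the installed version string could come from something
like `importlib.metadata.version("pandas")` (Python 3.8+). It is passed in
explicitly here so the sketch runs without either library installed.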