This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new c10eb7c52fe [SPARK-42113][PS][INFRA] Upgrade pandas to 1.5.3
c10eb7c52fe is described below

commit c10eb7c52fe35e94818e3f5427b7687e351c8c37
Author: itholic <[email protected]>
AuthorDate: Thu Jan 19 08:54:47 2023 -0800

    [SPARK-42113][PS][INFRA] Upgrade pandas to 1.5.3
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to upgrade pandas to 1.5.3.
    
    See [What's new in 1.5.3](https://pandas.pydata.org/docs/whatsnew/v1.5.3.html) for more detail.
    
    ### Why are the changes needed?
    
    We should support the latest pandas for pandas API on Spark.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    The existing CI should pass.
    
    Closes #39651 from itholic/pandas_1.5.3.
    
    Authored-by: itholic <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 dev/infra/Dockerfile                       | 4 ++--
 python/pyspark/pandas/supported_api_gen.py | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index 60388ce5765..f226058f186 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -64,8 +64,8 @@ RUN Rscript -e "devtools::install_version('roxygen2', version='7.2.0', repos='ht
 # See more in SPARK-39735
 ENV R_LIBS_SITE "/usr/local/lib/R/site-library:${R_LIBS_SITE}:/usr/lib/R/library"
 
-RUN pypy3 -m pip install numpy 'pandas<=1.5.2' scipy coverage matplotlib
-RUN python3.9 -m pip install numpy pyarrow 'pandas<=1.5.2' scipy unittest-xml-reporting plotly>=4.8 sklearn 'mlflow>=1.0' coverage matplotlib openpyxl 'memory-profiler==0.60.0' 'scikit-learn==1.1.*'
+RUN pypy3 -m pip install numpy 'pandas<=1.5.3' scipy coverage matplotlib
+RUN python3.9 -m pip install numpy pyarrow 'pandas<=1.5.3' scipy unittest-xml-reporting plotly>=4.8 sklearn 'mlflow>=1.0' coverage matplotlib openpyxl 'memory-profiler==0.60.0' 'scikit-learn==1.1.*'
 
 # Add Python deps for Spark Connect.
 RUN python3.9 -m pip install grpcio protobuf googleapis-common-protos grpcio-status
diff --git a/python/pyspark/pandas/supported_api_gen.py b/python/pyspark/pandas/supported_api_gen.py
index 301e6a2f9b7..87986a71cf5 100644
--- a/python/pyspark/pandas/supported_api_gen.py
+++ b/python/pyspark/pandas/supported_api_gen.py
@@ -98,7 +98,7 @@ def generate_supported_api(output_rst_file_path: str) -> None:
 
     Write supported APIs documentation.
     """
-    pandas_latest_version = "1.5.2"
+    pandas_latest_version = "1.5.3"
     if LooseVersion(pd.__version__) != LooseVersion(pandas_latest_version):
         msg = (
             "Warning: Latest version of pandas (%s) is required to generate the documentation; "

