This is an automated email from the ASF dual-hosted git repository.

yikun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 2698d6bf10b [SPARK-40838][INFRA][TESTS] Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest
2698d6bf10b is described below

commit 2698d6bf10b92e71e8af88fedb4e7c9e0f304416
Author: Yikun Jiang <yikunk...@gmail.com>
AuthorDate: Thu Oct 20 15:54:18 2022 +0800

    [SPARK-40838][INFRA][TESTS] Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest
    
    ### What changes were proposed in this pull request?
    Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest
    
    ### Why are the changes needed?
    - Upgrade infra base image to `focal-20220922` (currently the latest Ubuntu 20.04 image)
    - Python package versions in the infra image are updated:
      - numpy 1.23.3 --> 1.23.4
      - mlflow 1.28.0 --> 1.29.0
      - matplotlib 3.5.3 --> 3.6.1
      - pip 22.2.2 --> 22.3
      - scipy 1.9.1 --> 1.9.3
    
      Full list: https://www.diffchecker.com/e6eZZaYn
    - Fix ps.mlflow doctest (due to mlflow upgrade; the example now also prints the `ModelInfo` object returned by `mlflow.sklearn.log_model`):
    ```
    **********************************************************************
    File "/__w/spark/spark/python/pyspark/pandas/mlflow.py", line 158, in pyspark.pandas.mlflow.load_model
    Failed example:
        with mlflow.start_run():
            lr = LinearRegression()
            lr.fit(train_x, train_y)
            mlflow.sklearn.log_model(lr, "model")
    Expected:
        LinearRegression(...)
    Got:
        LinearRegression()
        <mlflow.models.model.ModelInfo object at 0x7fef9578deb0>
    ```
    
    ### Does this PR introduce _any_ user-facing change?
    No, dev only
    
    ### How was this patch tested?
    All CI passed
    
    Closes #38304 from Yikun/SPARK-40838.
    
    Authored-by: Yikun Jiang <yikunk...@gmail.com>
    Signed-off-by: Yikun Jiang <yikunk...@gmail.com>
---
 dev/infra/Dockerfile            | 4 ++--
 python/pyspark/pandas/mlflow.py | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index ccf0c932b0e..2a70bd3f98f 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -17,9 +17,9 @@
 
 # Image for building and testing Spark branches. Based on Ubuntu 20.04.
 # See also in https://hub.docker.com/_/ubuntu
-FROM ubuntu:focal-20220801
+FROM ubuntu:focal-20220922
 
-ENV FULL_REFRESH_DATE 20220706
+ENV FULL_REFRESH_DATE 20221019
 
 ENV DEBIAN_FRONTEND noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN true
diff --git a/python/pyspark/pandas/mlflow.py b/python/pyspark/pandas/mlflow.py
index 094215743e2..469349b37ee 100644
--- a/python/pyspark/pandas/mlflow.py
+++ b/python/pyspark/pandas/mlflow.py
@@ -159,7 +159,7 @@ def load_model(
     ...     lr = LinearRegression()
     ...     lr.fit(train_x, train_y)
     ...     mlflow.sklearn.log_model(lr, "model")
-    LinearRegression(...)
+    LinearRegression...
 
     Now that our model is logged using MLflow, we load it back and apply it on a pandas-on-Spark
     dataframe:
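
The doctest change in python/pyspark/pandas/mlflow.py above swaps the expected output `LinearRegression(...)` for `LinearRegression...`. The sketch below is not part of the commit. It illustrates why the looser pattern passes, assuming the doctest runs with `doctest.ELLIPSIS` enabled. The `got_*` strings are modelled on the failure log in the commit message, and the variable names are chosen only for illustration.

```python
import doctest

checker = doctest.OutputChecker()

# Output printed by the example before the mlflow upgrade (assumed:
# only the fitted estimator's repr appears).
got_before = "LinearRegression()\n"

# Output printed after the upgrade, as shown in the failure log:
# the ModelInfo repr follows the estimator repr.
got_after = (
    "LinearRegression()\n"
    "<mlflow.models.model.ModelInfo object at 0x7fef9578deb0>\n"
)

old_want = "LinearRegression(...)\n"  # expected output before this commit
new_want = "LinearRegression...\n"    # expected output after this commit

flags = doctest.ELLIPSIS

# With ELLIPSIS, "..." matches any substring, but the text after it
# (here the closing parenthesis) must still end the actual output.
old_vs_before = checker.check_output(old_want, got_before, flags)
old_vs_after = checker.check_output(old_want, got_after, flags)
new_vs_before = checker.check_output(new_want, got_before, flags)
new_vs_after = checker.check_output(new_want, got_after, flags)

print(old_vs_before, old_vs_after, new_vs_before, new_vs_after)
```

Because the trailing `...` absorbs everything after `LinearRegression`, the new expected output accepts both the one-line and the two-line output, so the doctest does not depend on what `log_model` returns.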

