This is an automated email from the ASF dual-hosted git repository.

yikun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 2698d6bf10b [SPARK-40838][INFRA][TESTS] Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest
2698d6bf10b is described below

commit 2698d6bf10b92e71e8af88fedb4e7c9e0f304416
Author: Yikun Jiang <yikunk...@gmail.com>
AuthorDate: Thu Oct 20 15:54:18 2022 +0800

    [SPARK-40838][INFRA][TESTS] Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest

    ### What changes were proposed in this pull request?
    Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest

    ### Why are the changes needed?
    - Upgrade infra base image to `focal-20220922` (currently the latest Ubuntu 20.04 image)
    - Python package versions in the infra image are updated:
      - numpy 1.23.3 --> 1.23.4
      - mlflow 1.28.0 --> 1.29.0
      - matplotlib 3.5.3 --> 3.6.1
      - pip 22.2.2 --> 22.3
      - scipy 1.9.1 --> 1.9.3

      Full list: https://www.diffchecker.com/e6eZZaYn
    - Fix ps.mlflow doctest (due to mlflow upgrade):
    ```
    **********************************************************************
    File "/__w/spark/spark/python/pyspark/pandas/mlflow.py", line 158, in pyspark.pandas.mlflow.load_model
    Failed example:
        with mlflow.start_run():
            lr = LinearRegression()
            lr.fit(train_x, train_y)
            mlflow.sklearn.log_model(lr, "model")
    Expected:
        LinearRegression(...)
    Got:
        LinearRegression()
        <mlflow.models.model.ModelInfo object at 0x7fef9578deb0>
    ```

    ### Does this PR introduce _any_ user-facing change?
    No, dev only

    ### How was this patch tested?
    All CI passed

    Closes #38304 from Yikun/SPARK-40838.

    Authored-by: Yikun Jiang <yikunk...@gmail.com>
    Signed-off-by: Yikun Jiang <yikunk...@gmail.com>
---
 dev/infra/Dockerfile            | 4 ++--
 python/pyspark/pandas/mlflow.py | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index ccf0c932b0e..2a70bd3f98f 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -17,9 +17,9 @@
 
 # Image for building and testing Spark branches. Based on Ubuntu 20.04.
 # See also in https://hub.docker.com/_/ubuntu
-FROM ubuntu:focal-20220801
+FROM ubuntu:focal-20220922
 
-ENV FULL_REFRESH_DATE 20220706
+ENV FULL_REFRESH_DATE 20221019
 
 ENV DEBIAN_FRONTEND noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN true
diff --git a/python/pyspark/pandas/mlflow.py b/python/pyspark/pandas/mlflow.py
index 094215743e2..469349b37ee 100644
--- a/python/pyspark/pandas/mlflow.py
+++ b/python/pyspark/pandas/mlflow.py
@@ -159,7 +159,7 @@ def load_model(
     ...     lr = LinearRegression()
     ...     lr.fit(train_x, train_y)
     ...     mlflow.sklearn.log_model(lr, "model")
-    LinearRegression(...)
+    LinearRegression...
 
     Now that our model is logged using MLflow, we load it back and apply it on a pandas-on-Spark
     dataframe:
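
For reference, the effect of changing the expected output from `LinearRegression(...)` to `LinearRegression...` can be reproduced outside CI with the standard-library `doctest` module. This is a minimal sketch, not part of the commit: it assumes the doctests run with `doctest.ELLIPSIS` enabled, and the helper name `matches` is hypothetical. The `got` string is copied from the CI failure quoted in the commit message.

```python
import doctest

# Output printed by the doctest after the mlflow upgrade (from the CI log):
# the estimator repr, followed by the ModelInfo object that log_model returns.
got = (
    "LinearRegression()\n"
    "<mlflow.models.model.ModelInfo object at 0x7fef9578deb0>\n"
)

checker = doctest.OutputChecker()


def matches(want: str) -> bool:
    """Return True if `want` accepts `got` under ELLIPSIS (assumed flag)."""
    return checker.check_output(want + "\n", got, doctest.ELLIPSIS)


# Old expectation: the text after "..." must still end with ")",
# so the extra ModelInfo line makes the comparison fail.
old_ok = matches("LinearRegression(...)")

# New expectation: a trailing "..." absorbs everything after the prefix,
# including the additional output line.
new_ok = matches("LinearRegression...")

print(old_ok, new_ok)
```

The trailing ellipsis keeps the doctest tolerant of whether or not `mlflow.sklearn.log_model` prints a return value, at the cost of no longer checking anything after the `LinearRegression` prefix.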