This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 6ab3c802a256 [SPARK-55358][PYTHON][INFRA][FOLLOW-UP] Do not apt-get install `python3-xxx`
6ab3c802a256 is described below

commit 6ab3c802a256159dd8aca5281def3f5253df6caf
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Mon Feb 9 07:20:46 2026 +0900

    [SPARK-55358][PYTHON][INFRA][FOLLOW-UP] Do not apt-get install `python3-xxx`
    
    ### What changes were proposed in this pull request?
    Do not apt-get install `python3-xxx`
    
    ### Why are the changes needed?
    On Ubuntu 24, `apt-get install python3-xxx` also installs python3.12. This is error-prone and does not work with other Python versions from `deadsnakes`, so we should always install Python packages via pip.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    CI
    
    ### Was this patch authored or co-authored using generative AI tooling?
    no
    
    Closes #54197 from zhengruifeng/ubuntu_24_py_12_fu.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 dev/spark-test-image/python-312/Dockerfile | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/dev/spark-test-image/python-312/Dockerfile b/dev/spark-test-image/python-312/Dockerfile
index c88a17399fef..fd789bd298ee 100644
--- a/dev/spark-test-image/python-312/Dockerfile
+++ b/dev/spark-test-image/python-312/Dockerfile
@@ -42,21 +42,24 @@ RUN apt-get update && apt-get install -y \
     libssl-dev \
     openjdk-17-jdk-headless \
     python3.12 \
-    python3-pip \
-    python3-venv \
     pkg-config \
     tzdata \
     software-properties-common \
-    zlib1g-dev
+    zlib1g-dev \
+    && apt-get autoremove --purge -y \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
 
-ARG BASIC_PIP_PKGS="numpy pyarrow>=22.0.0 six==1.16.0 pandas==2.3.3 scipy plotly<6.0.0 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2 pystack>=1.6.0 psutil"
-# Python deps for Spark Connect
-ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 protobuf==6.33.5 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20.3"
+# Setup virtual environment
+ENV VIRTUAL_ENV=/opt/spark-venv
+RUN python3.12 -m venv --without-pip $VIRTUAL_ENV
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"
 
 # Install Python 3.12 packages
-ENV VIRTUAL_ENV /opt/spark-venv
-RUN python3.12 -m venv $VIRTUAL_ENV
-ENV PATH="$VIRTUAL_ENV/bin:$PATH"
+RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12
+
+ARG BASIC_PIP_PKGS="numpy pyarrow>=22.0.0 six==1.16.0 pandas==2.3.3 scipy plotly<6.0.0 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2 pystack>=1.6.0 psutil"
+ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 protobuf==6.33.5 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20.3"
 
 RUN python3.12 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS lxml && \
     python3.12 -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu && \
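
The patch above creates the virtual environment with `--without-pip` and only afterwards bootstraps pip through `get-pip.py`. As a rough illustration of the first step, the sketch below uses the standard library `venv` module to build an environment with `with_pip=False` and then checks which executables ended up in it. The function name `inspect_bare_venv` and the temporary directory are illustrative assumptions, not part of the patch, and the sketch does not download or run `get-pip.py`.

```python
import os
import sys
import tempfile
import venv
from pathlib import Path


def inspect_bare_venv():
    """Create a venv without pip and report (has_python, has_pip).

    Roughly mirrors `python3.12 -m venv --without-pip $VIRTUAL_ENV`,
    but targets a throwaway directory (an assumption for this sketch).
    """
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "spark-venv"
        # with_pip=False skips ensurepip, so no pip is placed in the venv.
        builder = venv.EnvBuilder(with_pip=False, symlinks=(os.name != "nt"))
        builder.create(env_dir)

        bin_dir = env_dir / ("Scripts" if sys.platform == "win32" else "bin")
        names = [p.name.lower() for p in bin_dir.iterdir()]
        has_python = any(n.startswith("python") for n in names)
        has_pip = any(n.startswith("pip") for n in names)
        return has_python, has_pip


if __name__ == "__main__":
    has_python, has_pip = inspect_bare_venv()
    print(f"python present: {has_python}, pip present: {has_pip}")
```

An environment built this way contains an interpreter but no pip, which is presumably why the Dockerfile installs pip separately once the venv's `bin` directory is on `PATH`.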


