This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new da0c31c629ca [SPARK-46745][INFRA] Purge pip cache in dockerfile
da0c31c629ca is described below

commit da0c31c629ca378704cb7f97c9983be9b6ca96c4
Author: Ruifeng Zheng <ruife...@apache.org>
AuthorDate: Wed Jan 17 20:11:11 2024 -0800

    [SPARK-46745][INFRA] Purge pip cache in dockerfile
    
    ### What changes were proposed in this pull request?
    Purge the pip cache in `dev/infra/Dockerfile` after the packages for each Python version (3.9, 3.10, 3.11, 3.12) are installed.
    
    ### Why are the changes needed?
    To save 4~5G of disk space:
    
    Before:
    
    https://github.com/zhengruifeng/spark/actions/runs/7541725028/job/20530432798
    
    ```
    #45 [39/39] RUN df -h
    #45 0.090 Filesystem      Size  Used Avail Use% Mounted on
    #45 0.090 overlay          84G   70G   15G  83% /
    #45 0.090 tmpfs            64M     0   64M   0% /dev
    #45 0.090 shm              64M     0   64M   0% /dev/shm
    #45 0.090 /dev/root        84G   70G   15G  83% /etc/resolv.conf
    #45 0.090 tmpfs           7.9G     0  7.9G   0% /proc/acpi
    #45 0.090 tmpfs           7.9G     0  7.9G   0% /sys/firmware
    #45 0.090 tmpfs           7.9G     0  7.9G   0% /proc/scsi
    #45 DONE 2.0s
    ```
    
    After:
    
    https://github.com/zhengruifeng/spark/actions/runs/7549204209/job/20552796796
    
    ```
    #48 [42/43] RUN python3.12 -m pip cache purge
    #48 0.670 Files removed: 392
    #48 DONE 0.7s
    
    #49 [43/43] RUN df -h
    #49 0.075 Filesystem      Size  Used Avail Use% Mounted on
    #49 0.075 overlay          84G   65G   19G  79% /
    #49 0.075 tmpfs            64M     0   64M   0% /dev
    #49 0.075 shm              64M     0   64M   0% /dev/shm
    #49 0.075 /dev/root        84G   65G   19G  79% /etc/resolv.conf
    #49 0.075 tmpfs           7.9G     0  7.9G   0% /proc/acpi
    #49 0.075 tmpfs           7.9G     0  7.9G   0% /sys/firmware
    #49 0.075 tmpfs           7.9G     0  7.9G   0% /proc/scsi
    ```
    ### Does this PR introduce _any_ user-facing change?
    No, infra-only.
    
    ### How was this patch tested?
    CI.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No.
    
    Closes #44768 from zhengruifeng/infra_docker_cleanup.
    
    Authored-by: Ruifeng Zheng <ruife...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .github/workflows/build_and_test.yml | 4 ----
 dev/infra/Dockerfile                 | 8 +++++++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 493ed0c413a9..51bbdb9fcb35 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -417,10 +417,6 @@ jobs:
     - name: Free up disk space
       shell: 'script -q -e -c "bash {0}"'
       run: |
-        if [[ "$MODULES_TO_TEST" != *"pyspark-ml"* ]] && [[ "$BRANCH" != "branch-3.5" ]]; then
-          # uninstall libraries dedicated for ML testing
-          python3.9 -m pip uninstall -y torch torchvision torcheval torchtnt tensorboard mlflow deepspeed
-        fi
         if [ -f ./dev/free_disk_space_container ]; then
           ./dev/free_disk_space_container
         fi
diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index 78814ace9b2e..54f62bbc8202 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -19,7 +19,7 @@
 # See also in https://hub.docker.com/_/ubuntu
 FROM ubuntu:focal-20221019
 
-ENV FULL_REFRESH_DATE 20231117
+ENV FULL_REFRESH_DATE 20240117
 
 ENV DEBIAN_FRONTEND noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN true
@@ -104,6 +104,7 @@ RUN python3.9 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP
 # Add torch as a testing dependency for TorchDistributor and DeepspeedTorchDistributor
 RUN python3.9 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.9 -m pip install deepspeed torcheval
+RUN python3.9 -m pip cache purge
 
 # Install Python 3.10 at the last stage to avoid breaking Python 3.9
 RUN add-apt-repository ppa:deadsnakes/ppa
@@ -114,6 +115,7 @@ RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
 RUN python3.10 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS
 RUN python3.10 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.10 -m pip install deepspeed torcheval
+RUN python3.10 -m pip cache purge
 
 # Install Python 3.11 at the last stage to avoid breaking the existing Python installations
 RUN add-apt-repository ppa:deadsnakes/ppa
@@ -124,6 +126,7 @@ RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.11
 RUN python3.11 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS
 RUN python3.11 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.11 -m pip install deepspeed torcheval
+RUN python3.11 -m pip cache purge
 
 # Install Python 3.12 at the last stage to avoid breaking the existing Python installations
 RUN add-apt-repository ppa:deadsnakes/ppa
@@ -137,3 +140,6 @@ RUN python3.12 -m pip install $BASIC_PIP_PKGS $CONNECT_PIP_PKGS lxml
 RUN python3.12 -m pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
 RUN python3.12 -m pip install torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.12 -m pip install torcheval
+RUN python3.12 -m pip cache purge
+
+RUN df -h
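
The "4~5G" figure in the commit message can be cross-checked against the two `df -h` outputs it quotes. The following is a minimal sketch, not part of the commit: the helper names `parse_df_row` and `reclaimed` are hypothetical, and the two input rows are the `overlay` lines copied from the "before" and "after" logs. Because `df -h` rounds to whole gigabytes, the result is only approximate.

```python
import re

# Root-filesystem rows copied from the two `df -h` outputs in the commit message.
BEFORE = "overlay          84G   70G   15G  83% /"
AFTER = "overlay          84G   65G   19G  79% /"


def parse_df_row(row: str) -> dict:
    """Split one `df -h` row into size/used/avail, in whole gigabytes."""
    fields = row.split()
    # Column order: filesystem, size, used, avail, use%, mount point.
    size, used, avail = (
        int(re.fullmatch(r"(\d+)G", field).group(1)) for field in fields[1:4]
    )
    return {"size": size, "used": used, "avail": avail}


def reclaimed(before: str, after: str) -> dict:
    """Return the drop in used space and the gain in available space (GB)."""
    b, a = parse_df_row(before), parse_df_row(after)
    return {
        "used_drop": b["used"] - a["used"],
        "avail_gain": a["avail"] - b["avail"],
    }


if __name__ == "__main__":
    print(reclaimed(BEFORE, AFTER))  # {'used_drop': 5, 'avail_gain': 4}
```

Used space drops by about 5G and available space grows by about 4G, which matches the range stated in the commit message. The sketch assumes every size field carries a `G` suffix, as in the quoted logs.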


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
