This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new b6a98131956c [SPARK-53835][INFRA] Install `pyarrow/torch/torchvision` packages to `Python 3.14` Dockerfile
b6a98131956c is described below
commit b6a98131956ced6c8ac6d547fcb931a04ca30b18
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Mon Oct 27 21:22:08 2025 -0700
[SPARK-53835][INFRA] Install `pyarrow/torch/torchvision` packages to `Python 3.14` Dockerfile
### What changes were proposed in this pull request?
This PR aims to install the `pyarrow/torch/torchvision` packages in the `Python 3.14`
Dockerfile. After this PR, the only missing dependency will be `MLFlow`.
### Why are the changes needed?
These packages now officially support `Python 3.14`.
- https://pypi.org/project/pyarrow/22.0.0/ (2025-10-24)
- https://pypi.org/project/torch/2.9.0/
### Does this PR introduce _any_ user-facing change?
No, this is an infra change.
### How was this patch tested?
Manual review. After merging, the `Python 3.14` CI will provide test coverage
for this.
https://github.com/apache/spark/actions/workflows/build_python_3.14.yml
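For a quick local check before the CI run, something like the following could be executed inside the built image. This is only a minimal sketch and not part of this patch: the helper name `missing_packages` is hypothetical, and it only verifies that the import system can locate the packages, not that they work.

```python
import importlib.util

# Packages this PR adds to the Python 3.14 image (names taken from the diff).
REQUIRED = ("pyarrow", "torch", "torchvision")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Prints an empty list when every required package is installed.
print(missing_packages())
```

Running this with `python3.14` in the image should print an empty list if the installation succeeded.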
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52751 from dongjoon-hyun/SPARK-53835.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
dev/spark-test-image/python-314/Dockerfile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/dev/spark-test-image/python-314/Dockerfile b/dev/spark-test-image/python-314/Dockerfile
index 842a228f05b7..5ab4154dd0f7 100644
--- a/dev/spark-test-image/python-314/Dockerfile
+++ b/dev/spark-test-image/python-314/Dockerfile
@@ -67,7 +67,7 @@ RUN apt-get update && apt-get install -y \
&& rm -rf /var/lib/apt/lists/*
-ARG BASIC_PIP_PKGS="numpy six==1.16.0 pandas==2.3.3 scipy plotly<6.0.0 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2"
+ARG BASIC_PIP_PKGS="numpy pyarrow>=22.0.0 six==1.16.0 pandas==2.3.3 scipy plotly<6.0.0 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2"
# Python deps for Spark Connect
ARG CONNECT_PIP_PKGS="grpcio==1.75.1 grpcio-status==1.71.2 protobuf==5.29.5 googleapis-common-protos==1.65.0 graphviz==0.20.3"
@@ -75,5 +75,6 @@ ARG CONNECT_PIP_PKGS="grpcio==1.75.1 grpcio-status==1.71.2 protobuf==5.29.5 goog
RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.14
RUN python3.14 -m pip install --ignore-installed blinker>=1.6.2 # mlflow needs this
RUN python3.14 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS lxml && \
+ python3.14 -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu && \
python3.14 -m pip install torcheval && \
python3.14 -m pip cache purge