This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 44f61d5117fc [SPARK-54943][PYTHON][TESTS][FOLLOW-UP] Disable `test_pyarrow_array_cast`
44f61d5117fc is described below
commit 44f61d5117fc40673bb339612f1b27abe3b781f7
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Thu Jan 29 10:55:12 2026 +0800
[SPARK-54943][PYTHON][TESTS][FOLLOW-UP] Disable `test_pyarrow_array_cast`
### What changes were proposed in this pull request?
Disable `test_pyarrow_array_cast`
### Why are the changes needed?
It is failing all scheduled jobs.
### Does this PR introduce _any_ user-facing change?
no, test-only
### How was this patch tested?
Will monitor the workflows.
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #54049 from zhengruifeng/test_ubuntu_24.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
.github/workflows/build_and_test.yml | 1 +
dev/spark-test-image/python-312/Dockerfile | 29 +++++++++++-------------
python/pyspark/sql/tests/arrow/test_arrow_udf.py | 8 +++----
3 files changed, 18 insertions(+), 20 deletions(-)
diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 77719c322233..26a1f72cdae7 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -640,6 +640,7 @@ jobs:
export SKIP_PACKAGING=false
echo "Python Packaging Tests Enabled!"
fi
+ export PATH="/opt/spark-venv/bin:$PATH"
if [ ! -z "$PYTHON_TO_TEST" ]; then
./dev/run-tests --parallelism 1 --modules "$MODULES_TO_TEST" --python-executables "$PYTHON_TO_TEST"
else
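The exported PATH line makes the virtualenv's interpreters shadow the system ones for the rest of the CI step. A minimal sketch (not part of this commit; the /opt/spark-venv path is taken from the Dockerfile below) of that lookup order, using only the Python standard library:

    import os
    import shutil

    # Prepend the venv bin directory, mirroring the workflow's
    # `export PATH="/opt/spark-venv/bin:$PATH"`.
    os.environ["PATH"] = "/opt/spark-venv/bin" + os.pathsep + os.environ["PATH"]

    # shutil.which() scans PATH left to right, so the venv's python3.12
    # now wins over /usr/bin/python3.12.
    print(shutil.which("python3.12"))  # expected: /opt/spark-venv/bin/python3.12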
diff --git a/dev/spark-test-image/python-312/Dockerfile b/dev/spark-test-image/python-312/Dockerfile
index 4d15f5203124..1a22da26b0e9 100644
--- a/dev/spark-test-image/python-312/Dockerfile
+++ b/dev/spark-test-image/python-312/Dockerfile
@@ -15,9 +15,9 @@
# limitations under the License.
#
-# Image for building and testing Spark branches. Based on Ubuntu 22.04.
+# Image for building and testing Spark branches. Based on Ubuntu 24.04.
# See also in https://hub.docker.com/_/ubuntu
-FROM ubuntu:jammy-20240911.1
+FROM ubuntu:noble
LABEL org.opencontainers.image.authors="Apache Spark project <[email protected]>"
LABEL org.opencontainers.image.licenses="Apache-2.0"
LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For PySpark with Python 3.12"
@@ -41,28 +41,25 @@ RUN apt-get update && apt-get install -y \
libopenblas-dev \
libssl-dev \
openjdk-17-jdk-headless \
+ python3.12 \
+ python3-pip \
+ python3-psutil \
+ python3-venv \
pkg-config \
tzdata \
software-properties-common \
zlib1g-dev
-# Install Python 3.12
-RUN add-apt-repository ppa:deadsnakes/ppa
-RUN apt-get update && apt-get install -y \
- python3.12 \
- && apt-get autoremove --purge -y \
- && apt-get clean \
- && rm -rf /var/lib/apt/lists/*
-
-
-ARG BASIC_PIP_PKGS="numpy pyarrow>=22.0.0 six==1.16.0 pandas==2.3.3 scipy plotly<6.0.0 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2"
-# Python deps for Spark Connect
+ARG BASIC_PIP_PKGS="numpy pyarrow>=22.0.0 six==1.16.0 pandas==2.3.3 scipy plotly<6.0.0 mlflow>=2.8.1 matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2"
ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 protobuf==6.33.0 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20.3"
+ARG TESTING_PIP_PKGS="unittest-xml-reporting lxml coverage"
# Install Python 3.12 packages
-RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12
-RUN python3.12 -m pip install --ignore-installed 'blinker>=1.6.2' # mlflow needs this
-RUN python3.12 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS lxml && \
+ENV VIRTUAL_ENV /opt/spark-venv
+RUN python3.12 -m venv $VIRTUAL_ENV
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"
+
+RUN python3.12 -m pip install $BASIC_PIP_PKGS $CONNECT_PIP_PKGS $TESTING_PIP_PKGS && \
python3.12 -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu && \
python3.12 -m pip install torcheval && \
python3.12 -m pip cache purge
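Installing the packages into a dedicated venv, rather than into the system interpreter as before, likely matters on Ubuntu 24.04: its system Python is marked externally managed (PEP 668), so a bare `pip install` into /usr is refused. A hedged, runnable sketch of how a script can verify it is executing inside such a venv:

    import sys

    # Inside a venv, sys.prefix points at the venv root while
    # sys.base_prefix still points at the underlying installation.
    in_venv = sys.prefix != sys.base_prefix
    print(sys.prefix, in_venv)  # e.g. "/opt/spark-venv True" under the image's venv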
diff --git a/python/pyspark/sql/tests/arrow/test_arrow_udf.py b/python/pyspark/sql/tests/arrow/test_arrow_udf.py
index dcde871da811..f2bdf73f49a2 100644
--- a/python/pyspark/sql/tests/arrow/test_arrow_udf.py
+++ b/python/pyspark/sql/tests/arrow/test_arrow_udf.py
@@ -109,7 +109,7 @@ class ArrowUDFTestsMixin:
"America/Los_Angeles",
"Pacific/Honolulu",
"Europe/Amsterdam",
- "US/Pacific",
+ # "US/Pacific",
]:
with self.sql_conf({"spark.sql.session.timeZone": tz}):
# There is a time-zone conversion in df.collect:
@@ -145,10 +145,10 @@ class ArrowUDFTestsMixin:
return t
expected = [Row(ts=datetime.datetime(2019, 4, 12, 15, 50, 1))]
- self.assertEqual(expected, df.collect())
+ self.assertEqual(expected, df.collect(), tz)
result1 = df.select(identity("ts").alias("ts"))
- self.assertEqual(expected, result1.collect())
+ self.assertEqual(expected, result1.collect(), tz)
def identity2(iter):
for batch in iter:
@@ -157,7 +157,7 @@ class ArrowUDFTestsMixin:
yield batch
result2 = df.mapInArrow(identity2, "ts timestamp")
- self.assertEqual(expected, result2.collect())
+ self.assertEqual(expected, result2.collect(), tz)
def test_arrow_udf_wrong_arg(self):
with self.quiet():
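The test diff does two things: it comments the legacy "US/Pacific" alias out of the time-zone list, and it passes `tz` as the optional `msg` argument of `assertEqual`, so any future mismatch reports which time zone was active. A small self-contained illustration (not from this commit) of that `msg` behavior:

    import unittest

    class MsgDemo(unittest.TestCase):
        def test_msg_names_the_timezone(self):
            tz = "Europe/Amsterdam"
            with self.assertRaises(AssertionError) as cm:
                # The third positional argument is unittest's `msg`;
                # it is appended to the default "1 != 2" failure output.
                self.assertEqual(1, 2, tz)
            self.assertIn(tz, str(cm.exception))

    if __name__ == "__main__":
        unittest.main()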
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]