This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.4 by this push:
new 7ff70b11248 [SPARK-44544][INFRA][FOLLOWUP] Force run `run_python_packaging_tests`
7ff70b11248 is described below
commit 7ff70b11248462352ca23e41ae70cd18dc2db0ba
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Thu Jul 27 16:16:03 2023 +0800
[SPARK-44544][INFRA][FOLLOWUP] Force run `run_python_packaging_tests`
### What changes were proposed in this pull request?
Run `run_python_packaging_tests` whenever there are any changes in PySpark.
### Why are the changes needed?
https://github.com/apache/spark/pull/42146 made CI run
`run_python_packaging_tests` only within the `pyspark-errors` module (see
https://github.com/apache/spark/actions/runs/5666118302/job/15359190468 and
https://github.com/apache/spark/actions/runs/5668071930/job/15358091003).

However, it overlooked that `pyspark-errors` may itself be skipped (when there
are no related source changes), so `run_python_packaging_tests` may also be
skipped unexpectedly (see
https://github.com/apache/spark/actions/runs/5666523657/job/15353485731).

This PR runs `run_python_packaging_tests` even if `pyspark-errors` would
otherwise be skipped.
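The mechanism above relies on the test framework selecting a module whenever a changed file matches one of its `source_file_regexes`. A minimal sketch of that selection logic (a simplified illustration, not the actual `dev/sparktestsupport` code; the `MODULES` table and function body here are hypothetical stand-ins):

```python
import re

# Hypothetical, simplified module table: a module is selected when any
# changed file matches one of its source-file regexes. After this change,
# pyspark-errors also matches "python/", so any PySpark change selects it,
# which in turn triggers the Python packaging tests.
MODULES = {
    "pyspark-errors": ["python/", "python/pyspark/errors"],
    "sql": ["sql/"],
}


def determine_modules_for_files(filenames):
    """Return the sorted names of modules whose regexes match any changed file."""
    selected = set()
    for name, regexes in MODULES.items():
        if any(re.match(r, f) for r in regexes for f in filenames):
            selected.add(name)
    return sorted(selected)


print(determine_modules_for_files(["python/pyspark/a.py", "sql/core/foo"]))
# ['pyspark-errors', 'sql']
```

With the extra `"python/"` regex, a change anywhere under `python/` is enough to select `pyspark-errors`, which is the point of this follow-up.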
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Updated CI.
Closes #42173 from zhengruifeng/infra_followup.
Lead-authored-by: Ruifeng Zheng <[email protected]>
Co-authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
(cherry picked from commit f7947341ab2984113018f7f7014bb8373a3cb3b1)
Signed-off-by: Ruifeng Zheng <[email protected]>
---
dev/sparktestsupport/modules.py | 9 ++++++++-
dev/sparktestsupport/utils.py | 2 +-
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/dev/sparktestsupport/modules.py b/dev/sparktestsupport/modules.py
index fd18ddd6d13..ac24ea19d0e 100644
--- a/dev/sparktestsupport/modules.py
+++ b/dev/sparktestsupport/modules.py
@@ -778,7 +778,14 @@ pyspark_pandas_slow = Module(
 pyspark_errors = Module(
     name="pyspark-errors",
     dependencies=[],
-    source_file_regexes=["python/pyspark/errors"],
+    source_file_regexes=[
+        # SPARK-44544: Force the execution of pyspark_errors when there are any changes
+        # in PySpark, since the Python Packaging Tests is only enabled within this module.
+        # This module is the smallest Python test module, it contains only 1 test file
+        # and normally takes < 2 seconds, so the additional cost is small.
+        "python/",
+        "python/pyspark/errors",
+    ],
     python_test_goals=[
         # unittests
         "pyspark.errors.tests.test_errors",
diff --git a/dev/sparktestsupport/utils.py b/dev/sparktestsupport/utils.py
index 6b190eb5ab2..19e6d8917e6 100755
--- a/dev/sparktestsupport/utils.py
+++ b/dev/sparktestsupport/utils.py
@@ -38,7 +38,7 @@ def determine_modules_for_files(filenames):
     and `README.md` is always ignored too.

     >>> sorted(x.name for x in determine_modules_for_files(["python/pyspark/a.py", "sql/core/foo"]))
-    ['pyspark-core', 'sql']
+    ['pyspark-core', 'pyspark-errors', 'sql']
     >>> [x.name for x in determine_modules_for_files(["file_not_matched_by_any_subproject"])]
     ['root']
     >>> [x.name for x in determine_modules_for_files(["appveyor.yml", "sql/README.md"])]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]