This is an automated email from the ASF dual-hosted git repository.

gengliangwang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 6fabcef2ff12 [SPARK-57108][INFRA] Skip core/utils build matrix entry for unrelated changes
6fabcef2ff12 is described below

commit 6fabcef2ff12f75aed18a8309a9efc3b465bf4ad
Author: Gengliang Wang <[email protected]>
AuthorDate: Wed May 27 14:15:56 2026 -0700

    [SPARK-57108][INFRA] Skip core/utils build matrix entry for unrelated changes
    
    ### What changes were proposed in this pull request?
    
    Splits the `core, unsafe, kvstore, avro, utils, utils-java, network-common,
    network-shuffle, repl, launcher, examples, sketch, variant` build matrix
    entry in `build_and_test.yml` into two:
    
    - `core, unsafe, kvstore, utils, utils-java, network-common,
      network-shuffle, sketch, variant, launcher`: foundational modules under
      `common/` (plus `launcher/`) whose only dependency is `tags`.
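    - `avro, repl, examples`: modules that are transitive dependents of
      `sql`/`hive` and naturally need to run when those change. In the diff
      below they do not form an entry of their own: `repl` and `examples` join
      the entry that begins with `mllib-local`, and `avro` joins the entry that
      begins with `streaming`.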
    
    Adds a new `build-core-utils` precondition (computed via `is-changed.py`
    against just the foundational module list) and a matrix `exclude` rule
    that drops the first entry when `build-core-utils == 'false'`. The check
    is opt-out: missing/unset means run, so periodic full-build workflows that
    only set `"build": "true"` continue to run this entry unchanged.
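
The opt-out behaviour can be modelled with a small Python sketch. It assumes the usual GitHub Actions expression semantics (a missing key is null, null is not equal to `'false'`, and `a && b` yields `a` when `a` is falsy); the function names are hypothetical and this is not code from the workflow itself.

```python
import json

# The matrix entry string, as it appears in the exclude expression in the diff.
CORE_UTILS_ENTRY = (
    "core, unsafe, kvstore, utils, utils-java, network-common, "
    "network-shuffle, sketch, variant, launcher"
)


def exclude_value(required_json):
    """Model `fromJson(required).build-core-utils == 'false' && '<entry>'`."""
    required = json.loads(required_json)
    # A missing key behaves like null, which does not equal 'false',
    # so the expression short-circuits to false and excludes nothing.
    if required.get("build-core-utils") == "false":
        return CORE_UTILS_ENTRY
    return False


def remaining_entries(modules, required_json):
    """Drop the matrix entry whose `modules` value equals the exclude value."""
    excluded = exclude_value(required_json)
    return [m for m in modules if m != excluded]
```

Only an explicit `"build-core-utils": "false"` removes the entry; `"true"` or an absent key leaves the matrix unchanged.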
    
    ### Why are the changes needed?
    
    Today, any PR that triggers the `build` job runs every matrix entry,
    including the foundational core/utils group, even when the PR only touches
    SQL or PySpark code. Because `is-changed.py` propagates changes forward
    (to dependents), SQL/PySpark changes never make core/utils tests stale, so
    the runner spend is wasted.
    
    After this change:
    
    - SQL-only / PySpark-only PR → `build-core-utils=false` → the core/utils
      runner is skipped. The `avro, repl, examples` runner still fires because
      those modules are transitive dependents of `sql`/`hive`.
    - Core/utils PR → `build-core-utils=true` → entry runs as before.
    - Periodic full-build workflows (e.g. `build_java21.yml`) → no
      `build-core-utils` key, so the opt-out semantics keep the entry running.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Validated the workflow YAML parses and that the literal string used in the
    `exclude` expression matches the matrix entry string exactly (so the
    exclude rule actually fires).
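
One way to reproduce the string check is sketched below. It is not the author's validation script: `fold_block_scalar` is a hypothetical helper that only approximates YAML `>-` folding for consecutive non-empty lines at the same indentation, which is the shape the matrix entry has in the diff.

```python
def fold_block_scalar(lines):
    """Approximate YAML `>-` folding: join lines with single spaces, no trailing newline."""
    return " ".join(line.strip() for line in lines)


# The two lines of the new matrix entry, as added in the diff.
MATRIX_ENTRY_LINES = [
    "            core, unsafe, kvstore, utils, utils-java,",
    "            network-common, network-shuffle, sketch, variant, launcher",
]

# The literal used in the exclude expression, as added in the diff.
EXCLUDE_LITERAL = (
    "core, unsafe, kvstore, utils, utils-java, network-common, "
    "network-shuffle, sketch, variant, launcher"
)


def exclude_matches_entry():
    """True if the folded matrix entry equals the exclude literal exactly."""
    return fold_block_scalar(MATRIX_ENTRY_LINES) == EXCLUDE_LITERAL
```

A mismatch of even one space or comma would leave the entry in the matrix, because `exclude` compares the strings for equality.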
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Claude Code
    
    Closes #56152 from gengliangwang/infra-skip-core-utils-build.
    
    Authored-by: Gengliang Wang <[email protected]>
    Signed-off-by: Gengliang Wang <[email protected]>
---
 .github/workflows/build_and_test.yml | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index fb55b970e387..6c2606f62683 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -152,9 +152,11 @@ jobs:
             java25=false
           fi
           build=`./dev/is-changed.py -m "core,unsafe,kvstore,avro,utils,utils-java,network-common,network-shuffle,repl,launcher,examples,sketch,variant,api,catalyst,hive-thriftserver,mllib-local,mllib,graphx,streaming,sql-kafka-0-10,streaming-kafka-0-10,streaming-kinesis-asl,kubernetes,hadoop-cloud,spark-ganglia-lgpl,profiler,protobuf,yarn,connect,sql,hive,pipelines"`
+          build_core_utils=`./dev/is-changed.py -m "core,unsafe,kvstore,utils,utils-java,network-common,network-shuffle,sketch,variant,launcher"`
           precondition="
             {
               \"build\": \"$build\",
+              \"build-core-utils\": \"$build_core_utils\",
               \"pyspark\": \"$pyspark\",
               \"pyspark-pandas\": \"$pandas\",
               \"pyspark-install\": \"$pyspark_install\",
@@ -283,16 +285,15 @@ jobs:
         # Note that the modules below are from sparktestsupport/modules.py.
         modules:
           - >-
-            core, unsafe, kvstore, avro, utils, utils-java,
-            network-common, network-shuffle, repl, launcher,
-            examples, sketch, variant
+            core, unsafe, kvstore, utils, utils-java,
+            network-common, network-shuffle, sketch, variant, launcher
           - >-
             api, catalyst, hive-thriftserver
           - >-
-            mllib-local, mllib, graphx, profiler, pipelines
+            mllib-local, mllib, graphx, profiler, pipelines, repl, examples
           - >-
             streaming, sql-kafka-0-10, streaming-kafka-0-10, streaming-kinesis-asl,
-            kubernetes, hadoop-cloud, spark-ganglia-lgpl, protobuf, connect
+            kubernetes, hadoop-cloud, spark-ganglia-lgpl, protobuf, connect, avro
           - yarn
         # Here, we split Hive and SQL tests into some of slow ones and the rest of them.
         included-tags: [""]
@@ -336,6 +337,10 @@ jobs:
           # In practice, the build will run in individual PR, but not against the individual commit
           # in Apache Spark repository.
           - modules: ${{ fromJson(needs.precondition.outputs.required).yarn != 'true' && 'yarn' }}
+          # Skip the core/utils group when a PR doesn't touch those modules (e.g. SQL-only or
+          # PySpark-only changes). The precondition is opt-out: omitted means run (so periodic
+          # full-build workflows that set only "build": "true" keep running this entry).
+          - modules: ${{ fromJson(needs.precondition.outputs.required).build-core-utils == 'false' && 'core, unsafe, kvstore, utils, utils-java, network-common, network-shuffle, sketch, variant, launcher' }}
     env:
       MODULES_TO_TEST: ${{ matrix.modules }}
       EXCLUDED_TAGS: ${{ matrix.excluded-tags }}


