andygrove opened a new pull request, #4410:
URL: https://github.com/apache/datafusion-comet/pull/4410

   ## Which issue does this PR close?
   
   Part of #4406.
   
   ## Rationale for this change
   
   The `pr_build_linux` and `pr_build_macos` workflows run a 
`profiles × suites` test matrix. Two suite buckets, `csv` and `sql`, ran only 1 and 2 test 
suites respectively. Every matrix job pays a fixed ~5 min overhead (checkout, 
toolchain setup, native-library download, and a full `mvnw install` recompile), 
so those buckets were almost entirely overhead: 10 near-empty jobs on Linux and 
6 on macOS.
   
   The `fuzz` bucket was organized by test type while every other bucket was 
organized by functional area, which was inconsistent. The triggers also used 
`paths-ignore`, so a change to `pr_build_macos.yml` triggered 
`pr_build_linux.yml` and vice versa.
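
The trigger behavior described above can be sketched in Python. This is a rough model, not the workflows' actual configuration: the pattern lists and the composite-action path are hypothetical, and `fnmatchcase` only approximates GitHub's path-filter globbing. It shows why a deny-list lets an edit to the macOS workflow file start the Linux workflow, while an allow-list does not.

```python
from fnmatch import fnmatchcase


def matches_any(path, patterns):
    """True if the path matches at least one glob pattern."""
    return any(fnmatchcase(path, p) for p in patterns)


def runs_with_paths_ignore(changed, ignore):
    # Deny-list: the workflow runs if any changed file is NOT ignored.
    return any(not matches_any(f, ignore) for f in changed)


def runs_with_paths(changed, allow):
    # Allow-list: the workflow runs only if some changed file is listed.
    return any(matches_any(f, allow) for f in changed)


# Hypothetical filters for the Linux workflow (assumptions, not the real lists).
LINUX_IGNORE = ["docs/*", "*.md"]
LINUX_ALLOW = [
    "spark/*",
    "native/*",
    "pom.xml",
    ".github/actions/*",
    ".github/workflows/pr_build_linux.yml",
]

macos_edit = [".github/workflows/pr_build_macos.yml"]
shared_action_edit = [".github/actions/example-action/action.yml"]  # hypothetical path

print(runs_with_paths_ignore(macos_edit, LINUX_IGNORE))  # True: Linux workflow runs
print(runs_with_paths(macos_edit, LINUX_ALLOW))          # False: Linux workflow skipped
print(runs_with_paths(shared_action_edit, LINUX_ALLOW))  # True: shared action still triggers
```

The deny-list fails open: any path nobody thought to ignore triggers the workflow, which is how the sibling workflow file slipped through.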
   
   Per-suite timing was measured from a recent `main` run to size the new 
buckets so that no job runs much longer than today's ~22 min ceiling.
   
   ## What changes are included in this PR?
   
   - Reorganize the test matrix from 7 suite buckets to 4, grouped by 
functional area: `scans`, `shuffle`, `exec`, `expressions`. The `fuzz`, `csv`, 
and `sql` buckets are dissolved into these; each fuzz suite moves next to the 
area it exercises (e.g. `CometFuzzAggregateSuite` into `exec`). The Linux 
matrix drops from 35 to 20 test jobs; macOS suite buckets go from 7 to 4. The full 
suite set is unchanged, so `dev/ci/check-suites.py` still passes.
   - Merge the 3-job TPC-DS verification matrix (`sort_merge`, `broadcast`, 
`hash`) into a single job that builds the project once and runs all three join 
strategies in sequence.
   - Switch workflow triggers from `paths-ignore` to a `paths` allow-list 
covering all source trees, build tooling, and the shared composite actions. 
Editing `pr_build_macos.yml` no longer triggers the Linux workflow and vice 
versa; shared composite actions still trigger both.
   - Remove the dead `Spark 3.4 + sql` special case, whose `&&` / `||` 
expression always evaluated to `matrix.suite.value`.
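
As a sanity check on the job counts above, the matrix arithmetic can be written out. The sketch assumes every profile runs every bucket; the Linux profile count of 5 is inferred from 35 jobs over 7 buckets and is not stated in this description. The overhead figure uses the approximate 5 min per job quoted in the rationale.

```python
def matrix_jobs(profiles, buckets):
    """Number of jobs in a full profiles x buckets matrix."""
    return profiles * buckets


OLD_BUCKETS = 7
NEW_BUCKETS = ["scans", "shuffle", "exec", "expressions"]
OVERHEAD_MIN = 5  # approximate fixed cost per job, from the rationale

# Inferred, not stated: 35 old Linux jobs / 7 buckets = 5 profiles.
linux_profiles = 35 // OLD_BUCKETS

old_jobs = matrix_jobs(linux_profiles, OLD_BUCKETS)
new_jobs = matrix_jobs(linux_profiles, len(NEW_BUCKETS))
near_empty = matrix_jobs(linux_profiles, 2)  # the `csv` and `sql` buckets

print(old_jobs)                              # 35
print(new_jobs)                              # 20
print(near_empty)                            # 10
print((old_jobs - new_jobs) * OVERHEAD_MIN)  # 75 runner-minutes of fixed overhead
```

Under these assumptions, the inferred profile count is consistent with the 10 near-empty Linux jobs mentioned in the rationale.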
   
   The macOS profile matrix is intentionally left untouched here; it is handled 
separately in #4409.
   
   ## How are these changes tested?
   
   The workflow files were validated with `actionlint` (no new findings) and 
parsed as YAML. Suite coverage was checked by diffing the set of suite class 
names against `main`: it is identical, so no suite was dropped, and 
`pr_missing_suites` still passes. CI on this PR exercises the new 4-bucket 
matrix end to end.
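
The coverage check described above amounts to a set comparison, sketched here in Python. This is not the project's own checker script. Apart from `CometFuzzAggregateSuite` moving into `exec`, the suite names and bucket assignments are placeholders chosen for illustration.

```python
def all_suites(matrix):
    """Flatten a {bucket: [suite, ...]} mapping into a set of suite names."""
    return {suite for suites in matrix.values() for suite in suites}


def suite_diff(before, after):
    """Return (dropped, added) suite names between two matrices."""
    old, new = all_suites(before), all_suites(after)
    return sorted(old - new), sorted(new - old)


# Placeholder matrices: only the CometFuzzAggregateSuite move comes from the PR text.
before = {
    "fuzz": ["CometFuzzAggregateSuite"],
    "csv": ["ExampleCsvSuite"],
    "sql": ["ExampleSqlSuiteA", "ExampleSqlSuiteB"],
    "exec": ["ExampleExecSuite"],
}
after = {
    "exec": ["ExampleExecSuite", "CometFuzzAggregateSuite"],
    "scans": ["ExampleCsvSuite"],
    "expressions": ["ExampleSqlSuiteA", "ExampleSqlSuiteB"],
}

dropped, added = suite_diff(before, after)
print(dropped, added)  # [] []
```

Two empty lists mean the regrouping only moved suites between buckets and neither dropped nor introduced any.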


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

