andygrove opened a new pull request, #4410: URL: https://github.com/apache/datafusion-comet/pull/4410
## Which issue does this PR close?

Part of #4406.

## Rationale for this change

The `pr_build_linux` and `pr_build_macos` workflows run a `profiles × suites` test matrix. Two suite buckets, `csv` and `sql`, ran only 1 and 2 test suites respectively. Every matrix job pays a fixed ~5 min overhead (checkout, toolchain setup, native-library download, and a full `mvnw install` recompile), so those buckets were almost entirely overhead: 10 near-empty jobs on Linux and 6 on macOS.

The `fuzz` bucket was organized by test type while every other bucket was organized by functional area, an inconsistency.

And the triggers used `paths-ignore`, so a change to `pr_build_macos.yml` triggered `pr_build_linux.yml` and vice versa.

Per-suite timing was measured from a recent `main` run to size the new buckets so that no job runs much longer than today's ~22 min ceiling.

## What changes are included in this PR?

- Reorganize the test matrix from 7 suite buckets to 4, grouped by functional area: `scans`, `shuffle`, `exec`, `expressions`. The `fuzz`, `csv`, and `sql` buckets are dissolved into these; each fuzz suite moves next to the area it exercises (e.g. `CometFuzzAggregateSuite` into `exec`). The Linux matrix drops from 35 to 20 test jobs; macOS suites go from 7 to 4. The full suite set is unchanged, so `dev/ci/check-suites.py` still passes.
- Merge the 3-job TPC-DS verification matrix (`sort_merge`, `broadcast`, `hash`) into a single job that builds the project once and runs all three join strategies in sequence.
- Switch workflow triggers from `paths-ignore` to a `paths` allow-list covering all source trees, build tooling, and the shared composite actions. Editing `pr_build_macos.yml` no longer triggers the Linux workflow and vice versa; shared composite actions still trigger both.
- Remove the dead `Spark 3.4 + sql` special case, whose `&&` / `||` expression always evaluated to `matrix.suite.value`.

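The job counts quoted above follow from simple matrix arithmetic. The sketch below reproduces them in Python. The Linux profile count is not stated in this description; it is inferred here as 35 jobs / 7 buckets = 5, and the overhead figure uses the approximate ~5 min per-job cost, so treat the result as a rough estimate of runner time rather than a measured saving.

```python
# Figures quoted in the PR description.
FIXED_OVERHEAD_MIN = 5      # approximate fixed setup cost per matrix job
LINUX_JOBS_BEFORE = 35
BUCKETS_BEFORE = 7
BUCKETS_AFTER = 4
NEAR_EMPTY_BUCKETS = 2      # the `csv` and `sql` buckets

# Assumption: inferred from 35 jobs / 7 buckets, not stated in the PR.
linux_profiles = LINUX_JOBS_BEFORE // BUCKETS_BEFORE


def matrix_jobs(profiles: int, buckets: int) -> int:
    """Number of jobs in a full profiles x buckets matrix."""
    return profiles * buckets


near_empty_jobs = matrix_jobs(linux_profiles, NEAR_EMPTY_BUCKETS)
linux_jobs_after = matrix_jobs(linux_profiles, BUCKETS_AFTER)
jobs_removed = LINUX_JOBS_BEFORE - linux_jobs_after

# Fixed overhead no longer paid, in runner-minutes per workflow run.
overhead_saved_min = jobs_removed * FIXED_OVERHEAD_MIN

print(linux_profiles, near_empty_jobs, linux_jobs_after, jobs_removed, overhead_saved_min)
```

The suites themselves still run inside the remaining jobs, so only the per-job setup cost is removed, not the test time.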
The macOS profile matrix is intentionally left untouched here; it is handled separately in #4409.

## How are these changes tested?

The workflow files were validated with `actionlint` (no new findings) and parsed as YAML. Suite coverage was checked by diffing the set of suite class names against `main`: it is identical, so no suite was dropped, and `pr_missing_suites` still passes. CI on this PR exercises the new 4-bucket matrix end to end.
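The suite-coverage check described above amounts to a set difference over suite class names. The following is a minimal sketch of that idea, not the logic of `dev/ci/check-suites.py` or `pr_missing_suites`: the regular expression, the function names, and the sample workflow text (including `ExampleCsvSuite`) are all hypothetical.

```python
import re

# Hypothetical pattern: any bare or dotted identifier ending in "Suite".
SUITE_NAME = re.compile(r"\b[A-Za-z_][\w.]*Suite\b")


def suite_names(workflow_text: str) -> set[str]:
    """Collect every suite class name mentioned in a workflow file."""
    return set(SUITE_NAME.findall(workflow_text))


def coverage_diff(before: str, after: str) -> tuple[set[str], set[str]]:
    """Return (dropped, added) suite names between two workflow versions."""
    old, new = suite_names(before), suite_names(after)
    return old - new, new - old


# Made-up workflow fragments: the same suites, regrouped into new buckets.
before = """
fuzz: CometFuzzAggregateSuite
csv: ExampleCsvSuite
"""
after = """
exec: CometFuzzAggregateSuite
scans: ExampleCsvSuite
"""

dropped, added = coverage_diff(before, after)
print(dropped, added)
```

Both sets being empty means the regrouping moved suites between buckets without dropping or adding any.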
