This is an automated email from the ASF dual-hosted git repository.

github-merge-queue[bot] pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/texera.git

commit e4f1077a238d491e528b1bb401885f6e82b6274a
Author: Matthew B. <[email protected]>
AuthorDate: Tue Jun 23 14:15:17 2026 -0700

    fix(ci): trim benchmark full grid to fit daily run under 6h timeout (#5905)

    ### What changes were proposed in this PR?

    - Drop `batchSize=10000` from the `full`-mode benchmark grid in `ArrowFlightActorBench.scala`, taking the daily sweep from 36 configs to 27 and removing the 9 heaviest configs (30-70 min each) that pushed the run past GitHub's 6h job ceiling.
    - Update the now-stale "36-config / ~50-60 min" comments to "27-config / ~40 min" in the bench source and `benchmarks.yml`.

    ### Any related issues, documentation, discussions?

    Closes: #5904

    ### How was this PR tested?

    - Non-functional change (benchmark harness grid + CI comments); no shipped behavior and no unit test covers the bench grid contents.
    - CI timing verification: trigger the `Benchmarks` workflow via `workflow_dispatch` on this branch (the only non-schedule trigger that runs `full` mode) and confirm the `Bench` job finishes well under 6h (expected ~40-50 min including compile/setup), reaching the publish steps.

    ### Was this PR authored or co-authored using generative AI tooling?

    Co-authored with Claude Opus 4.8 in compliance with ASF
---
 .github/workflows/benchmarks.yml                              |  8 ++++----
 .../org/apache/texera/amber/bench/ArrowFlightActorBench.scala | 11 ++++++++---
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/.github/workflows/benchmarks.yml b/.github/workflows/benchmarks.yml
index c74d9cfe48..9d6d62672e 100644
--- a/.github/workflows/benchmarks.yml
+++ b/.github/workflows/benchmarks.yml
@@ -45,7 +45,7 @@
 # job summary plus uploaded artifact. Publishing on every merge spammed
 # the repo's Pulse / all-branches commit count with bot commits, so
 # only the scheduled (daily) run persists the baseline now.
-# - schedule (daily): runs the full 36-config sweep and is the sole
+# - schedule (daily): runs the full 27-config sweep and is the sole
 # writer that publishes to gh-pages (the authoritative long-term
 # baseline).
 # - workflow_dispatch: manual full-grid run (no publish; bring-your-own
@@ -53,7 +53,7 @@
 #
 # Two modes via BENCH_MODE env (read by the bench Scala main):
 # pr — 3 configs × 20 batches, ~5 min (PR + push-to-main)
-# full — 36 configs × 200 batches, ~50-60 min (schedule + dispatch)
+# full — 27 configs × 200 batches, ~40 min (schedule + dispatch)
 #
 # Non-blocking: this workflow is NOT included in required-checks.yml's
 # `required-checks` aggregator, so its result doesn't gate merges even
@@ -76,7 +76,7 @@ on:
   schedule:
     # Daily full-grid baseline refresh, 12:00 UTC (05:00 PDT). PR and
     # post-merge runs use a trimmed 3-config grid to stay around 5 min; the
-    # scheduled run covers the full 36-config sweep that the gh-pages
+    # scheduled run covers the full 27-config sweep that the gh-pages
     # dashboard tracks long-term. Daily (rather than weekly) keeps the
     # baseline fresh and accumulates enough data points to average out CI
     # noise; the extra bot commits on gh-pages are intentionally tolerated.
@@ -178,7 +178,7 @@ jobs:
       JAVA_OPTS: -Xms2048M -Xmx2048M -Xss6M -XX:ReservedCodeCacheSize=256M -Dfile.encoding=UTF-8
       JVM_OPTS: -Xms2048M -Xmx2048M -Xss6M -XX:ReservedCodeCacheSize=256M -Dfile.encoding=UTF-8
       # `pr` mode = 3-config trimmed sweep (~5 min) for PR + post-merge.
-      # `full` mode = 36-config sweep (~50-60 min) for schedule + manual.
+      # `full` mode = 27-config sweep (~40 min) for schedule + manual.
       # Read by the bench Scala main (see GridSpec switch); workflow only
       # decides which mode to pass.
       BENCH_MODE: ${{ (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && 'full' || 'pr' }}
diff --git a/amber/src/bench/scala/org/apache/texera/amber/bench/ArrowFlightActorBench.scala b/amber/src/bench/scala/org/apache/texera/amber/bench/ArrowFlightActorBench.scala
index 79d0c8cd7d..0109733589 100644
--- a/amber/src/bench/scala/org/apache/texera/amber/bench/ArrowFlightActorBench.scala
+++ b/amber/src/bench/scala/org/apache/texera/amber/bench/ArrowFlightActorBench.scala
@@ -92,9 +92,14 @@ object ArrowFlightActorBench {
 
   // Sweep grid + iteration counts switch on BENCH_MODE so PR / post-merge
   // checks stay around 5 min while scheduled / manual runs do the full
-  // 36-config grid that the gh-pages dashboard tracks long-term.
+  // 27-config grid that the gh-pages dashboard tracks long-term.
   // pr — 3 configs × 20 batches, warmup 5 (~4-5 min in CI)
-  // full — 36 configs × 200 batches, warmup 20 (~50-60 min in CI)
+  // full — 27 configs × 200 batches, warmup 20 (~40 min in CI)
+  // The batchSize=10000 row was dropped from the full grid: its 9 configs
+  // (3 schemaWidths x 3 stringLens) ran 30-70 min EACH, pushing the daily
+  // run past GitHub's 6 h job ceiling so it timed out before publishing to
+  // gh-pages. The remaining 10/100/1000 rows are ~10-1000x cheaper per
+  // batch, keeping the full sweep well under an hour.
   // BENCH_NUM_BATCHES, if set, overrides numBatches for the current mode
   // (useful for local smoke).
   private val BenchMode: String = sys.env.getOrElse("BENCH_MODE", "full").toLowerCase
@@ -118,7 +123,7 @@ object ArrowFlightActorBench {
       )
     case _ =>
       GridSpec(
-        batchSizes = Seq(10, 100, 1000, 10000),
+        batchSizes = Seq(10, 100, 1000),
         schemaWidths = Seq(1, 10, 50),
         stringLens = Seq(8, 64, 512),
         numBatches = sys.env.get("BENCH_NUM_BATCHES").map(_.toInt).getOrElse(200),
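
The 36 and 27 config counts quoted in this commit follow from the grid being a cross product of its three swept axes. A minimal sketch of that arithmetic, assuming the bench enumerates every (batchSize, schemaWidth, stringLen) combination; the `GridSketch` type is hypothetical and is not part of `ArrowFlightActorBench.scala`:

```scala
// Hypothetical stand-in for the bench's GridSpec; only the three swept axes
// named in the diff are modeled (numBatches and warmup are left out).
final case class GridSketch(
    batchSizes: Seq[Int],
    schemaWidths: Seq[Int],
    stringLens: Seq[Int]
) {
  // Assumption: one config per (batchSize, schemaWidth, stringLen) combination.
  def configs: Seq[(Int, Int, Int)] =
    for {
      b <- batchSizes
      w <- schemaWidths
      s <- stringLens
    } yield (b, w, s)
}

object GridSketch {
  // Axis values copied from the `full`-mode GridSpec before and after this commit.
  val before: GridSketch =
    GridSketch(Seq(10, 100, 1000, 10000), Seq(1, 10, 50), Seq(8, 64, 512))
  val after: GridSketch =
    before.copy(batchSizes = Seq(10, 100, 1000))
}
```

Under that assumption, `GridSketch.before.configs.size` is 4 x 3 x 3 = 36 and `GridSketch.after.configs.size` is 3 x 3 x 3 = 27; the 9 removed configs are exactly those with `batchSize == 10000`.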
