Hey folks, I’m a bit late to this conversation, but is it worth considering GitHub’s merge queue functionality (or a similar offering)? https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/managing-a-merge-queue
With it, we could run a lightweight set of tests on each PR that we expect to catch bugs in most cases. Then, once a PR is approved, the commit that would be pushed to main is created and the full set of tests is run on it. If the run fails, the PR is put back as it was and it’s the author’s responsibility to resolve it; nothing is committed to main. If it passes, the merge (or squash) is pushed to main. There can be multiple concurrent PRs in the queue; each is rebased on the one ahead of it, on the assumption that the earlier one will pass.

This could remove the reliance on a human mechanism for nightly failures and also avoid a whole class of bugs caused by bad merges.

I don’t have the bandwidth to invest in this myself, but I wanted to raise it as something we could invest in.

Danny

From: Russell Spitzer <[email protected]>
Reply to: "[email protected]" <[email protected]>
Date: Tuesday, 30 June 2026 at 16:14
To: "[email protected]" <[email protected]>
Subject: RE: [EXTERNAL] [DISCUSS] Reduce CI runner time by running JDK 21 only on main/nightly

I'm fine with this, but who will get the alerts to fix the build after nightly failures? Just wondering what our human mechanism is for preventing further work / merges until the Java 21 build passes.

On Tue, Jun 30, 2026 at 10:08 AM Kevin Liu <[email protected]> wrote:

Hey folks,

Bumping this thread. Following the "Iceberg Consumption of ASF Shared GitHub-hosted Runners" thread [1], we are proposing to remove JDK 21 from pull_request CI runs and keep only JDK 17. We will still run both JDK 17 and 21 for pushes to main, release branches, and tags. This will halve the PR CI matrix for jobs that currently run on both JDK 17 and 21.

Here's the PR for the change [2], courtesy of Ajantha (thank you!). Please take a look!
Best,
Kevin Liu

[1] https://lists.apache.org/thread/5qno2fklfcxbqs1ckwdhdcjcsr2qg4ln
[2] https://github.com/apache/iceberg/pull/16945

On Thu, Jun 11, 2026 at 5:25 AM Ajantha Bhat <[email protected]> wrote:

Hi,

I already have a PR open to run regular PR builds only on JDK 17 and to add incremental CI builds: https://github.com/apache/iceberg/pull/16566
I haven’t received any review on it yet!

The reason I chose JDK 17 instead of JDK 21 for regular PR builds is that JDK 17 is the lowest supported Java baseline and the project’s bytecode target (https://github.com/apache/iceberg/blob/main/build.gradle#L226). This gives us the best compatibility signal while reducing GitHub runner usage.

To be clear, this does not remove JDK 21 coverage entirely. Builds on the main branch will still run with both JDK 17 and JDK 21, and PRs labeled full-ci will also use both JDK versions.

Related mailing list thread: https://lists.apache.org/thread/36vxlql61gojbg639c86mnz78n57kvgm

- Ajantha

On Thu, Jun 11, 2026 at 4:23 PM Vova Kolmakov <[email protected]> wrote:

Hi all,

Our PR CI currently runs the full test suite on both JDK 17 and JDK 21 for every heavy workflow (spark, flink, java, hive, kafka-connect, delta-conversion). This doubles PR runner-minutes on the shared ASF Actions pool. spark-ci alone expands to 22 matrix jobs, which exceeds the infra max-parallel ceiling of 20 and spills into a second wave.

I would like to propose gating pull_request runs on JDK 17 only (our minimum supported version, and the JDK that already writes the shared Gradle cache), while keeping the full JDK 17 + 21 matrix on push to main, plus optionally a nightly scheduled full-matrix JDK 21 run.
Concretely, the jvm matrix becomes event-conditional, for example:

    jvm: ${{ github.event_name == 'pull_request' && fromJSON('[17]') || fromJSON('[17, 21]') }}

This roughly halves PR runner time across all of the heavy workflows and brings spark-ci back under the 20-job ceiling in a single wave. Caching is unaffected, since the canonical writer stays java-ci build-checks on JDK 17 on main.

The tradeoff is that a JDK-21-only regression would surface at merge time or in the nightly run rather than on the PR itself. To bound that, we could keep a small JDK 21 smoke leg on PRs (for example core-tests only), and/or rely on a nightly full run.

Does the project want to pursue this, and if so which variant: 17-only PRs with a nightly 21 run, or 17-only PRs plus a small 21 smoke subset?

Thanks,
Vova Kolmakov
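
For anyone who wants to check the job-count arithmetic in the proposal above, here is a minimal Python sketch of what the event-conditional matrix does to the number of scheduling waves. It is only an illustration, not the workflow itself. The function names `jvm_matrix` and `waves` are made up for this example, and the split of spark-ci into 11 configurations x 2 JDKs is an assumption: the thread only states the total of 22 jobs and the max-parallel ceiling of 20.

```python
import math


def jvm_matrix(event_name: str) -> list[int]:
    """Mirror the proposed expression: JDK 17 only on pull_request, 17 and 21 otherwise."""
    return [17] if event_name == "pull_request" else [17, 21]


def waves(total_jobs: int, max_parallel: int = 20) -> int:
    """Scheduling waves needed when at most max_parallel jobs run at the same time."""
    return math.ceil(total_jobs / max_parallel)


# ASSUMPTION: every spark-ci job is a (configuration, JDK) pair, so the 22 jobs
# quoted in the thread correspond to 11 configurations x 2 JDKs.
SPARK_CONFIGS = 11

for event in ("pull_request", "push"):
    jobs = SPARK_CONFIGS * len(jvm_matrix(event))
    print(event, jobs, waves(jobs))
```

Under that assumption a pull_request run needs 11 jobs in one wave, while a push to main keeps all 22 jobs and still takes two waves. If some spark-ci legs are not doubled per JDK, the PR count would be higher than 11, so the real numbers should be read from the workflow matrix.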
