RFR: 8343546: GHA: Cache required dependencies in master-branch workflow

Aleksey Shipilev Fri, 04 Jul 2025 06:28:09 -0700

Our current GHA workflows run only in branches of personal forks. GHA isolation 
rules say that workflow caches from the parent branches can be used by 
descendant branches. For our branches, the usual parent is `master`. Since we 
do not run workflows on `master`, every new branch starts with logically empty 
caches. Only the next trigger on the same branch uses the caches saved from the 
first workflow run.
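
The scoping rule above can be illustrated with a toy model. This is only a 
sketch of the lookup order described here (own branch first, then the parent); 
the `CacheStore` class, the branch names, and the `jtreg` key are hypothetical 
and do not reflect how GHA implements caches internally.

```python
from dataclasses import dataclass, field


@dataclass
class CacheStore:
    """Toy model of branch-scoped workflow caches (not GHA's implementation)."""

    parent: str = "master"
    entries: dict = field(default_factory=dict)  # (branch, key) -> payload

    def save(self, branch, key, payload):
        self.entries[(branch, key)] = payload

    def restore(self, branch, key):
        # Look in the branch's own scope first, then fall back to the parent.
        for scope in (branch, self.parent):
            if (scope, key) in self.entries:
                return self.entries[(scope, key)]
        return None  # cache miss


store = CacheStore()

# Today: nothing ever runs on master, so a fresh branch starts cold.
first_run = store.restore("feature-a", "jtreg")
store.save("feature-a", "jtreg", "jtreg build")
second_run = store.restore("feature-a", "jtreg")  # warm, same branch
sibling = store.restore("feature-b", "jtreg")     # cold, caches are per branch

# With a master-branch run populating the parent scope:
store.save("master", "jtreg", "jtreg build")
warm = store.restore("feature-c", "jtreg")        # warm on the very first run
```

In this model, sibling branches never share entries with each other; only an 
entry saved under the parent scope is visible to all of them.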


This means we put additional load on shared infrastructure by pulling JDKs, 
building jtreg (and pulling its dependencies), bootstrapping sysroots, etc. All 
these steps also fail intermittently. It also means everyone carries lots of 
caches around, segregated by branch (look into your 
https://github.com/<github-name>/caches), relying only on cache cleanups that 
kick in when usage starts to hit 10 GB. With 200+ contributors at up to 10 GB 
each, this is easily 2 TB of cloud space we effectively waste in GHA clouds.

We can make all this more reliable if we run a master-branch workflow that 
bootstraps all required dependencies and caches them. These dependencies can 
then be used by PR branches, since `master` is their effective parent.

This PR introduces the notion of a "dry run", which does everything _except_ the 
actual builds and tests. It therefore verifies that all dependencies are set up 
properly for JDK configure to pass. This is useful in itself for future GHA 
debugging of dependencies. The workflow can now be dispatched with an 
additional "dry run" parameter.

What makes master-branch caching possible is the second part of the PR, which 
hooks up dry runs to master/stabilization branch pushes. The dry-run workflow 
then runs every time you update your personal fork's master/stabilization 
branch. If all caches are already in place, that dry run would likely finish 
very quickly. If not, it would populate the caches in the master/stabilization 
branch of your personal fork.
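
The trigger logic described above can be sketched as a small decision 
function. This is a minimal illustration under stated assumptions, not the 
workflow code from the PR: the step names, the `is_dry_run` and 
`planned_steps` helpers, and the stabilization branch name in the usage lines 
are all hypothetical.

```python
# Hypothetical step names; the real workflow files define their own jobs.
PREPARE_STEPS = ["get bootstrap JDK", "get jtreg", "get sysroot", "configure"]
BUILD_TEST_STEPS = ["build", "test"]


def is_dry_run(event, branch, dry_run_input=False, cache_branches=("master",)):
    """Decide whether a run should be a dry run.

    Assumption: pushes to master/stabilization branches are always dry runs,
    and a manual dispatch is a dry run only when the parameter is set.
    """
    if event == "push" and branch in cache_branches:
        return True
    if event == "workflow_dispatch":
        return dry_run_input
    return False


def planned_steps(dry_run):
    """A dry run does everything except the actual builds and tests."""
    return PREPARE_STEPS if dry_run else PREPARE_STEPS + BUILD_TEST_STEPS


# Updating the fork's master only prepares (and caches) dependencies.
master_push = planned_steps(is_dry_run("push", "master"))

# A push to an ordinary feature branch still runs the full pipeline.
feature_push = planned_steps(is_dry_run("push", "my-feature"))

# "stabilization-x" is a made-up stabilization branch name for illustration.
stab_push = is_dry_run(
    "push", "stabilization-x", cache_branches=("master", "stabilization-x")
)
```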

The expected net result is that actual PRs that are branched off the personal 
fork master would be able to use the caches from that branch. (If you want to 
try this experiment in current GHA, trigger the existing workflow on the 
`master` branch in your fork; it would do roughly the same, but with all 
builds/tests.)

A sample "dry-run" can be seen here: 
https://github.com/shipilev/jdk/actions/runs/16074619302. The heaviest part is 
MSYS2 unpacking in Windows builds, and that part is beyond our reach; the 
relevant GH action, maintained by MSYS2 folks, is responsible for managing 
caches for it.

-------------

Commit messages:
 - Single-script dry run
 - Initial version: dry-run from a separate file

Changes: https://git.openjdk.org/jdk/pull/26134/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26134&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8343546
  Stats: 99 lines in 7 files changed: 96 ins; 2 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/26134.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26134/head:pull/26134

PR: https://git.openjdk.org/jdk/pull/26134
