Re: RFR: 8343546: GHA: Cache required dependencies in master-branch workflow

Magnus Ihse Bursie Fri, 04 Jul 2025 07:27:32 -0700

On Fri, 4 Jul 2025 13:21:40 GMT, Aleksey Shipilev <sh...@openjdk.org> wrote:


> In our current GHA workflows, we only run workflows in branches in personal 
> forks. GHA isolation rules say that workflow caches from the parent branches 
> can be used by descendant branches. For our branches, the usual parent is 
> `master`. Since we do not run workflows on `master`, this means every time we 
> create a new branch, GHA would start with logically empty caches for it. Only 
> the next trigger on the same branch would use the caches, saved from the 
> first workflow run.
> 
> This means we put additional load on shared infrastructure with pulling JDKs, 
> building jtreg (and pulling its dependencies), bootstrapping sysroots, etc. 
> All these steps also fail intermittently. It also means 
> everyone carries lots of caches around, segregated by branch and repo (look 
> at your https://github.com/your-github-name/jdk/actions/caches, for 
> example), relying only on cache cleanups once they start to hit 10 GB. With 
> hundreds of contributors, this easily wastes terabytes of cloud storage space.
> 
> We can make all this more efficient and reliable if we manage to run a 
> master-branch workflow that bootstraps all required dependencies and caches 
> them. These dependencies can then be used by PR branches, as the `master` 
> branch is their effective parent.
> 
> This PR introduces the notion of a "dry run", which does everything _except_ 
> the actual builds and tests. It therefore verifies that all dependencies 
> are set up properly for JDK configure to pass. This is useful in itself for 
> future GHA debugging of dependencies. The workflow can now be dispatched 
> with an additional "dry run" parameter.
> 
> What makes master-branch caching possible is the second part of the PR that 
> hooks up dry runs to master/stabilization branch pushes. These would make the 
> dry-run workflow run every time you update your personal fork's 
> master/stabilization branch. That dry run would likely finish very quickly if 
> all caches are already in place. If not, it would populate the caches in the 
> master/stabilization branch of your personal fork.
> 
> The expected net result is that actual PRs that are branched off the personal 
> fork master would be able to use the caches from that master workflow run. 
> (If you want to try this experiment in current GHA, trigger the existing 
> workflow on the `master` branch in your fork; it would do roughly the same, 
> but with all builds/tests.)
> 
> A sample "dry-run" can be seen here: 
> https://github.com/shipilev/jdk/actions/runs/16074619302. The most 
> heavy-weight part is MSYS2 unpacking in Windows builds, and t...
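
For readers skimming the thread, the cache scoping described in the quoted text can be sketched roughly as below. This is only a toy model, not GHA's actual implementation: the function names, the `saved` dictionary, and the `jtreg-deps` key are hypothetical, and it simplifies the rule to "a run may restore caches saved on its own branch or on the default branch".

```python
def visible_cache_scopes(branch, default_branch="master"):
    """Branches whose caches a run on `branch` may restore (simplified model)."""
    scopes = [branch]
    if branch != default_branch:
        # Assumption: the default branch acts as the parent of every other branch.
        scopes.append(default_branch)
    return scopes


def restore(key, branch, saved, default_branch="master"):
    """Return the branch whose cache satisfies `key`, or None on a cache miss.

    `saved` is a hypothetical dict mapping branch name -> set of cache keys
    saved by earlier workflow runs on that branch.
    """
    for scope in visible_cache_scopes(branch, default_branch):
        if key in saved.get(scope, set()):
            return scope
    return None


# No workflow has run on master: a fresh PR branch starts with empty caches.
miss = restore("jtreg-deps", "my-fix", saved={})

# A dry run on master populated the cache: the same PR branch now gets a hit.
hit = restore("jtreg-deps", "my-fix", saved={"master": {"jtreg-deps"}})

# Caches saved on a sibling branch are not visible to another branch.
sibling = restore("jtreg-deps", "my-fix", saved={"other-fix": {"jtreg-deps"}})
```

In this model the first and third lookups miss and only the second one hits, which is the situation the master-branch dry run is meant to create.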

It looks like you managed to get this down to a clear and nice improvement. 
Just remove the debug code and you're good to go :)

.github/workflows/main.yml line 172:

> 170:               fi
> 171: 
> 172:               # FOR TESTING, REMOVE BEFORE INTEGRATION

^^^

-------------

Marked as reviewed by ihse (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/26134#pullrequestreview-2987438070
PR Review Comment: https://git.openjdk.org/jdk/pull/26134#discussion_r2185492128
