Hi Bob,

Just trying to disambiguate something… are we at #7 for overall use, or on a per-PR basis?
It’s a little hard to separate the problem from a per-run basis (optimizing) vs total use (i.e. lots of PRs flying around causing a larger overall load). I’m hoping things have gotten significantly better per-PR, but the repo has been a hive of activity lately, keeping us in a top overall spot. Still trying to think of places to cut that aren’t detrimental to stability or devex. We’ll keep whittling, and anyone reading this is welcome to provide ideas/suggestions :)

Thanks,

-e-

Evan Rusackas
Preset | preset.io
Apache Superset PMC
On Jun 11, 2026 at 1:55 AM -0700, Bob Thomson <[email protected]>, wrote:
> Hi,
>
> Reviewing the picture today, the GitHub hosted runners utilisation overall is
> still maxing out daily. With respect to Superset we can see over the last 7
> days that the project has dropped out of the top 5, to 7th, which is good
> progress. Your continued work and vigilance on this area helps all projects
> and is appreciated.
>
> Please note that there is now a page for sharing tips and best practice for
> optimising workflows with respect to utilisation:
>
> https://cwiki.apache.org/confluence/display/INFRA/GitHub+Actions+Recommended+Practices
>
> There is also a Slack channel for community discussion and sharing ideas:
> project-workflow-optimisations
>
> Thanks.
>
> Kind regards,
> -Bob Thomson,
> ASF Infrastructure
>
>
> On 2026/06/05 08:09:51 Robert Thomson wrote:
> > Thanks Evan, all sounds like great work, hopefully will make a dent in the
> > jobs in use.
> >
> > Kind regards,
> > -Bob Thomson,
> > ASF Infrastructure
> >
> >
> > On Thu, Jun 4, 2026 at 9:25 PM Evan Rusackas <[email protected]> wrote:
> >
> > > Thanks for the tips.
> > >
> > > For your first suggestion, we took a different route to the same goal,
> > > using a change-detector action and job-level gating, which has advantages
> > > for our setup, but we do use “paths:” in several workflows.
> > >
> > > For the second one, we are using concurrency & cancel-in-progress, so all
> > > set there. However, we’re using “github.run_id” rather than “github.ref”
> > > there, since on push events, run_id lets every commit to master get fully
> > > validated, whereas ref would cancel in-progress master validations when
> > > commits land back-to-back (happening an awful lot right now).
> > >
> > > All the important PRs mentioned in my last email have landed, and we’re
> > > just doing touch-ups now. Hopefully the situation has drastically improved,
> > > though ironically, a ton of PRs need rebasing now, so pardon the CI churn
> > > while we do so with the current backlog.
> > >
> > > Thanks again,
> > >
> > > -e-
> > >
> > > *Evan Rusackas*
> > > Preset | preset.io
> > > On Jun 4, 2026 at 1:09 AM -0700, Bob Thomson <[email protected]>, wrote:
> > >
> > > I have been experimenting with pointing Gemini at public repos and
> > > prompting:
> > >
> > > "Analyse the GitHub Actions workflows in this repo
> > > https://github.com/apache/PROJECT/tree/master/.github and report on
> > > possible causes of long run time/high number of runs of GitHub Actions"
> > >
> > > One output here was:
> > >
> > > The Problem: Changes to frontend UI files (.ts, .tsx, .less) frequently
> > > trigger backend Python unit test runs, and vice versa. Unless paths are
> > > explicitly managed on every configuration entry, the entire testing suite
> > > runs for micro-commits affecting only one side of the stack.
> > > The Fix: Workflows must feature distinct path-routing restrictions:
> > >
> > > And the suggested change was:
> > >
> > > # For frontend workflows
> > > on:
> > >   pull_request:
> > >     paths:
> > >       - 'superset-frontend/**'
> > >
> > > I am no expert on Actions or this project, but thought I'd pass it on in
> > > case it is helpful.
> > >
> > > A second one was:
> > >
> > > concurrency:
> > >   group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
> > >   cancel-in-progress: true
> > >
> > > Which is said to ensure that, when a PR is opened and workflows are
> > > running for it, and a further new commit is made to the same PR, the old
> > > runs from the first commit are then cancelled - otherwise an open PR that
> > > gets 3 more commits pushed results in 3 lots of workflows running for the
> > > one PR, 2 of which are redundant.
> > >
> > > Hope these are useful, or at least food for thought on other possible
> > > streamlining improvements.
> > >
> > > Kind regards,
> > > -Bob Thomson
> > >
> > >
> > > On 2026/06/03 18:22:55 Evan Rusackas wrote:
> > >
> > > Hi Bob (and all)
> > >
> > > Thanks for the heads up on this. I just opened a swath of PRs that should
> > > cut this down significantly. I’m working with PMC members to
> > > assess/touch-up/review/merge:
> > >
> > >
> > > 1. This PR takes us from 6 Cypress runners down to 5, and takes
> > > the /app/prefix smoke test (only running on master now) down from 2 runners
> > > to 1. https://github.com/apache/superset/pull/40717
> > > 2. Cypress runners were all spinning up BEFORE they checked to see if they
> > > were needed. This should fix that:
> > > https://github.com/apache/superset/pull/40718
> > > 3. Gating E2E behind pre-commit. That's such a common failure that we
> > > probably needn't test E2E until it passes. See the caveats here, there are
> > > some visibility and fork-based PR caveats:
> > > https://github.com/apache/superset/pull/40719
> > > 4. Run unit/integration tests on CURRENT python version on PRs, and full
> > > version matrix (3.10-3.12) on master:
> > > https://github.com/apache/superset/pull/40722
> > > 5. Don't run CodeQL checks on docs-only changes:
> > > https://github.com/apache/superset/pull/40724
> > > 6. Cancel-in-progress on a few things that churn needlessly on every
> > > commit: https://github.com/apache/superset/pull/40725
> > > 7. Only build docker on docker-relevant changes:
> > > https://github.com/apache/superset/pull/40723
> > >
> > > There’s an alternate (radical) solution of just NOT running E2E tests on
> > > PRs, but only running them on master. Sure would “nip it in the bud” cost
> > > wise, but has potential repercussions if we don’t keep a close eye on CI on
> > > `master`
> > >
> > > TL;DR: We’re whittling, and will ask for fresh reports (in private ASF
> > > Slack channels, probably) for impact results.
> > >
> > >
> > > -e-
> > >
> > > Evan Rusackas
> > > Preset | preset.io
> > > On Jun 3, 2026 at 10:29 AM -0700, Bob Thomson <[email protected]>,
> > > wrote:
> > >
> > > Fewer parallel runs is essential yes - we are at 900/900 GitHub hosted
> > > runner jobs/slots just now and looking at Superset Actions we can see
> > > nearly 500 completed Superset repo action runs in the last hour, some of
> > > those are up to 25 minutes in execution time, so anything that can be done
> > > to reduce the share of runner jobs used by Superset is an urgent issue when
> > > we are at max jobs on runners on a daily basis now.
> > >
> > > Thanks.
> > >
> > > Kind regards,
> > > -Bob Thomson,
> > > ASF Infrastructure
> > >
> > > On 2026/05/22 19:54:16 Evan Rusackas wrote:
> > >
> > > Hi Bob (and everyone here),
> > >
> > > Thanks for the alert. The unfortunate thing is that this will only get
> > > worse as we create/fix more things (security, dependabot, etc). Things only
> > > seem to be ramping up.
> > >
> > > So, agreed, we must whittle. Cypress is the obvious killer (about half the
> > > consumption). We’ll try to find ways to whittle away at this (we’re
> > > migrating to Playwright, but it takes time). We might also be able to spend
> > > less compute and more time by optimizing (or removing) some parallelization
> > > here.
> > >
> > > We’re also looking at moving from dependabot for all dependency bumps (a
> > > LOT of PRs) to `renovate` - which might optimize things a bit (bumping
> > > dependencies in groups) but we will also need to leave dependabot in place
> > > for security-driven fixes.
> > >
> > > As for Cypress tests, we have some “matrixification” happening, that I
> > > think we can optimize. For the Superset folks reading this, I think we can
> > > split out the “app_root” tests to JUST run on merges to `master` rather
> > > than every PR. That’ll save ~50% right there, we just have to keep a better
> > > eye on CI on `master` (which we haven’t been great about historically, but
> > > we’re getting better).
> > >
> > > Here’s the app_root PR https://github.com/apache/superset/pull/40385
> > >
> > > We can also reduce the E2E parallelization shards from 6 to… I dunno… 3 or
> > > 4. That’ll save a fair bit of setup time spinning up Superset instances.
> > > Tests will run a bit longer, but consume less overall. Seems like a fair
> > > tradeoff.
> > >
> > > Open to other ideas… maybe running fewer GHA workflows in parallel, and
> > > having things more sequentially to fail faster (like nothing runs until
> > > pre-commit passes, for example).
> > >
> > > Also, least importantly, we don’t have the access to see how we stack up
> > > against other projects, but I sure am curious.
> > >
> > > Anyone's thoughts/PRs welcomed.
> > >
> > > Evan Rusackas
> > > Preset | preset.io
> > > On May 22, 2026 at 4:46 AM -0700, Robert Thomson <[email protected]>,
> > > wrote:
> > >
> > > Hello, Superset PMC.
> > >
> > > In 2024, the ASF introduced the policy for GitHub Actions usage
> > > across the foundation[1]. The ASF GitHub shared pool of
> > > GitHub-hosted runners has been at, or very close to the limit of
> > > 900 jobs most of the time in the past few weeks and this is the
> > > case again today.
> > >
> > > Your project has been identified as being among the top 5 consumers of
> > > build time over the past 7 days and we request that you bring your
> > > usage down by stream-lining long-running builds. Contact Infra for
> > > a consultation if you are unable to streamline your builds further.
> > >
> > > You can use the infra reporting tool[2] to monitor your GHA usage as you
> > > work on stream-lining, as well as locate any bottlenecks in the workflows.
> > >
> > > Infra will allow you two weeks' time (till the 8th of June, 2026) to
> > > progress this, but should you still be above the limits by then,
> > > without a viable path forward, we will be limiting your GHA usage.
> > >
> > > Kind regards,
> > > Bob Thomson, on behalf of ASF Infrastructure.
> > >
> > >
> > > [1] https://infra.apache.org/github-actions-policy.html
> > > [2] https://infra-reports.apache.org/#ghactions&project=superset&hours=24&limit=15&group=name
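
The run_id versus ref point from the Jun 4 reply can be sketched with a small simulation. This is only an illustration under stated assumptions, not Superset's actual workflow configuration: the `Run` class, the helper names, the run IDs, and the simplification that every earlier run is still in progress when the next one starts are all hypothetical. The two key functions mirror the concurrency group expression quoted in the thread, once with github.ref as the fallback and once with github.run_id.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Run:
    """One workflow run; fields loosely mirror GitHub Actions context values."""
    run_id: int
    workflow: str
    ref: str
    pr_number: Optional[int] = None  # None for push events


def group_by_ref(run: Run) -> str:
    # Mirrors: github.workflow + "-" + (pull_request.number || github.ref)
    return f"{run.workflow}-{run.pr_number or run.ref}"


def group_by_run_id(run: Run) -> str:
    # Same expression, with github.run_id as the fallback for push events.
    return f"{run.workflow}-{run.pr_number or run.run_id}"


def cancelled_runs(runs: list[Run], key: Callable[[Run], str]) -> list[int]:
    """Return run_ids that cancel-in-progress would cancel, in order.

    Assumption: each earlier run is still in progress when the next arrives.
    """
    in_progress: dict[str, int] = {}
    cancelled: list[int] = []
    for run in runs:
        group = key(run)
        if group in in_progress:
            # A newer run in the same group cancels the older one.
            cancelled.append(in_progress[group])
        in_progress[group] = run.run_id
    return cancelled


# Hypothetical sequence: two pushes to master, two commits on PR 7, one more push.
runs = [
    Run(101, "ci", "refs/heads/master"),
    Run(102, "ci", "refs/heads/master"),
    Run(103, "ci", "refs/pull/7/merge", pr_number=7),
    Run(104, "ci", "refs/pull/7/merge", pr_number=7),
    Run(105, "ci", "refs/heads/master"),
]

print(cancelled_runs(runs, group_by_ref))     # [101, 103, 102]
print(cancelled_runs(runs, group_by_run_id))  # [103]
```

With the ref fallback, back-to-back pushes to master share a group and the earlier validations are cancelled. With the run_id fallback, each push gets its own group, so only the superseded PR run is cancelled.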
