Hi,

Reviewing the picture today, overall utilisation of the GitHub-hosted runners is still maxing out daily. With respect to DataFusion, we can see that over the last 7 days the project has dropped out of the top 5 to number 7, which is good progress. Your continued work and vigilance in this area helps all projects and is appreciated.

Please note that there is now a page for sharing tips and best practice for optimising workflows with respect to utilisation:
https://cwiki.apache.org/confluence/display/INFRA/GitHub+Actions+Recommended+Practices

There is also a Slack channel for community discussion and sharing ideas: project-workflow-optimisations

Thanks.

Kind regards,
-Bob Thomson,
ASF Infrastructure

On 2026/06/02 06:41:09 Robert Thomson wrote:
> Hi,
>
> Yes, I can see Datafusion down to number 2 in the list of top 5 consumers
> now. The top 5 consumers still account for nearly 1/3 of the total
> consumption, so continued consideration and vigilance to keep it to a
> minimum is appreciated, as such effort benefits all ASF projects of course:
> 885/900 jobs in use as I write this.
>
> Thanks.
>
> Kind regards,
> -Bob Thomson,
> ASF Infrastructure
>
>
> On Sat, May 30, 2026 at 2:23 AM Andrew Lamb <[email protected]> wrote:
>
> > Adding Bob back on cc in case he is not monitoring the dev list
> >
> > Andrew
> >
> > On Fri, May 29, 2026 at 1:53 PM Oleks V. <[email protected]> wrote:
> >
> >> Thank you, Bob, for raising this concern regarding the consumption of ASF
> >> shared resources.
> >>
> >> The Apache DataFusion team has already addressed several low-hanging
> >> optimizations, and as a result, daily CI resource consumption has been
> >> reduced by approximately 50%. 24Hr usage decreased from 113K minutes a week
> >> ago to 55K minutes over the last 24 hours.
> >>
> >> These changes were introduced over the past few days, and the most recent
> >> 5-day report is already approaching the threshold defined in the ASF
> >> policy:
> >> https://infra.apache.org/github-actions-policy.html
> >>
> >> Specifically, the latest 5-day report shows 248K minutes compared to the
> >> 216K threshold, with a clear downward trend continuing as the recent
> >> optimizations fully propagate through the reporting window.
> >>
> >> In parallel, we are actively working on longer-term improvements that
> >> will further reduce CI overhead while maintaining project quality and
> >> contributor experience.
> >>
> >> Some of the options currently being evaluated include:
> >>
> >> * Offloading long-running CI pipelines to contributor forks, similar to
> >>   the approach used by Apache Spark:
> >>   https://github.com/apache/spark/tree/master/.github
> >>   This would shift execution of certain workflows from ASF-hosted
> >>   resources to contributor-owned resources.
> >> * Refining test matrices to reduce redundant workflow combinations.
> >> * Introducing a gated execution model where selected CI pipelines are
> >>   triggered manually instead of automatically for every commit.
> >>
> >> We would also appreciate any additional recommendations you may have
> >> based on your experience managing CI resource utilization across ASF
> >> projects.
> >>
> >>
> >> Given the significant reduction already achieved and the continuing
> >> downward trend, would it be possible to extend the current deadline to
> >> allow the team sufficient time to implement the longer-term optimizations
> >> without negatively impacting project quality, contributor confidence and
> >> user trust?
> >>
> >> We appreciate your support and look forward to your feedback.
> >>
> >> Oleks V
> >>
> >> On 2026/05/22 11:44:16 Robert Thomson wrote:
> >> > Hello, Datafusion PMC.
> >> >
> >> > In 2024, the ASF introduced the policy for GitHub Actions usage
> >> > across the foundation[1]. The ASF Github shared pool of
> >> > Github-hosted runners has been at, or very close to the limit of
> >> > 900 jobs most of the time in the past few weeks and this is the
> >> > case again today.
> >> >
> >> > Your project has been identified as being among the top 5 consumers of
> >> > build time over the past 7 days and we request that you bring your
> >> > usage down by stream-lining long-running builds.
> >> > Contact Infra for a consultation if you are unable to streamline your
> >> > builds further.
> >> >
> >> > You can use the infra reporting tool[2] to monitor your GHA usage as you
> >> > work on stream-lining, as well as locate any bottlenecks in the
> >> > workflows.
> >> >
> >> > Infra will allow you two weeks time (till the 8th of June, 2026) to
> >> > progress this, but should you still be above the limits by then,
> >> > without a viable path forward, we will be limiting your GHA usage.
> >> >
> >> > Kind regards,
> >> > Bob Thomson, on behalf of ASF Infrastructure.
> >> >
> >> >
> >> > [1] https://infra.apache.org/github-actions-policy.html
> >> > [2] https://infra-reports.apache.org/#ghactions&project=datafusion&hours=24&limit=15&group=name
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
