Thank you, Bob, for raising this concern regarding the consumption of ASF 
shared resources.

The Apache DataFusion team has already addressed several low-hanging
optimizations, and as a result, daily CI resource consumption has been reduced
by approximately 50%: 24-hour usage fell from 113K minutes a week ago to 55K
minutes over the last 24 hours.

These changes were introduced over the past few days, and the most recent 5-day 
report is already approaching the threshold defined in the ASF policy:
https://infra.apache.org/github-actions-policy.html

Specifically, the latest 5-day report shows 248K minutes compared to the 216K 
threshold, with a clear downward trend continuing as the recent optimizations 
fully propagate through the reporting window.
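
For reference, the arithmetic behind these figures can be checked with a short
Python sketch. The variable names are illustrative only, and the values are
simply the ones quoted above from the reports, not fresh measurements:

```python
# Figures quoted in this message, in runner minutes (rounded as reported).
daily_before = 113_000        # 24-hour usage a week ago
daily_after = 55_000          # 24-hour usage over the last 24 hours
five_day_usage = 248_000      # latest 5-day report
five_day_threshold = 216_000  # 5-day threshold cited from the ASF policy

# Relative drop in 24-hour usage.
daily_reduction = (daily_before - daily_after) / daily_before

# How far the 5-day report still sits above the threshold.
five_day_excess = five_day_usage - five_day_threshold
five_day_excess_ratio = five_day_excess / five_day_threshold

print(f"24-hour reduction: {daily_reduction:.1%}")
print(f"5-day excess: {five_day_excess:,} minutes ({five_day_excess_ratio:.1%})")
```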

In parallel, we are actively working on longer-term improvements that will 
further reduce CI overhead while maintaining project quality and contributor 
experience.

Some of the options currently being evaluated include:

* Offloading long-running CI pipelines to contributor forks, similar to the
  approach used by Apache Spark:
  https://github.com/apache/spark/tree/master/.github
  This would shift execution of certain workflows from ASF-hosted resources to
  contributor-owned resources.
* Refining test matrices to reduce redundant workflow combinations.
* Introducing a gated execution model where selected CI pipelines are triggered
  manually instead of automatically for every commit.

We would also appreciate any additional recommendations you may have based on 
your experience managing CI resource utilization across ASF projects.


Given the significant reduction already achieved and the continuing downward 
trend, would it be possible to extend the current deadline to allow the team 
sufficient time to implement the longer-term optimizations without negatively 
impacting project quality, contributor confidence and user trust?

We appreciate your support and look forward to your feedback.

Oleks V

On 2026/05/22 11:44:16 Robert Thomson wrote:
> Hello, Datafusion PMC.
> 
> In 2024, the ASF introduced the policy for GitHub Actions usage
> across the foundation[1]. The ASF Github shared pool of
> Github-hosted runners has been at, or very close to the limit of
> 900 jobs most of the time in the past few weeks and this is the
> case again today.
> 
> Your project has been identified as being among the top 5 consumers of
> build time over the past 7 days and we request that you bring your
> usage down by stream-lining long-running builds. Contact Infra for
> a consultation if you are unable to streamline your builds further.
> 
> You can use the infra reporting tool[2] to monitor your GHA usage as you
> work on stream-lining, as well as locate any bottlenecks in the workflows.
> 
> Infra will allow you two weeks time (till the 8th of June, 2026) to
> progress this, but should you still be above the limits by then,
> without a viable path forward, we will be limiting your GHA usage.
> 
> Kind regards,
> Bob Thomson, on behalf of ASF Infrastructure.
> 
> 
> [1] https://infra.apache.org/github-actions-policy.html
> [2]
> https://infra-reports.apache.org/#ghactions&project=datafusion&hours=24&limit=15&group=name
> 
