Hi all,

I have now created multiple small PRs for the easy and big wins.

- https://github.com/apache/iceberg/pull/16945 CI: Use one Java version for PR checks
- https://github.com/apache/iceberg/pull/16946 CI: Select Spark PR matrix by changed version
- https://github.com/apache/iceberg/pull/16947 CI: Select Flink PR matrix by changed version
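
The idea behind the two matrix-selection PRs can be sketched roughly as follows. This is only an illustration, not the code in those PRs: the helper name `select_versions` and the example version lists are assumptions, and the directory layout (`spark/v4.1/`, `flink/v2.0/`) is taken from the examples quoted further down in this thread.

```python
import re


def select_versions(changed_files, engine, all_versions):
    """Pick the engine versions a PR should test (hypothetical helper).

    If every changed file lives under <engine>/vX.Y/, only those versions
    are returned. Any other change falls back to the full matrix.
    """
    pattern = re.compile(rf"^{re.escape(engine)}/v([0-9.]+)/")
    touched = set()
    for path in changed_files:
        match = pattern.match(path)
        if match is None:
            # A change outside <engine>/vX.Y/ may affect every version.
            return list(all_versions)
        touched.add(match.group(1))
    # Keep the configured order; fall back to everything if nothing matched.
    selected = [v for v in all_versions if v in touched]
    return selected or list(all_versions)


# Illustrative version list only; the real matrix lives in the workflows.
SPARK_VERSIONS = ["3.5", "4.0", "4.1"]

print(select_versions(["spark/v4.1/spark/src/Foo.java"], "spark", SPARK_VERSIONS))
print(select_versions(["core/src/Bar.java"], "spark", SPARK_VERSIONS))
```

The sketch deliberately falls back to the full matrix for anything it does not recognize, so coverage stays on by default and narrowing is the exception.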

Please take a look. I will have two or three more follow up PRs after this to
handle full ci flag and other module incremental builds from the original PR:
https://github.com/apache/iceberg/pull/16566

- Ajantha

On Fri, Jun 19, 2026 at 9:12 PM Manu Zhang <[email protected]> wrote:

> Ajantha, sorry I missed your early email. It will be great to split your
> PR and get the enhancements for Spark CI or Flink CI in first.
>
> Kevin, that's good news!
>
>> CI should generally run by default for relevant changes, with explicit
>> opt-outs where appropriate.
>
> Agreed. I believe there are still low hanging fruits we can pick based on
> Ajantha and others' PRs.
>
> Thanks,
> Manu
>
> On Fri, Jun 19, 2026 at 2:17 AM Kevin Liu <[email protected]> wrote:
>
>> Thanks everyone for all the contributions to reduce CI resource usage.
>> I've seen a number of improvements go in already. I just checked the
>> apache dashboard, it looks like we're still under the ceiling set by ASF,
>> for both 5 day and 7 day periods.
>>
>> There's definitely more room for improvement. But I think we should
>> prioritize correctness and coverage. I would also like to focus on
>> maintainability and avoid patterns that require ongoing manual
>> maintenance to opt changes into CI, since those can quietly reduce coverage
>> over time. CI should generally run by default for relevant changes, with
>> explicit opt-outs where appropriate.
>>
>> Regarding the other repos, I pulled the github action usage data for the
>> past 7 days:
>>
>> Repository                         Workflow runs      Jobs  Runner minutes  % of total
>> apache/iceberg                             3,574    14,909       177,594.8      77.45%
>> apache/iceberg-cpp                         1,455     2,960        26,888.5      11.73%
>> apache/iceberg-rust                        1,078     3,416        18,196.7       7.94%
>> apache/iceberg-python                        594     1,445         3,387.4       1.48%
>> apache/iceberg-go                            633     1,188         3,154.1       1.38%
>> apache/terraform-provider-iceberg             42        48            71.0       0.03%
>> *Total*                                  *7,376*  *23,966*     *229,292.5*   *100.00%*
>>
>> Looks like java repo is still the top contributor :)
>>
>> Best,
>> Kevin Liu
>>
>> On Thu, Jun 18, 2026 at 6:39 AM Ajantha Bhat <[email protected]> wrote:
>>
>>> Hi Manu, all of these were handled in the parent PR I mentioned three
>>> weeks ago.
>>> Can we all please review this?
>>> https://github.com/apache/iceberg/pull/16566
>>>
>>> I can split into smaller PRs if required.
>>>
>>> On Thu, Jun 18, 2026 at 1:59 PM Manu Zhang <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Here's another quick win from scoping Spark CI to only changed Spark
>>>> versions [1]. We usually open a PR first against the latest Spark version
>>>> and then back-port it to previous versions after the merge. Running Spark
>>>> CI for all Spark versions in such cases wastes resources.
>>>>
>>>> If this approach is approved, I can also make a PR for Flink CI.
>>>>
>>>>
>>>> 1. https://github.com/apache/iceberg/pull/16800
>>>>
>>>> Thanks,
>>>> Manu
>>>>
>>>> On Sat, Jun 13, 2026 at 8:34 AM Abnob Doss <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> A potential small win from the subproject side: the iceberg-rust
>>>>> Python bindings CI had ended up building the Rust bindings twice per run,
>>>>> due to an accidental interaction between a few changes over time.
>>>>> One-line fix:
>>>>> https://github.com/apache/iceberg-rust/pull/2636
>>>>>
>>>>> Measured over the past 7 days, the duplicate build took a median of
>>>>> 8.4 min on Linux, 12.1 min on macOS, and 15.3 min on Windows, totaling
>>>>> about 2,400 runner-minutes across 207 job executions. After the fix the
>>>>> same step takes a few seconds.
>>>>>
>>>>> Thanks,
>>>>> Abanoub
>>>>>
>>>>> On Wednesday, June 3rd, 2026 at 9:49 AM, Bob Thomson <[email protected]> wrote:
>>>>>
>>>>> > I don't think we have data to that level of granularity, it's a case
>>>>> > of looking at the Actions and their run time and frequency of execution
>>>>> > in each of your repos, and focussing on the longest running and most
>>>>> > frequent ones. That is, an Action run might only run for 5 minutes each
>>>>> > time, but if it is running 400 times a day then that occupies more than
>>>>> > one job slot of the total of 900 ASF has, for the duration of that day.
>>>>> > Experience so far suggests those actions that build Java are often
>>>>> > the most time consuming.
>>>>> >
>>>>> > Thanks.
>>>>> >
>>>>> > Kind regards,
>>>>> > -Bob Thomson.
>>>>> >
>>>>> > On 2026/06/01 18:39:38 Yufei Gu wrote:
>>>>> > > Hi Bob,
>>>>> > >
>>>>> > > Thanks for the heads-up and for giving the Iceberg community time
>>>>> > > to work on this.
>>>>> > >
>>>>> > > One question: Is the concern based on the overall GitHub Actions
>>>>> > > consumption of the Iceberg projects (e.g., main repo, python repo,
>>>>> > > go repo, etc), or only for the main Iceberg repository? Iceberg has
>>>>> > > multiple repositories, including the main repository as well as
>>>>> > > Python, Go, Rust, and C++ subprojects. Most of the discussion and
>>>>> > > optimization work in this thread focuses on the main repository,
>>>>> > > where the majority of CI usage occurs.
>>>>> > > If the overall project usage is within acceptable limits, would it
>>>>> > > be possible to allow a higher quota for a single repo (the Iceberg
>>>>> > > main repository), given its broader compatibility and integration
>>>>> > > testing requirements?
>>>>> > >
>>>>> > > Yufei
>>>>> > >
>>>>> > >
>>>>> > > On Mon, Jun 1, 2026 at 11:00 AM Steve Loughran <[email protected]> wrote:
>>>>> > >
>>>>> > > > This is really good for draft builds.
>>>>> > > >
>>>>> > > > If I'm committing and pushing work up to a WiP PR, it is often
>>>>> > > > because I want *a* machine to do the testing; I don't care who it
>>>>> > > > runs as.
>>>>> > > >
>>>>> > > > Forcing PRs to run as the submitter also hardens the OSS repo
>>>>> > > > against vulnerabilities in the Github Actions and other parts of
>>>>> > > > the build process.
>>>>> > > >
>>>>> > > > On Mon, 1 Jun 2026 at 17:11, Prashant Singh <[email protected]> wrote:
>>>>> > > >
>>>>> > > >> Hi all,
>>>>> > > >>
>>>>> > > >> Great progress on the matrix reduction, incremental builds, and
>>>>> > > >> draft PR skipping ideas. I'd like to propose a complementary
>>>>> > > >> approach that can work alongside all of those: running PR CI on
>>>>> > > >> contributor fork compute instead of the ASF shared pool.
>>>>> > > >>
>>>>> > > >> How it works:
>>>>> > > >>
>>>>> > > >> Workflows switch from pull_request to push triggers on non-main
>>>>> > > >> branches. Each workflow:
>>>>> > > >>
>>>>> > > >> 1. Checks out apache/iceberg main (security boundary: untrusted
>>>>> > > >>    code can't modify the workflow itself)
>>>>> > > >> 2. Squash-merges the contributor's fork branch on top
>>>>> > > >> 3. Runs tests on that merged tree
>>>>> > > >>
>>>>> > > >> Because the push event fires on the fork, GitHub bills the CI
>>>>> > > >> minutes to the fork owner's account - not the ASF shared pool.
>>>>> > > >> This takes Iceberg's PR CI usage from the ASF runners to
>>>>> > > >> effectively zero, regardless of matrix size.
>>>>> > > >>
>>>>> > > >> Why this is complementary:
>>>>> > > >>
>>>>> > > >> The optimizations discussed so far all reduce how much CI runs.
>>>>> > > >> Fork-compute changes where it runs. They compose - a leaner
>>>>> > > >> matrix running on fork compute is strictly better than either
>>>>> > > >> approach alone.
>>>>> > > >>
>>>>> > > >> Inline PR status:
>>>>> > > >>
>>>>> > > >> A lightweight notify_test_workflow.yml (using pull_request_target
>>>>> > > >> + Checks API) is included to post fork CI results directly onto
>>>>> > > >> the upstream PR's checks tab - so reviewers see green/red status
>>>>> > > >> inline as they do today.
>>>>> > > >>
>>>>> > > >> *Prior art*:
>>>>> > > >>
>>>>> > > >> Apache Spark adopted this pattern in 2024 (SPARK-47041) and has
>>>>> > > >> been running it in production since. Their full Spark CI matrix
>>>>> > > >> runs entirely on contributor forks.
>>>>> > > >>
>>>>> > > >> PR: https://github.com/apache/iceberg/pull/15397: covers all 10
>>>>> > > >> workflow files. I've verified all workflows pass on fork
>>>>> > > >> computation.
>>>>> > > >>
>>>>> > > >> This could be merged independently of the matrix/incremental
>>>>> > > >> optimizations and would immediately eliminate PR CI pressure on
>>>>> > > >> the ASF pool - well within the June 8 deadline.
>>>>> > > >>
>>>>> > > >> Thoughts?
>>>>> > > >>
>>>>> > > >> Prashant Singh
>>>>> > > >>
>>>>> > > >> On Fri, May 29, 2026 at 8:47 PM Renjie Liu <[email protected]> wrote:
>>>>> > > >>
>>>>> > > >>> I like the idea of cutting supported jvm runs in each ci. JVM
>>>>> > > >>> has great backward compatibility, and we run on one jvm (maybe
>>>>> > > >>> jvm 17) and trigger a nightly run for jvm 21.
>>>>> > > >>>
>>>>> > > >>> On Wed, May 27, 2026 at 3:17 AM Steve Loughran <[email protected]> wrote:
>>>>> > > >>>
>>>>> > > >>>>
>>>>> > > >>>> Doing a scan of the aws-sdk bundle.jar is halfway to an audit
>>>>> > > >>>> of the maven repo, with spark the other half.
>>>>> > > >>>>
>>>>> > > >>>> It seems to me that only PRs which go near
>>>>> > > >>>> gradle/libs.versions.toml are going to change dependencies, so
>>>>> > > >>>> introduce new CVEs.
>>>>> > > >>>>
>>>>> > > >>>> There's the separate issue "CVEs are eternal" and all existing
>>>>> > > >>>> dependencies are collections of undiscovered/unreported cves.
>>>>> > > >>>> That's dependabot's homework, generally.
>>>>> > > >>>>
>>>>> > > >>>>
>>>>> > > >>>> On Tue, 26 May 2026 at 19:49, Kevin Liu <[email protected]> wrote:
>>>>> > > >>>>
>>>>> > > >>>>> Thanks everyone for the great ideas.
>>>>> > > >>>>>
>>>>> > > >>>>> Here's where we stand today with respect to ASF runner usage
>>>>> > > >>>>> (taken from the link [2] above):
>>>>> > > >>>>> GitHub Actions Build Time Used
>>>>> > > >>>>> - past 7 days total usage: 218,321 minutes
>>>>> > > >>>>> - past 5 days total usage: 120,241 minutes
>>>>> > > >>>>>
>>>>> > > >>>>> *This puts us below the hard ceiling for resource usage* as
>>>>> > > >>>>> described by https://infra.apache.org/github-actions-policy.html
>>>>> > > >>>>>
>>>>> > > >>>>> > The average number of minutes a project uses *per calendar
>>>>> > > >>>>> > week MUST NOT exceed the equivalent of 25 full-time runners
>>>>> > > >>>>> > (250,000 minutes, or 4,200 hours)*.
>>>>> > > >>>>> > The average number of minutes a project uses *in any
>>>>> > > >>>>> > consecutive five-day period MUST NOT exceed the equivalent
>>>>> > > >>>>> > of 30 full-time runners (216,000 minutes, or 3,600 hours)*.
>>>>> > > >>>>>
>>>>> > > >>>>> We should still make improvements wherever possible.
>>>>> > > >>>>>
>>>>> > > >>>>> I have a few PRs to reduce CI usage further.
>>>>> > > >>>>> - CI: Limit CVE scan runs to relevant changes #16513
>>>>> > > >>>>> - Build: Simplify CI workflow path filters to avoid
>>>>> > > >>>>>   per-workflow maintenance #16302
>>>>> > > >>>>>
>>>>> > > >>>>> There are a couple of heuristics we can use
>>>>> > > >>>>> 1. Don't run CI if not needed. For example, `site/` dir
>>>>> > > >>>>>    changes shouldn't trigger Spark/Flink/Java CI. This might
>>>>> > > >>>>>    be optimized already, but we should double check just in
>>>>> > > >>>>>    case.
>>>>> > > >>>>> 2. If we must run CI, fail fast. For example, if there is a
>>>>> > > >>>>>    formatter issue, fail all inflight CI tasks.
>>>>> > > >>>>> 3. Within a specific CI workflow, reduce the matrix wherever
>>>>> > > >>>>>    possible. Do we really need to run all "Java versions" x
>>>>> > > >>>>>    "Scala versions" x "Spark versions"?
>>>>> > > >>>>> 4. Improve individual CI tasks. Spark CI dominates 57% of all
>>>>> > > >>>>>    resource usage. I have a tracking issue where I benchmarked
>>>>> > > >>>>>    where all that time is spent. See
>>>>> > > >>>>>    https://github.com/apache/iceberg/issues/16397
>>>>> > > >>>>>
>>>>> > > >>>>> Top CI tasks as % of resource use:
>>>>> > > >>>>> - Spark CI: 57.68%
>>>>> > > >>>>> - Flink CI: 13.60%
>>>>> > > >>>>> - Java CI: 7.02%
>>>>> > > >>>>> - CVE Scan: 3.13%
>>>>> > > >>>>>
>>>>> > > >>>>> Best,
>>>>> > > >>>>> Kevin Liu
>>>>> > > >>>>>
>>>>> > > >>>>> On Tue, May 26, 2026 at 5:35 AM Ajantha Bhat <[email protected]> wrote:
>>>>> > > >>>>>
>>>>> > > >>>>>> Hi all,
>>>>> > > >>>>>>
>>>>> > > >>>>>> How about implementing the incremental PR builder?
>>>>> > > >>>>>> (similar to
>>>>> > > >>>>>> https://github.com/gitflow-incremental-builder/gitflow-incremental-builder
>>>>> > > >>>>>> )
>>>>> > > >>>>>>
>>>>> > > >>>>>> I think one of the main causes of GitHub runner pressure in
>>>>> > > >>>>>> Iceberg is the breadth of our CI matrix. We support multiple
>>>>> > > >>>>>> languages (java, python, go, rust, cpp) and integrations, and
>>>>> > > >>>>>> for Java we test across multiple JVM versions, Spark
>>>>> > > >>>>>> versions, Flink versions, Kafka, Hive/MR, REST/OpenAPI,
>>>>> > > >>>>>> runtime bundles, and more. That coverage is valuable, but
>>>>> > > >>>>>> running most of it for every PR is expensive and increases
>>>>> > > >>>>>> both runner usage and CI wall time.
>>>>> > > >>>>>>
>>>>> > > >>>>>> I think the biggest win can be achieved by having an
>>>>> > > >>>>>> incremental PR build.
>>>>> > > >>>>>> We already have useful building blocks for it: Gradle build
>>>>> > > >>>>>> cache, path filters, and version-selective build properties
>>>>> > > >>>>>> like -DsparkVersions and -DflinkVersions.
>>>>> > > >>>>>>
>>>>> > > >>>>>> The idea is to keep full coverage on main, release branches,
>>>>> > > >>>>>> tags, and global build changes, but make PR CI depend on the
>>>>> > > >>>>>> files changed:
>>>>> > > >>>>>>
>>>>> > > >>>>>> - Spark-only changes run Spark CI, not Flink/Hive/Kafka.
>>>>> > > >>>>>> - spark/v4.1/** changes run only Spark 4.1, not every Spark
>>>>> > > >>>>>>   version.
>>>>> > > >>>>>> - flink/v2.0/** changes run only Flink 2.0, not every Flink
>>>>> > > >>>>>>   version.
>>>>> > > >>>>>> - API/Core/Data/File format changes run the owning Java
>>>>> > > >>>>>>   checks plus selected downstream canaries, such as latest
>>>>> > > >>>>>>   Spark and latest Flink, instead of the full engine matrix.
>>>>> > > >>>>>> - Runtime/bundle CVE checks run only for affected runtime
>>>>> > > >>>>>>   artifacts.
>>>>> > > >>>>>> - A full-ci label or global Gradle/workflow changes can still
>>>>> > > >>>>>>   force the full matrix.
>>>>> > > >>>>>>
>>>>> > > >>>>>>
>>>>> > > >>>>>> Another possible optimization is JVM coverage. Today many PR
>>>>> > > >>>>>> jobs run across both Java 17 and Java 21. We could consider
>>>>> > > >>>>>> running one primary JVM for PRs, and reserve the full JVM
>>>>> > > >>>>>> matrix for main, release branches, nightly/scheduled builds,
>>>>> > > >>>>>> or PRs labeled full-ci. That would further reduce runner
>>>>> > > >>>>>> usage and PR wall time, while still preserving broad
>>>>> > > >>>>>> compatibility coverage before changes become part of the main
>>>>> > > >>>>>> branch.
>>>>> > > >>>>>>
>>>>> > > >>>>>> A practical approach could be:
>>>>> > > >>>>>>
>>>>> > > >>>>>> PRs: incremental module/version selection, mostly one JVM,
>>>>> > > >>>>>> plus targeted canaries.
>>>>> > > >>>>>> main: full matrix across JVMs, Spark versions, Flink
>>>>> > > >>>>>> versions, and runtime checks.
>>>>> > > >>>>>> Manual override: full-ci label for risky or cross-cutting
>>>>> > > >>>>>> PRs.
>>>>> > > >>>>>>
>>>>> > > >>>>>> This should reduce queue time, lower GitHub runner
>>>>> > > >>>>>> consumption, and give contributors faster feedback without
>>>>> > > >>>>>> giving up full coverage where it matters most.
>>>>> > > >>>>>>
>>>>> > > >>>>>> I am working on a POC
>>>>> > > >>>>>> https://github.com/apache/iceberg/pull/16566
>>>>> > > >>>>>> Suggestions are welcome.
>>>>> > > >>>>>>
>>>>> > > >>>>>> - Ajantha
>>>>> > > >>>>>>
>>>>> > > >>>>>> On Mon, May 25, 2026 at 7:35 PM Junwang Zhao <[email protected]> wrote:
>>>>> > > >>>>>>
>>>>> > > >>>>>>> Hi Manu,
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> On Mon, May 25, 2026 at 9:33 PM Manu Zhang <[email protected]> wrote:
>>>>> > > >>>>>>> >
>>>>> > > >>>>>>> > Hi Junwang,
>>>>> > > >>>>>>> >
>>>>> > > >>>>>>> > Not sure about others but I usually only change status to
>>>>> > > >>>>>>> > "Ready for review" when CI has passed.
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> Yeah, I agree there are trade-offs to disabling gh actions
>>>>> > > >>>>>>> for draft PRs.
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> Reasons to Disable:
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> - Cost savings: large teams and monorepos can burn through
>>>>> > > >>>>>>>   GitHub Actions minutes quickly. Skipping CI for draft PRs
>>>>> > > >>>>>>>   avoids spending resources on code that may not even
>>>>> > > >>>>>>>   compile yet.
>>>>> > > >>>>>>> - Reduced noise: draft PRs are often used for
>>>>> > > >>>>>>>   experimentation or work-in-progress changes. Disabling CI
>>>>> > > >>>>>>>   avoids cluttering the PR timeline with transient failures
>>>>> > > >>>>>>>   while the author is still iterating.
>>>>> > > >>>>>>> - Better resource utilization: orgs with limited self-hosted
>>>>> > > >>>>>>>   runners may prefer to prioritize "Ready for Review" PRs so
>>>>> > > >>>>>>>   production-relevant changes get feedback and merge
>>>>> > > >>>>>>>   capacity sooner.
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> Reasons to Keep:
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> - Early error detection: developers can use draft PRs as a
>>>>> > > >>>>>>>   sandbox to validate builds and tests before requesting
>>>>> > > >>>>>>>   review.
>>>>> > > >>>>>>> - Self-correction: failed checks on a draft PR allow authors
>>>>> > > >>>>>>>   to fix lint or test issues before involving reviewers.
>>>>> > > >>>>>>> - Higher review confidence: by the time a PR is marked
>>>>> > > >>>>>>>   "Ready for Review", CI has often already passed at least
>>>>> > > >>>>>>>   once, leading to a smoother review process.
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> For myself, when I create a draft PR, I'm usually sharing
>>>>> > > >>>>>>> early work-in-progress code with other developers and may
>>>>> > > >>>>>>> not have tested it thoroughly locally yet, so I sometimes
>>>>> > > >>>>>>> prefer to disable CI. That's just my personal preference
>>>>> > > >>>>>>> though.
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> >
>>>>> > > >>>>>>> > Regards,
>>>>> > > >>>>>>> > Manu
>>>>> > > >>>>>>> >
>>>>> > > >>>>>>> > On Mon, May 25, 2026 at 3:21 PM Junwang Zhao <[email protected]> wrote:
>>>>> > > >>>>>>> >>
>>>>> > > >>>>>>> >> On Mon, May 25, 2026 at 11:20 AM Junwang Zhao <[email protected]> wrote:
>>>>> > > >>>>>>> >> >
>>>>> > > >>>>>>> >> > On Sun, May 24, 2026 at 12:13 PM Steven Wu <[email protected]> wrote:
>>>>> > > >>>>>>> >> > >
>>>>> > > >>>>>>> >> > > Kevin's PR of removing Spark 3.4 was merged a few
>>>>> > > >>>>>>> >> > > days ago. It should reduce the Spark CI cost by ~25%.
>>>>> > > >>>>>>> >> > >
>>>>> > > >>>>>>> >> > > Some heavy-hitter test classes in Spark tests (core
>>>>> > > >>>>>>> >> > > and extension) cause high load due to parameter
>>>>> > > >>>>>>> >> > > combinations. I asked AI to analyze the build log and
>>>>> > > >>>>>>> >> > > recommend changes offering the best ROI. Details are
>>>>> > > >>>>>>> >> > > in this doc.
>>>>> > > >>>>>>> >> > >
>>>>> > > >>>>>>> >> > > I can look into dropping some combinations without
>>>>> > > >>>>>>> >> > > sacrificing essential coverage. E.g., we can probably
>>>>> > > >>>>>>> >> > > drop the Hadoop catalog usage in test, as it wasn't
>>>>> > > >>>>>>> >> > > recommended for production use anyway.
>>>>> > > >>>>>>> >> >
>>>>> > > >>>>>>> >> > iceberg-cpp skips Actions for draft PRs [1] to reduce CI
>>>>> > > >>>>>>> >> > resource usage a little bit. Perhaps we should apply the
>>>>> > > >>>>>>> >> > same approach across all iceberg subprojects?
>>>>> > > >>>>>>> >> >
>>>>> > > >>>>>>> >> > [1] https://github.com/apache/iceberg-cpp/pull/680
>>>>> > > >>>>>>> >>
>>>>> > > >>>>>>> >> I've created a PR to show that, see [1], since it's a
>>>>> > > >>>>>>> >> draft, the CI won't run. If I click the `Ready for review`
>>>>> > > >>>>>>> >> button, the actions will be triggered. Let me know what
>>>>> > > >>>>>>> >> you think about it.
>>>>> > > >>>>>>> >>
>>>>> > > >>>>>>> >> [1] https://github.com/apache/iceberg/pull/16561
>>>>> > > >>>>>>> >>
>>>>> > > >>>>>>> >> >
>>>>> > > >>>>>>> >> > >
>>>>> > > >>>>>>> >> > >
>>>>> > > >>>>>>> >> > >
>>>>> > > >>>>>>> >> > > On Fri, May 22, 2026 at 8:22 AM Matt Butrovich <[email protected]> wrote:
>>>>> > > >>>>>>> >> > >>
>>>>> > > >>>>>>> >> > >> Apache DataFusion similarly received this notice. For
>>>>> > > >>>>>>> >> > >> visibility to the Iceberg community, we have tracking
>>>>> > > >>>>>>> >> > >> issues to try to discuss solutions:
>>>>> > > >>>>>>> >> > >>
>>>>> > > >>>>>>> >> > >> https://github.com/apache/datafusion/issues/22455
>>>>> > > >>>>>>> >> > >> https://github.com/apache/datafusion-comet/issues/4406
>>>>> > > >>>>>>> >> > >>
>>>>> > > >>>>>>> >> > >> DataFusion Comet is consuming the vast majority of
>>>>> > > >>>>>>> >> > >> DataFusion resources, and like the Iceberg project
>>>>> > > >>>>>>> >> > >> it's due to Spark tests (and Iceberg's Spark tests).
>>>>> > > >>>>>>> >> > >> We are doing some analysis on what subsets might be
>>>>> > > >>>>>>> >> > >> appropriate for our workflows, features, and goals,
>>>>> > > >>>>>>> >> > >> and will share anything that we think might translate
>>>>> > > >>>>>>> >> > >> back to the Iceberg CI workflows.
>>>>> > > >>>>>>> >> > >>
>>>>> > > >>>>>>> >> > >> On Fri, May 22, 2026 at 7:43 AM Robert Thomson <[email protected]> wrote:
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>> Hello, Iceberg PMC.
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>> In 2024, the ASF introduced the policy for GitHub
>>>>> > > >>>>>>> >> > >>> Actions usage across the foundation[1]. The ASF
>>>>> > > >>>>>>> >> > >>> Github shared pool of Github-hosted runners has been
>>>>> > > >>>>>>> >> > >>> at, or very close to the limit of 900 jobs most of
>>>>> > > >>>>>>> >> > >>> the time in the past few weeks and this is the case
>>>>> > > >>>>>>> >> > >>> again today.
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>> Your project has been identified as being among the
>>>>> > > >>>>>>> >> > >>> top 5 consumers of build time over the past 7 days
>>>>> > > >>>>>>> >> > >>> and we request that you bring your usage down by
>>>>> > > >>>>>>> >> > >>> stream-lining long-running builds. Contact Infra for
>>>>> > > >>>>>>> >> > >>> a consultation if you are unable to streamline your
>>>>> > > >>>>>>> >> > >>> builds further.
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>> You can use the infra reporting tool[2] to monitor
>>>>> > > >>>>>>> >> > >>> your GHA usage as you work on stream-lining, as well
>>>>> > > >>>>>>> >> > >>> as locate any bottlenecks in the workflows.
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>> Infra will allow you two weeks time (till the 8th of
>>>>> > > >>>>>>> >> > >>> June, 2026) to progress this, but should you still be
>>>>> > > >>>>>>> >> > >>> above the limits by then, without a viable path
>>>>> > > >>>>>>> >> > >>> forward, we will be limiting your GHA usage.
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>> Kind regards,
>>>>> > > >>>>>>> >> > >>> Bob Thomson, on behalf of ASF Infrastructure.
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> > >>> [1] https://infra.apache.org/github-actions-policy.html
>>>>> > > >>>>>>> >> > >>> [2] https://infra-reports.apache.org/#ghactions&project=iceberg&hours=24&limit=15&group=name
>>>>> > > >>>>>>> >> > >>>
>>>>> > > >>>>>>> >> >
>>>>> > > >>>>>>> >> >
>>>>> > > >>>>>>> >> > --
>>>>> > > >>>>>>> >> > Regards
>>>>> > > >>>>>>> >> > Junwang Zhao
>>>>> > > >>>>>>> >>
>>>>> > > >>>>>>> >>
>>>>> > > >>>>>>> >>
>>>>> > > >>>>>>> >> --
>>>>> > > >>>>>>> >> Regards
>>>>> > > >>>>>>> >> Junwang Zhao
>>>>> > > >>>>>>>
>>>>> > > >>>>>>>
>>>>> > > >>>>>>>
>>>>> > > >>>>>>> --
>>>>> > > >>>>>>> Regards
>>>>> > > >>>>>>> Junwang Zhao
>>>>> > > >>>>>>>
>>>>> > > >>>>>>
>>>>> > > >>>>> >
>>>>>
>>>>
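
As a rough cross-check of the figures quoted in this thread, the arithmetic behind the ASF limits, the usage Kevin reported on May 26, and Bob's "5 minutes, 400 times a day" example can be worked through as below. This is only a sketch: the function and constant names are made up for illustration, and the caps are copied from the quoted policy text rather than derived.

```python
MINUTES_PER_DAY = 24 * 60


def runner_minutes(runners: int, days: int) -> int:
    """Minutes consumed by `runners` machines running non-stop for `days` days."""
    return runners * days * MINUTES_PER_DAY


def headroom(used: int, cap: int) -> int:
    """Minutes left under a cap (a negative value means the cap is exceeded)."""
    return cap - used


# Caps as stated in the quoted policy text.
WEEKLY_CAP = 250_000
FIVE_DAY_CAP = 216_000

# Usage figures from Kevin's May 26 message.
weekly_headroom = headroom(218_321, WEEKLY_CAP)
five_day_headroom = headroom(120_241, FIVE_DAY_CAP)

# Bob's example: a 5-minute job that runs 400 times a day.
slots_occupied = (5 * 400) / MINUTES_PER_DAY

print(runner_minutes(25, 7), runner_minutes(30, 5))
print(weekly_headroom, five_day_headroom)
print(round(slots_occupied, 2))
```

Note that 25 runners for 7 full days works out to 252,000 minutes, which matches the 4,200 hours in the policy. The 250,000 figure in the policy text looks like a rounded value, and the sketch uses it as written.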
