Re: Iceberg Consumption of ASF Shared GitHub-hosted Runners

Ajantha Bhat Tue, 30 Jun 2026 03:04:39 -0700

As a first step, we are merging https://github.com/apache/iceberg/pull/16945
(CI: Use one Java version for PR checks), which now has approvals.

Let us know if you have any comments on this.

On Wed, Jun 24, 2026 at 2:44 PM Ajantha Bhat <[email protected]> wrote:

> Hi all,
> I have now created multiple small PRs for the easy and big wins.
>
>    - https://github.com/apache/iceberg/pull/16945 CI: Use one Java
>    version for PR checks
>    - https://github.com/apache/iceberg/pull/16946 CI: Select Spark PR
>    matrix by changed version
>    - https://github.com/apache/iceberg/pull/16947 CI: Select Flink PR
>    matrix by changed version
>
> Please take a look. I will have two or three more follow-up PRs after this
> to handle the full-ci flag and incremental builds for the other modules
> from the original PR: https://github.com/apache/iceberg/pull/16566
>
> - Ajantha
>
> On Fri, Jun 19, 2026 at 9:12 PM Manu Zhang <[email protected]>
> wrote:
>
>> Ajantha, sorry I missed your earlier email. It would be great to split your
>> PR and get the enhancements for Spark CI or Flink CI in first.
>>
>> Kevin, that's good news!
>>
>>> CI should generally run by default for relevant changes, with explicit
>>> opt-outs where appropriate.
>>
>> Agreed. I believe there is still low-hanging fruit we can pick based on
>> Ajantha's and others' PRs.
>>
>> Thanks,
>> Manu
>>
>> On Fri, Jun 19, 2026 at 2:17 AM Kevin Liu <[email protected]> wrote:
>>
>>> Thanks everyone for all the contributions to reduce CI resource usage.
>>> I've seen a number of improvements go in already. I just checked the
>>> Apache dashboard, and it looks like we're still under the ceiling set by
>>> the ASF for both the 5-day and 7-day periods.
>>>
>>> There's definitely more room for improvement. But I think we should
>>> prioritize correctness and coverage. I would also like to focus on
>>> maintainability and avoid patterns that require ongoing manual
>>> maintenance to opt changes into CI, since those can quietly reduce coverage
>>> over time. CI should generally run by default for relevant changes,
>>> with explicit opt-outs where appropriate.
>>>
>>> Regarding the other repos, I pulled the github action usage data for
>>> the past 7 days:
>>> Repository                         Workflow runs    Jobs  Runner minutes  % of total
>>> apache/iceberg                             3,574  14,909       177,594.8      77.45%
>>> apache/iceberg-cpp                         1,455   2,960        26,888.5      11.73%
>>> apache/iceberg-rust                        1,078   3,416        18,196.7       7.94%
>>> apache/iceberg-python                        594   1,445         3,387.4       1.48%
>>> apache/iceberg-go                            633   1,188         3,154.1       1.38%
>>> apache/terraform-provider-iceberg             42      48            71.0       0.03%
>>> Total                                      7,376  23,966       229,292.5     100.00%
>>>
>>> Looks like the Java repo is still the top contributor :)
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> On Thu, Jun 18, 2026 at 6:39 AM Ajantha Bhat <[email protected]>
>>> wrote:
>>>
>>>> Hi Manu, all of these were handled in the parent PR I mentioned three
>>>> weeks ago.
>>>> Can we all please review this?
>>>> https://github.com/apache/iceberg/pull/16566
>>>>
>>>> I can split into smaller PRs if required.
>>>>
>>>> On Thu, Jun 18, 2026 at 1:59 PM Manu Zhang <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Here's another quick win from scoping Spark CI to only changed Spark
>>>>> versions [1]. We usually open a PR first against the latest Spark version
>>>>> and then back-port it to previous versions after the merge. Running Spark
>>>>> CI for all Spark versions in such cases wastes resources.
>>>>>
>>>>> If this approach is approved, I can also make a PR for Flink CI.
>>>>>
>>>>>
>>>>> 1. https://github.com/apache/iceberg/pull/16800
>>>>>
>>>>> Thanks,
>>>>> Manu
>>>>>
>>>>> On Sat, Jun 13, 2026 at 8:34 AM Abnob Doss <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> A potential small win from the subproject side: the iceberg-rust
>>>>>> Python bindings CI had ended up building the Rust bindings twice per run,
>>>>>> due to an accidental interaction between a few changes over time.
>>>>>> One-line fix: https://github.com/apache/iceberg-rust/pull/2636
>>>>>>
>>>>>> Measured over the past 7 days, the duplicate build took a median of
>>>>>> 8.4 min on Linux, 12.1 min on macOS, and 15.3 min on Windows, totaling
>>>>>> about 2,400 runner-minutes across 207 job executions. After the fix the
>>>>>> same step takes a few seconds.
>>>>>>
>>>>>> Thanks,
>>>>>> Abanoub
>>>>>>
>>>>>> On Wednesday, June 3rd, 2026 at 9:49 AM, Bob Thomson <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>> > I don't think we have data at that level of granularity; it's a
>>>>>> > case of looking at the Actions and their run time and frequency of
>>>>>> > execution in each of your repos, and focussing on the longest-running
>>>>>> > and most frequent ones. That is, an Action run might only take 5
>>>>>> > minutes each time, but if it runs 400 times a day then it occupies
>>>>>> > more than one job slot of the total of 900 the ASF has, for the
>>>>>> > duration of that day.
>>>>>> > Experience so far suggests those actions that build Java are often
>>>>>> > the most time consuming.
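The job-slot arithmetic in the paragraph above can be sketched as follows. This is only an illustration, not Infra's actual accounting: `slots_occupied` is a hypothetical helper, and it assumes the runs are spread evenly over the day. The 5 minutes, 400 runs, and 900 slots are the figures from the message.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes that one job slot can serve per day


def slots_occupied(minutes_per_run: float, runs_per_day: int) -> float:
    """Average number of job slots kept busy over one day (hypothetical helper)."""
    return minutes_per_run * runs_per_day / MINUTES_PER_DAY


if __name__ == "__main__":
    slots = slots_occupied(5, 400)  # 2,000 runner-minutes spread over 1,440
    print(f"slots occupied: {slots:.2f}")
    print(f"share of a 900-slot pool: {slots / 900:.3%}")
```

A short but frequent workflow can therefore hold more than one slot all day, which is why both run time and run frequency matter when deciding what to trim first.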
>>>>>> >
>>>>>> > Thanks.
>>>>>> >
>>>>>> > Kind regards,
>>>>>> > -Bob Thomson.
>>>>>> >
>>>>>> > On 2026/06/01 18:39:38 Yufei Gu wrote:
>>>>>> > > Hi Bob,
>>>>>> > >
>>>>>> > > Thanks for the heads-up and for giving the Iceberg community time
>>>>>> > > to work on this.
>>>>>> > >
>>>>>> > > One question: Is the concern based on the overall GitHub Actions
>>>>>> > > consumption of the Iceberg projects (e.g., main repo, Python repo,
>>>>>> > > Go repo, etc.), or only that of the main Iceberg repository? Iceberg
>>>>>> > > has multiple repositories, including the main repository as well as
>>>>>> > > Python, Go, Rust, and C++ subprojects. Most of the discussion and
>>>>>> > > optimization work in this thread focuses on the main repository,
>>>>>> > > where the majority of CI usage occurs. If the overall project usage
>>>>>> > > is within acceptable limits, would it be possible to allow a higher
>>>>>> > > quota for a single repo (the Iceberg main repository), given its
>>>>>> > > broader compatibility and integration testing requirements?
>>>>>> > >
>>>>>> > > Yufei
>>>>>> > >
>>>>>> > >
>>>>>> > > On Mon, Jun 1, 2026 at 11:00 AM Steve Loughran <
>>>>>> [email protected]> wrote:
>>>>>> > >
>>>>>> > > > This is really good for draft builds.
>>>>>> > > >
>>>>>> > > > If I'm committing and pushing work up to a WiP PR, it is often
>>>>>> > > > because I want *a* machine to do the testing; I don't care who it
>>>>>> > > > runs as.
>>>>>> > > >
>>>>>> > > > Forcing PRs to run as the submitter also hardens the OSS repo
>>>>>> > > > against vulnerabilities in the GitHub Actions and other parts of
>>>>>> > > > the build process.
>>>>>> > > >
>>>>>> > > > On Mon, 1 Jun 2026 at 17:11, Prashant Singh <
>>>>>> [email protected]>
>>>>>> > > > wrote:
>>>>>> > > >
>>>>>> > > >>   Hi all,
>>>>>> > > >>
>>>>>> > > >>   Great progress on the matrix reduction, incremental builds, and
>>>>>> > > >>   draft PR skipping ideas. I'd like to propose a complementary
>>>>>> > > >>   approach that can work alongside all of those: running PR CI on
>>>>>> > > >>   contributor fork compute instead of the ASF shared pool.
>>>>>> > > >>
>>>>>> > > >>   How it works:
>>>>>> > > >>
>>>>>> > > >>   Workflows switch from pull_request to push triggers on non-main
>>>>>> > > >>   branches. Each workflow:
>>>>>> > > >>
>>>>>> > > >>   1. Checks out apache/iceberg main (security boundary: untrusted
>>>>>> > > >>   code can't modify the workflow itself)
>>>>>> > > >>   2. Squash-merges the contributor's fork branch on top
>>>>>> > > >>   3. Runs tests on that merged tree
>>>>>> > > >>
>>>>>> > > >>   Because the push event fires on the fork, GitHub bills the CI
>>>>>> > > >>   minutes to the fork owner's account, not the ASF shared pool.
>>>>>> > > >>   This takes Iceberg's PR CI usage of the ASF runners to
>>>>>> > > >>   effectively zero, regardless of matrix size.
>>>>>> > > >>
>>>>>> > > >>   Why this is complementary:
>>>>>> > > >>
>>>>>> > > >>   The optimizations discussed so far all reduce how much CI runs.
>>>>>> > > >>   Fork-compute changes where it runs. They compose: a leaner
>>>>>> > > >>   matrix running on fork compute is strictly better than either
>>>>>> > > >>   approach alone.
>>>>>> > > >>
>>>>>> > > >>   Inline PR status:
>>>>>> > > >>
>>>>>> > > >>   A lightweight notify_test_workflow.yml (using
>>>>>> > > >>   pull_request_target + Checks API) is included to post fork CI
>>>>>> > > >>   results directly onto the upstream PR's checks tab, so reviewers
>>>>>> > > >>   see green/red status inline as they do today.
>>>>>> > > >>
>>>>>> > > >>   *Prior art*:
>>>>>> > > >>
>>>>>> > > >>   Apache Spark adopted this pattern in 2024 (SPARK-47041) and has
>>>>>> > > >>   been running it in production since. Their full Spark CI matrix
>>>>>> > > >>   runs entirely on contributor forks.
>>>>>> > > >>
>>>>>> > > >>   PR: https://github.com/apache/iceberg/pull/15397 covers all 10
>>>>>> > > >>   workflow files. I've verified that all workflows pass on fork
>>>>>> > > >>   compute.
>>>>>> > > >>
>>>>>> > > >>   This could be merged independently of the matrix/incremental
>>>>>> > > >>   optimizations and would immediately eliminate PR CI pressure on
>>>>>> > > >>   the ASF pool, well within the June 8 deadline.
>>>>>> > > >>
>>>>>> > > >>   Thoughts?
>>>>>> > > >>
>>>>>> > > >> Prashant Singh
>>>>>> > > >>
>>>>>> > > >> On Fri, May 29, 2026 at 8:47 PM Renjie Liu <
>>>>>> [email protected]>
>>>>>> > > >> wrote:
>>>>>> > > >>
>>>>>> > > >>> I like the idea of cutting the number of JVM versions run in each
>>>>>> > > >>> CI. The JVM has great backward compatibility, so we could run on
>>>>>> > > >>> one JVM (maybe JVM 17) and trigger a nightly run for JVM 21.
>>>>>> > > >>>
>>>>>> > > >>> On Wed, May 27, 2026 at 3:17 AM Steve Loughran <
>>>>>> [email protected]>
>>>>>> > > >>> wrote:
>>>>>> > > >>>
>>>>>> > > >>>>
>>>>>> > > >>>> Doing a scan of the aws-sdk bundle.jar is halfway to an audit of
>>>>>> > > >>>> the maven repo, with spark the other half.
>>>>>> > > >>>>
>>>>>> > > >>>> It seems to me that only PRs which go near
>>>>>> > > >>>> gradle/libs.versions.toml are going to change dependencies, and
>>>>>> > > >>>> so introduce new CVEs.
>>>>>> > > >>>>
>>>>>> > > >>>> There's the separate issue that "CVEs are eternal": all existing
>>>>>> > > >>>> dependencies are collections of undiscovered/unreported CVEs.
>>>>>> > > >>>> That's dependabot's homework, generally.
>>>>>> > > >>>>
>>>>>> > > >>>>
>>>>>> > > >>>> On Tue, 26 May 2026 at 19:49, Kevin Liu <
>>>>>> [email protected]> wrote:
>>>>>> > > >>>>
>>>>>> > > >>>>> Thanks everyone for the great ideas.
>>>>>> > > >>>>>
>>>>>> > > >>>>> Here's where we stand today with respect to ASF runner usage
>>>>>> > > >>>>> (taken from the link [2] above):
>>>>>> > > >>>>> GitHub Actions Build Time Used
>>>>>> > > >>>>> - past 7 days total usage: 218,321 minutes
>>>>>> > > >>>>> - past 5 days total usage: 120,241 minutes
>>>>>> > > >>>>>
>>>>>> > > >>>>> *This puts us below the hard ceiling for resource usage* as
>>>>>> > > >>>>> described by https://infra.apache.org/github-actions-policy.html
>>>>>> > > >>>>>
>>>>>> > > >>>>> > The average number of minutes a project uses *per calendar
>>>>>> > > >>>>> > week MUST NOT exceed the equivalent of 25 full-time runners
>>>>>> > > >>>>> > (250,000 minutes, or 4,200 hours)*.
>>>>>> > > >>>>> > The average number of minutes a project uses *in any
>>>>>> > > >>>>> > consecutive five-day period MUST NOT exceed the equivalent of
>>>>>> > > >>>>> > 30 full-time runners (216,000 minutes, or 3,600 hours)*.
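As a rough check of the figures above, the two ceilings can be recomputed from "full-time runners" and compared with the reported usage. This is a sketch only: `runner_minutes` and `headroom` are hypothetical helpers, and the limits are taken as stated in the quoted policy text. Note that 25 runners for 7 days works out to 252,000 minutes (4,200 hours), which the policy text appears to round to 250,000.

```python
MINUTES_PER_DAY = 24 * 60

# Limits as stated in the quoted policy text.
WEEKLY_LIMIT = 250_000
FIVE_DAY_LIMIT = 216_000


def runner_minutes(runners: int, days: int) -> int:
    """Minutes used by `runners` runners kept busy around the clock for `days` days."""
    return runners * days * MINUTES_PER_DAY


def headroom(usage_minutes: float, limit_minutes: float) -> float:
    """Fraction of the limit still unused (negative when over the limit)."""
    return 1 - usage_minutes / limit_minutes


if __name__ == "__main__":
    print(runner_minutes(25, 7))  # 252000
    print(runner_minutes(30, 5))  # 216000
    # Usage figures reported in this message.
    print(f"7-day headroom: {headroom(218_321, WEEKLY_LIMIT):.1%}")
    print(f"5-day headroom: {headroom(120_241, FIVE_DAY_LIMIT):.1%}")
```

On these numbers the 7-day window is the tighter of the two constraints, with far less headroom than the 5-day window.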
>>>>>> > > >>>>>
>>>>>> > > >>>>> We should still make improvements wherever possible.
>>>>>> > > >>>>>
>>>>>> > > >>>>> I have a few PRs to reduce CI usage further.
>>>>>> > > >>>>> - CI: Limit CVE scan runs to relevant changes #16513
>>>>>> > > >>>>> - Build: Simplify CI workflow path filters to avoid per-workflow
>>>>>> > > >>>>>   maintenance #16302
>>>>>> > > >>>>>
>>>>>> > > >>>>> There are a few heuristics we can use:
>>>>>> > > >>>>> 1. Don't run CI if not needed. For example, `site/` dir changes
>>>>>> > > >>>>>    shouldn't trigger Spark/Flink/Java CI. This might be optimized
>>>>>> > > >>>>>    already, but we should double check just in case.
>>>>>> > > >>>>> 2. If we must run CI, fail fast. For example, if there is a
>>>>>> > > >>>>>    formatter issue, fail all in-flight CI tasks.
>>>>>> > > >>>>> 3. Within a specific CI workflow, reduce the matrix wherever
>>>>>> > > >>>>>    possible. Do we really need to run all "Java versions" x
>>>>>> > > >>>>>    "Scala versions" x "Spark versions"?
>>>>>> > > >>>>> 4. Improve individual CI tasks. Spark CI accounts for over 57% of
>>>>>> > > >>>>>    all resource usage. I have a tracking issue where I benchmarked
>>>>>> > > >>>>>    where all that time is spent. See
>>>>>> > > >>>>>    https://github.com/apache/iceberg/issues/16397
>>>>>> > > >>>>>
>>>>>> > > >>>>> Top CI tasks as % of resource use:
>>>>>> > > >>>>> - Spark CI: 57.68%
>>>>>> > > >>>>> - Flink CI: 13.60%
>>>>>> > > >>>>> - Java CI: 7.02%
>>>>>> > > >>>>> - CVE Scan: 3.13%
>>>>>> > > >>>>>
>>>>>> > > >>>>> Best,
>>>>>> > > >>>>> Kevin Liu
>>>>>> > > >>>>>
>>>>>> > > >>>>> On Tue, May 26, 2026 at 5:35 AM Ajantha Bhat <
>>>>>> [email protected]>
>>>>>> > > >>>>> wrote:
>>>>>> > > >>>>>
>>>>>> > > >>>>>> Hi all,
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> How about implementing an incremental PR builder (similar to
>>>>>> > > >>>>>> https://github.com/gitflow-incremental-builder/gitflow-incremental-builder)?
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> I think one of the main causes of GitHub runner pressure in
>>>>>> > > >>>>>> Iceberg is the breadth of our CI matrix. We support multiple
>>>>>> > > >>>>>> languages (Java, Python, Go, Rust, C++) and integrations, and
>>>>>> > > >>>>>> for Java we test across multiple JVM versions, Spark versions,
>>>>>> > > >>>>>> Flink versions, Kafka, Hive/MR, REST/OpenAPI, runtime bundles,
>>>>>> > > >>>>>> and more. That coverage is valuable, but running most of it for
>>>>>> > > >>>>>> every PR is expensive and increases both runner usage and CI
>>>>>> > > >>>>>> wall time.
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> I think the biggest win can be achieved by having an incremental
>>>>>> > > >>>>>> PR build. We already have useful building blocks for it: Gradle
>>>>>> > > >>>>>> build cache, path filters, and version-selective build
>>>>>> > > >>>>>> properties like -DsparkVersions and -DflinkVersions.
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> The idea is to keep full coverage on main, release branches,
>>>>>> > > >>>>>> tags, and global build changes, but make PR CI depend on the
>>>>>> > > >>>>>> files changed:
>>>>>> > > >>>>>>
>>>>>> > > >>>>>>    - Spark-only changes run Spark CI, not Flink/Hive/Kafka.
>>>>>> > > >>>>>>    - spark/v4.1/** changes run only Spark 4.1, not every Spark
>>>>>> > > >>>>>>    version.
>>>>>> > > >>>>>>    - flink/v2.0/** changes run only Flink 2.0, not every Flink
>>>>>> > > >>>>>>    version.
>>>>>> > > >>>>>>    - API/Core/Data/File format changes run the owning Java
>>>>>> > > >>>>>>    checks plus selected downstream canaries, such as latest
>>>>>> > > >>>>>>    Spark and latest Flink, instead of the full engine matrix.
>>>>>> > > >>>>>>    - Runtime/bundle CVE checks run only for affected runtime
>>>>>> > > >>>>>>    artifacts.
>>>>>> > > >>>>>>    - A full-ci label or global Gradle/workflow changes can still
>>>>>> > > >>>>>>    force the full matrix.
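The selection rules in the list above could look roughly like this. It is a minimal sketch under stated assumptions, not the logic in the PR: `select_ci`, the prefix tuples, and the "latest" version defaults are all hypothetical, chosen only to mirror the bullets.

```python
import re

# Hypothetical path conventions, modelled on the spark/v4.1/** and
# flink/v2.0/** examples in the list above.
SPARK_RE = re.compile(r"^spark/v(\d+\.\d+)/")
FLINK_RE = re.compile(r"^flink/v(\d+\.\d+)/")
# Assumed "global" build/workflow paths that force the full matrix.
GLOBAL_PREFIXES = ("build.gradle", "settings.gradle", "gradle/", ".github/workflows/")
# Assumed shared-module paths that trigger Java checks plus canaries.
CORE_PREFIXES = ("api/", "core/", "data/")


def select_ci(changed_files, labels=(), latest_spark="4.1", latest_flink="2.0"):
    """Return a CI plan for a PR based on its changed files and labels."""
    plan = {"full": False, "java": False, "spark": set(), "flink": set()}
    # A full-ci label or a global build change forces the full matrix.
    if "full-ci" in labels or any(f.startswith(GLOBAL_PREFIXES) for f in changed_files):
        plan["full"] = True
        return plan
    for path in changed_files:
        spark = SPARK_RE.match(path)
        flink = FLINK_RE.match(path)
        if spark:
            plan["spark"].add(spark.group(1))  # only the touched Spark version
        elif flink:
            plan["flink"].add(flink.group(1))  # only the touched Flink version
        elif path.startswith(CORE_PREFIXES):
            # Shared modules: owning Java checks plus downstream canaries.
            plan["java"] = True
            plan["spark"].add(latest_spark)
            plan["flink"].add(latest_flink)
    return plan


if __name__ == "__main__":
    print(select_ci(["spark/v4.1/spark/src/main/java/Foo.java"]))
    print(select_ci(["core/src/main/java/Bar.java"]))
    print(select_ci(["site/docs/index.md"], labels=["full-ci"]))
```

Paths that match none of the rules (for example under `site/`) select nothing here, which lines up with the "don't run CI if not needed" heuristic mentioned elsewhere in the thread.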
>>>>>> > > >>>>>>
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> Another possible optimization is JVM coverage. Today many PR
>>>>>> > > >>>>>> jobs run across both Java 17 and Java 21. We could consider
>>>>>> > > >>>>>> running one primary JVM for PRs, and reserve the full JVM matrix
>>>>>> > > >>>>>> for main, release branches, nightly/scheduled builds, or PRs
>>>>>> > > >>>>>> labeled full-ci. That would further reduce runner usage and PR
>>>>>> > > >>>>>> wall time, while still preserving broad compatibility coverage
>>>>>> > > >>>>>> before changes become part of the main branch.
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> A practical approach could be:
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> PRs: incremental module/version selection, mostly one JVM, plus
>>>>>> > > >>>>>> targeted canaries.
>>>>>> > > >>>>>> main: full matrix across JVMs, Spark versions, Flink versions,
>>>>>> > > >>>>>> and runtime checks.
>>>>>> > > >>>>>> Manual override: full-ci label for risky or cross-cutting PRs.
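The JVM part of this split could be expressed as a small selection function. This is a sketch only: `jvm_matrix` is a hypothetical helper, and Java 17 as the primary PR JVM is an assumption (it was floated as "maybe" elsewhere in this thread). The event names follow GitHub Actions event names.

```python
PRIMARY_JVM = 17      # assumption: the single JVM used for ordinary PRs
ALL_JVMS = (17, 21)   # the two versions PR jobs run across today


def jvm_matrix(event: str, labels=()) -> list:
    """Pick the JVM versions to test for a given trigger (hypothetical helper).

    Ordinary pull requests get one JVM; pushes to main or release branches,
    scheduled builds, and PRs labeled full-ci get the full matrix.
    """
    if event == "pull_request" and "full-ci" not in labels:
        return [PRIMARY_JVM]
    return list(ALL_JVMS)


if __name__ == "__main__":
    print(jvm_matrix("pull_request"))                      # [17]
    print(jvm_matrix("pull_request", labels=["full-ci"]))  # [17, 21]
    print(jvm_matrix("schedule"))                          # [17, 21]
```

Defaulting to the full matrix for every trigger other than an unlabeled PR keeps coverage from silently shrinking if a new trigger is added later.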
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> This should reduce queue time, lower GitHub runner consumption,
>>>>>> > > >>>>>> and give contributors faster feedback without giving up full
>>>>>> > > >>>>>> coverage where it matters most.
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> I am working on a POC:
>>>>>> > > >>>>>> https://github.com/apache/iceberg/pull/16566
>>>>>> > > >>>>>> Suggestions are welcome.
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> - Ajantha
>>>>>> > > >>>>>>
>>>>>> > > >>>>>> On Mon, May 25, 2026 at 7:35 PM Junwang Zhao <
>>>>>> [email protected]>
>>>>>> > > >>>>>> wrote:
>>>>>> > > >>>>>>
>>>>>> > > >>>>>>> Hi Manu,
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> On Mon, May 25, 2026 at 9:33 PM Manu Zhang <
>>>>>> [email protected]>
>>>>>> > > >>>>>>> wrote:
>>>>>> > > >>>>>>> >
>>>>>> > > >>>>>>> > Hi Junwang,
>>>>>> > > >>>>>>> >
>>>>>> > > >>>>>>> > Not sure about others but I usually only change status to
>>>>>> > > >>>>>>> > "Ready for review" when CI has passed.
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> Yeah, I agree there are trade-offs to disabling GitHub Actions
>>>>>> > > >>>>>>> for draft PRs.
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> Reasons to Disable:
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> - Cost savings: large teams and monorepos can burn through
>>>>>> > > >>>>>>> GitHub Actions minutes quickly. Skipping CI for draft PRs
>>>>>> > > >>>>>>> avoids spending resources on code that may not even compile
>>>>>> > > >>>>>>> yet.
>>>>>> > > >>>>>>> - Reduced noise: draft PRs are often used for experimentation
>>>>>> > > >>>>>>> or work-in-progress changes. Disabling CI avoids cluttering the
>>>>>> > > >>>>>>> PR timeline with transient failures while the author is still
>>>>>> > > >>>>>>> iterating.
>>>>>> > > >>>>>>> - Better resource utilization: orgs with limited self-hosted
>>>>>> > > >>>>>>> runners may prefer to prioritize "Ready for Review" PRs so
>>>>>> > > >>>>>>> production-relevant changes get feedback and merge capacity
>>>>>> > > >>>>>>> sooner.
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> Reasons to Keep:
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> - Early error detection: developers can use draft PRs as a
>>>>>> > > >>>>>>> sandbox to validate builds and tests before requesting review.
>>>>>> > > >>>>>>> - Self-correction: failed checks on a draft PR allow authors to
>>>>>> > > >>>>>>> fix lint or test issues before involving reviewers.
>>>>>> > > >>>>>>> - Higher review confidence: by the time a PR is marked "Ready
>>>>>> > > >>>>>>> for Review", CI has often already passed at least once, leading
>>>>>> > > >>>>>>> to a smoother review process.
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> For myself, when I create a draft PR, I'm usually sharing early
>>>>>> > > >>>>>>> work-in-progress code with other developers and may not have
>>>>>> > > >>>>>>> tested it thoroughly locally yet, so I sometimes prefer to
>>>>>> > > >>>>>>> disable CI. That's just my personal preference though.
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> >
>>>>>> > > >>>>>>> > Regards,
>>>>>> > > >>>>>>> > Manu
>>>>>> > > >>>>>>> >
>>>>>> > > >>>>>>> > On Mon, May 25, 2026 at 3:21 PM Junwang Zhao <
>>>>>> [email protected]>
>>>>>> > > >>>>>>> wrote:
>>>>>> > > >>>>>>> >>
>>>>>> > > >>>>>>> >> On Mon, May 25, 2026 at 11:20 AM Junwang Zhao <
>>>>>> [email protected]>
>>>>>> > > >>>>>>> wrote:
>>>>>> > > >>>>>>> >> >
>>>>>> > > >>>>>>> >> > On Sun, May 24, 2026 at 12:13 PM Steven Wu <
>>>>>> > > >>>>>>> [email protected]> wrote:
>>>>>> > > >>>>>>> >> > >
>>>>>> > > >>>>>>> >> > > Kevin's PR removing Spark 3.4 was merged a few days
>>>>>> > > >>>>>>> >> > > ago. It should reduce the Spark CI cost by ~25%.
>>>>>> > > >>>>>>> >> > >
>>>>>> > > >>>>>>> >> > > Some heavy-hitter test classes in the Spark tests
>>>>>> > > >>>>>>> >> > > (core and extension) cause high load due to parameter
>>>>>> > > >>>>>>> >> > > combinations. I asked AI to analyze the build log and
>>>>>> > > >>>>>>> >> > > recommend changes offering the best ROI. Details are
>>>>>> > > >>>>>>> >> > > in this doc.
>>>>>> > > >>>>>>> >> > >
>>>>>> > > >>>>>>> >> > > I can look into dropping some combinations without
>>>>>> > > >>>>>>> >> > > sacrificing essential coverage. E.g., we can probably
>>>>>> > > >>>>>>> >> > > drop the Hadoop catalog usage in tests, as it wasn't
>>>>>> > > >>>>>>> >> > > recommended for production use anyway.
>>>>>> > > >>>>>>> >> >
>>>>>> > > >>>>>>> >> > iceberg-cpp skips Actions for draft PRs [1] to reduce
>>>>>> > > >>>>>>> >> > CI resource usage a little bit. Perhaps we should apply
>>>>>> > > >>>>>>> >> > the same approach across all Iceberg subprojects?
>>>>>> > > >>>>>>> >> >
>>>>>> > > >>>>>>> >> > [1] https://github.com/apache/iceberg-cpp/pull/680
>>>>>> > > >>>>>>> >>
>>>>>> > > >>>>>>> >> I've created a PR to show that, see [1]. Since it's a
>>>>>> > > >>>>>>> >> draft, the CI won't run. If I click the `Ready for review`
>>>>>> > > >>>>>>> >> button, the actions will be triggered. Let me know what you
>>>>>> > > >>>>>>> >> think about it.
>>>>>> > > >>>>>>> >>
>>>>>> > > >>>>>>> >> [1] https://github.com/apache/iceberg/pull/16561
>>>>>> > > >>>>>>> >>
>>>>>> > > >>>>>>> >> >
>>>>>> > > >>>>>>> >> > >
>>>>>> > > >>>>>>> >> > >
>>>>>> > > >>>>>>> >> > >
>>>>>> > > >>>>>>> >> > > On Fri, May 22, 2026 at 8:22 AM Matt Butrovich <
>>>>>> > > >>>>>>> [email protected]> wrote:
>>>>>> > > >>>>>>> >> > >>
>>>>>> > > >>>>>>> >> > >> Apache DataFusion similarly received this notice. For
>>>>>> > > >>>>>>> >> > >> visibility to the Iceberg community, we have tracking
>>>>>> > > >>>>>>> >> > >> issues to try to discuss solutions:
>>>>>> > > >>>>>>> >> > >>
>>>>>> > > >>>>>>> >> > >> https://github.com/apache/datafusion/issues/22455
>>>>>> > > >>>>>>> >> > >> https://github.com/apache/datafusion-comet/issues/4406
>>>>>> > > >>>>>>> >> > >>
>>>>>> > > >>>>>>> >> > >> DataFusion Comet is consuming the vast majority of
>>>>>> > > >>>>>>> >> > >> DataFusion resources, and like the Iceberg project
>>>>>> > > >>>>>>> >> > >> it's due to Spark tests (and Iceberg's Spark tests).
>>>>>> > > >>>>>>> >> > >> We are doing some analysis on what subsets might be
>>>>>> > > >>>>>>> >> > >> appropriate for our workflows, features, and goals,
>>>>>> > > >>>>>>> >> > >> and will share anything that we think might translate
>>>>>> > > >>>>>>> >> > >> back to the Iceberg CI workflows.
>>>>>> > > >>>>>>> >> > >>
>>>>>> > > >>>>>>> >> > >> On Fri, May 22, 2026 at 7:43 AM Robert Thomson <
>>>>>> > > >>>>>>> [email protected]> wrote:
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>> Hello, Iceberg PMC.
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>> In 2024, the ASF introduced the policy for GitHub
>>>>>> > > >>>>>>> >> > >>> Actions usage across the foundation[1]. The ASF
>>>>>> > > >>>>>>> >> > >>> GitHub shared pool of GitHub-hosted runners has been
>>>>>> > > >>>>>>> >> > >>> at, or very close to, the limit of 900 jobs most of
>>>>>> > > >>>>>>> >> > >>> the time in the past few weeks, and this is the case
>>>>>> > > >>>>>>> >> > >>> again today.
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>> Your project has been identified as being among the
>>>>>> > > >>>>>>> >> > >>> top 5 consumers of build time over the past 7 days
>>>>>> > > >>>>>>> >> > >>> and we request that you bring your usage down by
>>>>>> > > >>>>>>> >> > >>> streamlining long-running builds. Contact Infra for a
>>>>>> > > >>>>>>> >> > >>> consultation if you are unable to streamline your
>>>>>> > > >>>>>>> >> > >>> builds further.
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>> You can use the infra reporting tool[2] to monitor
>>>>>> > > >>>>>>> >> > >>> your GHA usage as you work on streamlining, as well
>>>>>> > > >>>>>>> >> > >>> as locate any bottlenecks in the workflows.
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>> Infra will allow you two weeks' time (till the 8th of
>>>>>> > > >>>>>>> >> > >>> June, 2026) to progress this, but should you still be
>>>>>> > > >>>>>>> >> > >>> above the limits by then, without a viable path
>>>>>> > > >>>>>>> >> > >>> forward, we will be limiting your GHA usage.
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>> Kind regards,
>>>>>> > > >>>>>>> >> > >>> Bob Thomson, on behalf of ASF Infrastructure.
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> > >>> [1] https://infra.apache.org/github-actions-policy.html
>>>>>> > > >>>>>>> >> > >>> [2]
>>>>>> > > >>>>>>> >> > >>> https://infra-reports.apache.org/#ghactions&project=iceberg&hours=24&limit=15&group=name
>>>>>> > > >>>>>>> >> > >>>
>>>>>> > > >>>>>>> >> >
>>>>>> > > >>>>>>> >> >
>>>>>> > > >>>>>>> >> > --
>>>>>> > > >>>>>>> >> > Regards
>>>>>> > > >>>>>>> >> > Junwang Zhao
>>>>>> > > >>>>>>> >>
>>>>>> > > >>>>>>> >>
>>>>>> > > >>>>>>> >>
>>>>>> > > >>>>>>> >> --
>>>>>> > > >>>>>>> >> Regards
>>>>>> > > >>>>>>> >> Junwang Zhao
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>> --
>>>>>> > > >>>>>>> Regards
>>>>>> > > >>>>>>> Junwang Zhao
>>>>>> > > >>>>>>>
>>>>>> > > >>>>>>
>>>>>> > >
>>>>>> >
>>>>>>
>>>>>
