Hi folks,
The following email was posted to builds@ today and might contain something
relevant to reducing our GitHub runner usage. Forwarded message below. [1]
[1] https://lists.apache.org/thread/pnvt9b80dnovlqmrf5n10ylcf9q3pcxq
---------- Forwarded message ---------
From: Lari Hotari <lhot...@apache.org>
Date: Tue, Oct 22, 2024 at 7:08 AM
Subject: Sharing Apache Pulsar's CI solution for Docker image sharing with
GitHub Actions Artifacts within a single workflow
To: <bui...@apache.org>
Hi all,
Just in case it's useful for someone else, in Apache Pulsar, there's a
GitHub Actions-based CI workflow that creates a Docker image and runs
integration tests and system tests with it. In Pulsar, we have an extremely
large Docker image for system tests; it's over 1.7GiB when compressed with
zstd. Building this image takes over 20 minutes, so we want to share the
image within a single build workflow. GitHub Artifacts are the recommended
way to share files between jobs in a single workflow, as explained in the
GitHub Actions documentation:
https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/storing-and-sharing-data-from-a-workflow
To share the Docker image within a single build workflow, we use GitHub
Artifacts upload/download with a custom CLI tool that uses the
GitHub-provided JavaScript libraries for interacting with the GitHub
Artifacts backend API. The benefit of the CLI tool for GitHub Actions
Artifacts is that it can upload from stdin and download to stdout. Sharing
the Docker images in the GitHub Actions workflow is simply done with the
CLI tool and standard "docker load" and "docker save" commands.
These are the shell script functions that Apache Pulsar uses:
https://github.com/apache/pulsar/blob/1344167328c31ea39054ec2a6019f003fb8bab50/build/pulsar_ci_tool.sh#L82-L101
In Pulsar CI, the command for saving the image is:
docker save ${image} | zstd | pv -ft -i 5 | pv -Wbaf -i 5 | \
  timeout 20m gh-actions-artifact-client.js upload \
  --retentionDays=$ARTIFACT_RETENTION_DAYS "${artifactname}"
For restoring, the command used is:
timeout 20m gh-actions-artifact-client.js download "${artifactname}" | \
  pv -batf -i 5 | unzstd | docker load
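For anyone wanting to try this, the two commands above could be wrapped in a pair of shell functions roughly like the sketch below. The function names, the ARTIFACT_CLIENT override, and the default of 1 retention day are assumptions made here for illustration, not Pulsar's actual script (the pulsar_ci_tool.sh link above has that). The pv progress meters are left out for brevity.

```shell
# Sketch only: hypothetical wrappers around the save/restore pipelines.
# Assumes docker, zstd/unzstd, timeout and gh-actions-artifact-client.js
# are on PATH (the client is installed by the GitHub Action).

# Assumption: allow overriding the client command, e.g. for local testing.
ARTIFACT_CLIENT="${ARTIFACT_CLIENT:-gh-actions-artifact-client.js}"

# save_image_to_artifact IMAGE ARTIFACT_NAME
# Streams the image through zstd straight into the artifact upload.
save_image_to_artifact() {
  docker save "$1" | zstd | \
    timeout 20m "$ARTIFACT_CLIENT" upload \
      --retentionDays="${ARTIFACT_RETENTION_DAYS:-1}" "$2"
}

# load_image_from_artifact ARTIFACT_NAME
# Streams the artifact download through unzstd straight into docker load.
load_image_from_artifact() {
  timeout 20m "$ARTIFACT_CLIENT" download "$1" | unzstd | docker load
}
```

The build job would then call `save_image_to_artifact "${image}" "${artifactname}"`, and each test job would call `load_image_from_artifact "${artifactname}"` before running its tests.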
The throughput is very impressive. Transfer speed can exceed 180MiB/s when
uploading the Docker image, and downloads are commonly over 100MiB/s in
apache/pulsar builds. Notably, these figures include the execution of
"docker save" and "docker load", since the tool operates directly on stdin
and stdout.
Examples:
upload:
https://github.com/apache/pulsar/actions/runs/11454093832/job/31880154863#step:15:26
download:
https://github.com/apache/pulsar/actions/runs/11454093832/job/31880164467#step:9:20
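As a rough back-of-the-envelope check of what those rates mean for a 1.7GiB image, the sketch below divides size by throughput. The function name is made up for illustration, and it treats the quoted rates as sustained averages, which is an assumption.

```shell
# transfer_seconds SIZE_GIB RATE_MIB_PER_S
# Prints the estimated transfer time in seconds, to one decimal place.
transfer_seconds() {
  LC_ALL=C awk -v size_gib="$1" -v rate="$2" \
    'BEGIN { printf "%.1f\n", size_gib * 1024 / rate }'
}

transfer_seconds 1.7 180   # upload at 180MiB/s   -> 9.7
transfer_seconds 1.7 100   # download at 100MiB/s -> 17.4
```

So under these assumptions the image moves in roughly 10 seconds up and 17 seconds down, against the 20+ minutes it takes to build.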
Since GitHub Artifacts doesn't provide an official CLI tool, I have written
a GitHub Action for that purpose. It's available at
https://github.com/lhotari/gh-actions-artifact-client.
When you use the action, it will install the CLI tool available as
"gh-actions-artifact-client.js" in the PATH of the runner so that it's
available in subsequent build steps. In Apache Pulsar, we fork external
actions to our own repository, so we use the version forked to
https://github.com/apache/pulsar-test-infra.
In Pulsar, we have been using this solution successfully for several years.
I recently upgraded the action to support the GitHub Actions Artifacts API
v4, as earlier API versions will be removed after November 30th.
I hope this helps other projects that face CI challenges similar to
Pulsar's. Please let me know if you need any help in using a similar solution
for your Apache project's CI.
-Lari
(end of forwarded message)
WDYT? Relevant to us?
Cheers,
Nathan
On Thu, Oct 17, 2024 at 2:10 AM Lee, Lup Yuen <lu...@appkaki.com> wrote:
Hi All: We have an ultimatum to reduce (drastically) our usage of GitHub
Actions. Or our Continuous Integration will halt totally in Two Weeks.
Here's what I'll implement within 24 hours for `nuttx` and `nuttx-apps`
repos:
(1) When we submit or update a Complex PR that affects All Architectures
(Arm, RISC-V, Xtensa, etc): CI Workflow shall run only half the jobs.
Previously CI Workflow ran `arm-01` to `arm-14`, now it will run only
`arm-01` to `arm-07`. (This will reduce GitHub Cost by 32%)
(2) When the Complex PR is Merged: CI Workflow will still run all jobs
`arm-01` to `arm-14`
(3) For NuttX Admins: We shall have only Four Scheduled Merge Jobs per day.
Which means I shall quickly cancel any Merge Jobs that appear. Then at
00:00 / 06:00 / 12:00 / 18:00 UTC: I shall restart the Latest Merge Job
that I cancelled. (This will reduce GitHub Cost by 17%)
(4) macOS and Windows Jobs (msys2 / msvc): They shall be totally disabled
until we find a way to manage their costs. (GitHub charges 10x premium for
macOS runners, 2x premium for Windows runners!)
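To make rules (1), (2) and (4) concrete, here is a small shell sketch. Both function names and the "pr" / "merge" trigger labels are hypothetical, this is not the actual NuttX workflow logic, and treating Linux as the 1x baseline for the premiums is an assumption.

```shell
# select_arm_jobs TRIGGER
# Lists the Arm jobs to run: half for a PR update, all of them after a merge.
select_arm_jobs() {
  case "$1" in
    pr)    last=7 ;;    # rule (1): arm-01 to arm-07
    merge) last=14 ;;   # rule (2): arm-01 to arm-14
    *) echo "unknown trigger: $1" >&2; return 1 ;;
  esac
  i=1
  while [ "$i" -le "$last" ]; do
    printf 'arm-%02d\n' "$i"
    i=$((i + 1))
  done
}

# billed_minutes OS MINUTES
# Applies the runner premiums quoted in rule (4); Linux = 1x is assumed.
billed_minutes() {
  case "$1" in
    linux)   factor=1 ;;
    windows) factor=2 ;;
    macos)   factor=10 ;;
    *) echo "unknown OS: $1" >&2; return 1 ;;
  esac
  echo $(( $2 * factor ))
}

select_arm_jobs pr | tail -n 1   # arm-07
billed_minutes macos 30          # 300
```

A 30-minute macOS job therefore counts the same as 300 Linux minutes, which is why disabling those jobs helps so much.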
We have done an Analysis of CI Jobs over the past 24 hours:
- Many CI Jobs are Incomplete: We waste GitHub Runners on jobs that
eventually get superseded and cancelled
- When we Halve the CI Jobs: We reduce the wastage of GitHub Runners
- Scheduled Merge Jobs will also reduce wastage of GitHub Runners, since
most Merge Jobs don't complete (only 1 completed yesterday)
Please check out the analysis below. And let's discuss further in this
NuttX Issue. Thanks!
https://github.com/apache/nuttx/issues/14376
Lup
---------- Forwarded message ---------
From: Daniel Gruno <humbed...@apache.org>
Date: Wed, Oct 16, 2024 at 12:08 PM
Subject: [WARNING] All NuttX builds to be turned off by October 30th
UNLESS...
To: <priv...@nuttx.apache.org>
Cc: ASF Infrastructure <priv...@infra.apache.org>
Hello again, NuttX folks.
This is a formal notice that your CI builds are far exceeding the
maximum resource use set out by our CI policies[1]. As you are currently
exceeding your limits by more than 300%[2] and have not shown any signs
of decreasing, we will be disabling GitHub Actions for your project on
October 30th unless you manage to get the usage under control and below
the established limits of 25 full-time runners in a single week.
If you have any further questions, feel free to reach out to us at
priv...@infra.apache.org
With regards,
Daniel on behalf of ASF Infra.
[1] https://infra.apache.org/github-actions-policy.html
[2] https://infra-reports.apache.org/#ghactions&project=nuttx
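For a sense of scale, the limit in the notice above can be turned into runner-hours with the sketch below. It assumes "full-time" means a runner busy 24 hours a day for 7 days, and reads "exceeding by 300%" literally as usage at four times the limit. The function names are made up, and the 16800-hour usage figure is a hypothetical input, not a number from the infra report.

```shell
# runner_hours_per_week RUNNERS
# Hours consumed by RUNNERS running around the clock for one week.
runner_hours_per_week() {
  echo $(( $1 * 24 * 7 ))
}

# percent_over_limit USED_HOURS LIMIT_HOURS
# Integer percentage by which usage exceeds the limit.
percent_over_limit() {
  echo $(( ($1 - $2) * 100 / $2 ))
}

limit=$(runner_hours_per_week 25)
echo "$limit"                       # 4200
percent_over_limit 16800 "$limit"   # 300 (16800 is a made-up usage figure)
```

Under that reading, being 300% over the limit is the equivalent of about 100 runners busy all week.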