Hey Gavin,

To avoid the rate limiting, this means we need to bake credentials for
accounts inside the apache org into CI jobs, and those credentials need
to be used for all `docker pull` commands.
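
For example, I'm imagining something like this in each job (the secret
and image names are purely made up for illustration):

    # illustrative only; variable and image names are hypothetical
    echo "$APACHE_DOCKERHUB_TOKEN" | \
        docker login --username "$APACHE_DOCKERHUB_USER" --password-stdin
    docker pull apache/couchdb-ci:latest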

How can we do this in a way that complies with ASF Infra policy?

Thanks,
Joan "the battle wages on / for Toy Soldiers" Touzet


On 2020-11-02 4:57 a.m., Gavin McDonald wrote:
Hi All,

Projects under the 'apache' org on DockerHub are not affected by the
restrictions.

Kind Regards

Gavin "The futures so bright you gotta wear shades" McDonald


On Thu, Oct 29, 2020 at 11:08 PM Gavin McDonald <gmcdon...@apache.org>
wrote:

Hi,

Just to note I have emailed DockerHub, asking for clarification on our
account and what our benefits are.


On Thu, Oct 29, 2020 at 6:34 PM Allen Wittenauer
<a...@effectivemachines.com.invalid> wrote:


On Oct 29, 2020, at 9:21 AM, Joan Touzet <woh...@apache.org> wrote:

(Sidebar about the script's details)

         Sure.

I tried to read the shell script, but I'm not in the headspace to fully
parse it at the moment. If I'm understanding correctly, this will still
catch CouchDB's CI docker images when they haven't changed in a week,
which happens often enough, and that negates the cache.

         Correct. We actually tried something similar for a while and
discovered that in a lot of cases, upstream packages would disappear (or
worse, have security problems), making it look like the image is still
"good" when it's not.  So a weekly rebuild at least guarantees some level
of "yup, still good" without having too much of a negative impact.

As a project, we're kind of stuck between a rock and a hard place. We
want to force a docker pull on the base CI image if it's out of date or
corrupted; otherwise we want to cache forever, not just for a week. I
can probably manage the "do we need to re-pull?" bit with some clever CI
scripting (check the latest image hash against the local one, validate
the local image, pull if either check fails), but I don't see how the
script addresses the latter (caching indefinitely).
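
Roughly, I'm picturing the re-pull check as something like this (the
image name is made up, and it assumes a single-arch image plus jq and
the experimental `docker manifest` command):

    # sketch only: re-pull when the local image is missing or the remote
    # digest has moved; the image name is hypothetical
    IMG=apache/couchdb-ci:latest
    LOCAL_ID=$(docker image inspect --format '{{.Id}}' "$IMG" 2>/dev/null || true)
    REMOTE_ID=$(docker manifest inspect "$IMG" 2>/dev/null | jq -r '.config.digest')
    if [ -z "$LOCAL_ID" ] || [ "$LOCAL_ID" != "$REMOTE_ID" ]; then
        docker pull "$IMG"
    fi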

         Most projects that use Yetus for their actual CI testing build
the image used for the CI as part of the CI.  It is a multi-stage,
multi-file docker build: each run uses a 'base' Dockerfile (provided by
the project) that rarely changes, plus a per-run file that Yetus
generates on the fly, with both images tagged by either git sha or
branch (depending upon context). Because of how docker reference counts
image layers, the docker images are effectively used as a "rolling
cache" and (beyond a potential weekly cache removal) full builds are
rare, thus making them relatively cheap (typically <1m runtime) unless
the base image has a change far up the chain (so structure wisely).  Of
course, this also tests the actual image of the CI build as part of the
CI.  (The "what tests the testers?" philosophy.)  Given that Jenkins
tries really hard to have job affinity, re-runs were still cheap after
the initial one. [Of course, now that the cache is getting nuked every
day....]
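
         A rough sketch of the shape of it (the file and tag names here
are illustrative, not what Yetus actually generates):

    TAG=$(git rev-parse --short HEAD)    # or the branch name
    # the 'base' Dockerfile provided by the project; rarely changes
    docker build -f Dockerfile.ci-base -t myproj-ci-base:"$TAG" .
    # the per-run file, generated on the fly and layered on the base image
    printf 'FROM myproj-ci-base:%s\nRUN useradd -m jenkins\n' "$TAG" > Dockerfile.ci-run
    docker build -f Dockerfile.ci-run -t myproj-ci-run:"$TAG" .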

         Actually, looking at some of the ci-hadoop jobs, it looks like
Yetus is managing the cache on them.  I'm seeing individual per-run
containers from days ago at least.  So that's a good sign.

Can an exemption list be passed to the script so that images matching a
certain regex are excluded? You say the script ignores labels entirely,
so perhaps not...
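
Something as simple as filtering the candidate list is what I had in
mind (the pattern is made up for illustration):

    # illustrative only: skip anything matching an exclude pattern
    EXCLUDE_RE='^apache/couchdb'
    docker images --format '{{.Repository}}:{{.Tag}}' | grep -vE "$EXCLUDE_RE"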

         Patches accepted. ;)

         FWIW, I've been testing on my local machine for unrelated
reasons and I keep blowing away running containers I care about, so I
might end up adding it myself.  That said: the code was specifically
built for CI systems, where the expectation should be that nothing is
permanent.



--

*Gavin McDonald*
Systems Administrator
ASF Infrastructure Team


