Hello everyone,

TL;DR: I would like to propose moving our CI runners from our own "custom"
VM-based implementation, developed mostly by Ash, to a newly released
auto-scaling K8S controller that was developed for Apache Arrow by
Voltron Data.

I was in contact with Jacob Wujciak, who led the effort in Arrow - and we
were also discussing it at the latest ASF build meeting (BTW, Jacob was
just approved as an Arrow committer) and I think they have a solid and
proven solution, very well documented and working together with the ASF
GitHub application that was implemented to distribute ephemeral tokens
needed to run the runners. We would likely keep using Ash's runner for
security, but this can be easily done in the solution from Voltron Data.

Why would we want to do it?

We have wanted to switch away from our implementation for quite some time
already, as what we have is somewhat brittle and rather complex - including
multiple AWS-specific technologies (and it is our own code that we have to
maintain in https://github.com/apache/airflow-ci-infra). Actually, the fact
that we use AWS-specific technologies was one of the reasons we could not
easily use Google Cloud Platform credits for CI, even when they were
offered to us in the past.

I am afraid only Ash knows most of the ins and outs of the scaling code
(though both Kaxil and I were able to fix some stuff and I added a lot of
stuff in the Packer-based installation).

While the current solution is very stable, we sometimes get "job not
started" problems and sometimes we have to manually "push" auto-scaling to
make it work. A K8S-based auto-scaling controller is as good as it gets,
and we have a good relationship with the Arrow team and Jacob, so we can
expect decent help and cooperation - they will also deploy it in a setup
very similar to ours (with ASF tokens), so our use case will be handled
well. Also, choosing a K8S controller makes it easy to move between clouds,
or even to run on multiple clouds.

The discussion about it on the Arrow dev list:

https://lists.apache.org/thread/mskpqwpdq65t1wpj4f5klfq9217ljodw

If this seems like a good idea, I will likely work on it around the end of
the year, and if anyone would like to help with it, I will be more than
happy for others to join me - volunteers are most welcome - so that we
have more hands and eyes knowledgeable about the setup.

J.
