Fantastic! That might help us speed up our builds a lot - and we need it, looking at the number of PRs we keep receiving with Airflow 3. Also, the self-hosted runners from ASF are not helping - they are pretty unstable, and the ARM runners are on Ubuntu 20.04, which does not have ARM Python. That issue has been unresolved for a long time already: https://issues.apache.org/jira/projects/INFRA/issues/INFRA-25990
On Wed, Aug 14, 2024 at 12:05 AM Hussein Awala <huss...@awala.fr> wrote:

> I can prepare a short presentation for the next dev call (22 Aug) to
> explain the architecture we tried to implement, why we chose it, and what
> the blockers are (mainly related to the infrastructure team).
>
> Also, I am still interested in completing this work.
>
> On Mon, Aug 12, 2024 at 3:58 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
>
> > Added as agenda item for the next dev call (22 Aug)
> >
> > On Mon, 12 Aug 2024 at 00:25, Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > I will let Hussein (if he has time) to share some more details :).
> > >
> > > Generally speaking we are using Github Actions as CI - so what we
> > > **really** need is auto-scaling k8S cluster where K8S Controller is
> > > deployed and connected (via ASF infrastructure's Github APP)
> > > https://github.com/actions/actions-runner-controller. The last state we
> > > had - as far as I remember - Hussein already had a (Terraform?)
> > > deployment for it and it generally was depending on the ASF/ Infra
> > > authorisation / setup. Then some fine-tuning / labels (small/medium/big
> > > instances) to define/finalize and extend it to be able to also run ARM
> > > instances.
> > >
> > > J.
> > >
> > > On Mon, Aug 12, 2024 at 1:10 AM Neil <neil4r...@gmail.com> wrote:
> > >
> > > > I have solid AWS and EKS knowledge, I'd offer my help if my skills are
> > > > applicable.
> > > > Which Infrastructure as Code and CI/CD frameworks are being utilized for
> > > > the testing Terraform Cloudformation?
> > > > I've had good experiences with Pulumi python.
> > > > Have you considered using EFS to handle the disk space needs?
> > > >
> > > > On Sun, Aug 11, 2024 at 6:18 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >
> > > > > Hello here,
> > > > >
> > > > > It would be great to have someone (or better two people) to get engaged in
> > > > > our test infrastructure work - this will improve everyone's experience. I
> > > > > **REALLY** think we should have other people that have engaged so far, so
> > > > > that we can decrease the bus factor we have for our infrastructure.
> > > > >
> > > > > Just after I was away for 5 days and without too much connectivity our main
> > > > > was broken (lack of disk space for constraints generation) and some mypy
> > > > > checks were failing for the last few days.
> > > > >
> > > > > This is unsustainable and we need to find people who will know and be able
> > > > > to fix this infrastructure.
> > > > >
> > > > > *Early warning* - I am planning 3 weeks holidays after Airflow Summit - and
> > > > > I won't be looking at my email/github during those days, which means that
> > > > > whoever will be working on Airflow 3 might be severely impacted by some of
> > > > > those failures.
> > > > >
> > > > > Just to remind - until we have the k8S controller set up on our AWS
> > > > > account and connected to our repo - we won't be able to use the credits
> > > > > that we got recently. So this is a good start.
> > > > >
> > > > > I created a high-level issue for that
> > > > > https://github.com/apache/airflow/issues/41388 and it waits for some
> > > > > volunteers to pick it up. It's a very important thing to do - we can speed
> > > > > up many parts of our builds (for example release preparation - but also
> > > > > likely most of our tests) up to 4 times, which means that a lot of time can
> > > > > be saved for waiting.
> > > > >
> > > > > Kaxil - I propose we should add a point at the next devcall - and keep it
> > > > > as an unresolved Airflow 3 issue until it is well, unresolved.
> > > > >
> > > > > J.