I re-enabled all the optimizations yesterday. As of this morning, PRs that touch only some "areas" of the code should run only the relevant tests and nothing else, so they should be faster than usual. We are still fighting the "job limit" that the whole ASF shares, and we have applied for some credits to build self-hosted runners, but this should help a lot.
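
To make the "only run relevant tests" idea concrete, here is a minimal sketch in Python. It is not the actual Airflow CI logic: the area names, the path patterns and the `select_test_types` helper are all hypothetical, chosen only to illustrate how a list of changed files could be mapped to a subset of test groups, with a fallback to the full suite whenever a file cannot be classified.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from path patterns to test groups (illustration only,
# not the real rules used by the Airflow CI).
AREA_PATTERNS = {
    "Providers": ["airflow/providers/*", "tests/providers/*"],
    "API": ["airflow/api/*", "tests/api/*"],
    "CLI": ["airflow/cli/*", "tests/cli/*"],
}

# Files assumed never to need the test suite (assumption for this sketch).
NO_TEST_PATTERNS = ["*.md", "*.txt"]

# "Core" stands for everything that is not covered by a specific area.
ALL_TESTS = set(AREA_PATTERNS) | {"Core"}


def select_test_types(changed_files):
    """Return the set of test groups to run for the given changed files."""
    selected = set()
    for path in changed_files:
        # Text-only changes do not trigger any tests.
        if any(fnmatchcase(path, pattern) for pattern in NO_TEST_PATTERNS):
            continue
        matched = {
            area
            for area, patterns in AREA_PATTERNS.items()
            if any(fnmatchcase(path, pattern) for pattern in patterns)
        }
        # An unclassified file could affect anything: run the full suite.
        if not matched:
            return set(ALL_TESTS)
        selected |= matched
    return selected
```

With these assumed patterns, a PR that only edits `README.md` selects no test groups at all, a PR that touches a file under `airflow/cli/` selects only `CLI`, and a change to a core file falls back to everything. Falling back to the full suite for unknown files is the conservative choice: a missing pattern then costs time rather than hiding a failure.
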
An example of this in practice: this build by Kaxil, once started, completed in under 1m30s and did not use precious resources from the job queue:
https://github.com/apache/airflow/pull/11782#pullrequestreview-515783657

J.

On Sun, Oct 18, 2020 at 9:43 PM Jarek Potiuk <[email protected]> wrote:

> Just to let everyone know - our quest on stabilising/speeding up the CI
> continues.
>
> I just merged one of the final (yeah final final final ...) optimizations,
> where just a typo correction in .md files / non-doc .rst files should take
> ~1m to complete.
>
> Yep. You read that right. Some simple changes will not trigger a full test
> suite, just a small relevant subset, which should be really fast.
>
> We will observe it and see if we need to adjust it and fix any other
> teething issues, but I hope this will be helpful in fighting the current
> job limits we have in the whole Apache organisation before we - hopefully -
> get self-hosted runners in place.
>
> J.
>
> On Tue, Oct 13, 2020 at 9:19 PM Jarek Potiuk <[email protected]>
> wrote:
>
>> I do expect some small teething problems again, but I hope the big one is
>> over and I will try to address those problems if they arise. Apologies for
>> that - this was rather difficult to test on "Apache Organization" scale. We
>> are also talking about adding some GitHub custom runners, because we expect
>> the situation will deteriorate in the future if we don't.
>>
>> On Tue, Oct 13, 2020 at 9:14 PM Jarek Potiuk <[email protected]>
>> wrote:
>>
>>> There is bad news and good news :).
>>>
>>> * The bad news is that the change did not go well in its original
>>> scope. It turned out that many small jobs are not a good idea when you have
>>> 180 slots in a queue and a growing number of Apache projects, including
>>> yours, are competing for them. It seems that our jobs got starved a lot, and
>>> the effect was 2-3 hour waiting queues, which were growing in the afternoon
>>> when the US started to wake up.
>>>
>>> * The good news is that I just merged a fix for that - instead of many
>>> small jobs, we grouped several test types into single jobs, and we clean up
>>> between the jobs and reuse the machines. I believe this will be even more
>>> optimized, and it uses the same optimization concepts as before.
>>>
>>> I cancelled all the queued builds and asked people to rebase to latest
>>> master. If you have not done so yet - please do it now!
>>>
>>> J.
>>>
>>>
>>> On Mon, Oct 12, 2020 at 1:27 AM Daniel Imberman <
>>> [email protected]> wrote:
>>>
>>>> Thanks Jarek! This was much needed and should lead to a cleaner dev
>>>> process
>>>>
>>>> On Sun, Oct 11, 2020 at 3:37 PM, Jarek Potiuk <[email protected]>
>>>> wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> I have really high hopes for the CI change that we implemented over the
>>>> weekend. Over the last few weeks we experienced a lot of stability problems
>>>> with the CI, and our builds were rarely "green" - mostly due to
>>>> intermittent/unrelated problems. We've implemented some workarounds and
>>>> split the build into a larger number of smaller jobs, which so far has
>>>> proven to be much more stable and "greener".
>>>>
>>>> You will see a much bigger number of test checks than you used to (up
>>>> to 120 or so), but they will be quite a bit faster. Also - if any of the
>>>> checks fail for a good reason, you should be able to find information on how
>>>> to reproduce the failures locally in the test output - so that you can fix
>>>> it.
>>>>
>>>> We will be watching and fixing any teething problems over the next few
>>>> days, but for now - please rebase to the latest master and try it out.
>>>>
>>>> J.
>>>>
>>>> --
>>>>
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>
>>>> M: +48 660 796 129
