I re-enabled all the optimizations yesterday. As of this morning, PRs that
touch only some "areas" of the code should run only the relevant tests and
nothing else, so they should be faster than usual. We are still fighting
the "job limit" that the whole ASF shares, and we have applied for some
credits to build the self-hosted runners, but this should help a lot.

An example: this build by Kaxil, once started, completed in under 1m30s
and did not use precious resources from the job queue:
https://github.com/apache/airflow/pull/11782#pullrequestreview-515783657
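
To illustrate the idea of running only the relevant tests, here is a minimal
sketch in Python. It is not the actual Airflow CI implementation: the pattern
table, the group names and the `select_test_groups` function are all
assumptions made up for this example.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping of code "areas" to path patterns (not Airflow's real rules).
AREA_PATTERNS = {
    "Core": ["airflow/models/*", "tests/models/*"],
    "API": ["airflow/api/*", "tests/api/*"],
    "Providers": ["airflow/providers/*", "tests/providers/*"],
}

# Hypothetical patterns for files that never need a test run.
NO_TEST_PATTERNS = ["*.md"]


def matches_any(path, patterns):
    """Return True if the path matches at least one glob pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def select_test_groups(changed_files):
    """Return the set of test groups to run for the given changed paths."""
    selected = set()
    for path in changed_files:
        if matches_any(path, NO_TEST_PATTERNS):
            continue  # text-only change: contributes no tests
        matched = {
            group
            for group, patterns in AREA_PATTERNS.items()
            if matches_any(path, patterns)
        }
        if not matched:
            # Unknown area: be conservative and run everything.
            return set(AREA_PATTERNS)
        selected |= matched
    return selected


if __name__ == "__main__":
    print(sorted(select_test_groups(["README.md", "airflow/api/client.py"])))
```

In this sketch a change that touches only `.md` files selects no test groups
at all, while a file that matches no known area falls back to the full suite,
so an incomplete pattern table costs time rather than missed failures.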

J.

On Sun, Oct 18, 2020 at 9:43 PM Jarek Potiuk <[email protected]>
wrote:

> Just to let everyone know - our quest to stabilise and speed up the CI
> continues.
>
> I just merged one of the final (yeah final final final ...) optimizations,
> where just a typo correction in .md files or non-doc .rst files should take
> ~1m to complete.
>
> Yep. You read that right. Some simple changes will not trigger a full test
> suite, just a small relevant subset, which should be really fast.
>
> We will observe it and see if we need to adjust it and fix any other
> teething issues, but I hope this will help in fighting the current job
> limits we have across the whole Apache organisation before we - hopefully -
> get self-hosted runners in place.
>
> J.
>
> On Tue, Oct 13, 2020 at 9:19 PM Jarek Potiuk <[email protected]>
> wrote:
>
>> I do expect some small teething problems again, but I hope the big one is
>> over and I will try to address those problems if they arise. Apologies for
>> that - this was rather difficult to test at "Apache Organization" scale. We
>> are also talking about adding some GitHub custom runners, because we expect
>> the situation will deteriorate in the future if we don't.
>>
>> On Tue, Oct 13, 2020 at 9:14 PM Jarek Potiuk <[email protected]>
>> wrote:
>>
>>> There is bad news and good news :).
>>>
>>> * The bad news is that the change did not go well in its original
>>> scope. It turned out that many small jobs are not a good idea when there
>>> are 180 slots in a queue and a growing number of Apache projects, yours
>>> included, competing for them. It seems that our jobs got starved a lot,
>>> and the effect was 2-3 hour waiting queues, which kept growing in the
>>> afternoon when the US started to wake up.
>>>
>>> * The good news is that I just merged a fix for that - instead of many
>>> small jobs, we group several test types into single jobs, clean up
>>> between them, and reuse the machines. I believe this will be even more
>>> optimized, and it uses the same optimization concepts as before.
>>>
>>> I cancelled all the queued builds and asked people to rebase to latest
>>> master. If you have not done so yet - please do it now!
>>>
>>> J.
>>>
>>>
>>> On Mon, Oct 12, 2020 at 1:27 AM Daniel Imberman <
>>> [email protected]> wrote:
>>>
>>>> Thanks Jarek! This was much needed and should lead to a cleaner dev
>>>> process
>>>>
>>>> On Sun, Oct 11, 2020 at 3:37 PM, Jarek Potiuk <[email protected]>
>>>> wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> I have really high hopes for the CI change that we implemented over the
>>>> weekend. Over the last few weeks we experienced a lot of stability
>>>> problems with the CI, and our builds were rarely "green", mostly due to
>>>> intermittent/unrelated problems. We've implemented some workarounds and
>>>> split the build into a bigger number of smaller jobs, which so far has
>>>> proven to be much more stable and "greener".
>>>>
>>>> You will see a much bigger number of test checks than you used to (up
>>>> to 120 or so), but they will be quite a bit faster. Also - if any of the
>>>> checks fail for a good reason, you should be able to find information on
>>>> how to reproduce the failures locally in the test output, so that you can
>>>> fix them.
>>>>
>>>> We will be watching and fixing any teething problems over the next few
>>>> days, but for now - please rebase to the latest master and try it out.
>>>>
>>>> J.
>>>>
>>>> --
>>>>
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>
>>>> M: +48 660 796 129 <+48660796129>
>>>> [image: Polidea] <https://www.polidea.com/>
>>>>
>>>>
>>>
>>
>

