Great work Jarek!

On 2 February 2020 09:18:52 GMT, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>OK. Master is fixed now (finally!), so please rebase all of your open
>PRs onto master.
>
>In the end we had a number of different problems that coincided, which
>is why it was so hectic and difficult to diagnose:
>
>- The Travis queue was stalled (at one point we had some 20 builds
>  waiting in the queue), so to save time we did not rebase some PRs and
>  merged them on top of an old master
>- Some of the master merge builds were cancelled, so we could not see
>  which commit broke the build - that made us come up with different
>  hypotheses for the problem
>- Our CI build optimisations (skipping Kubernetes builds when there are
>  no Kubernetes-related changes) caused the contrib/example_dags move to
>  slip under the radar of PR CI checks
>- Even without those optimisations, Kubernetes Git Sync uses master of
>  Airflow, so we would not have detected that through a PR failure (only
>  after merge)
>- We had a number of “false positives” and a lack of detailed logs for
>  Kubernetes.
>- We had a mysterious hang in the Kerberos tests - but it was likely
>  caused by a Travis environment change (it’s gone now)
>- We had Redis test failures caused by the 3.4 release of the redis-py
>  library, which contained a change (the Redis class became un-hashable
>  because an __eq__ hook was added) - luckily they reverted it two hours
>  ago (https://github.com/andymccurdy/redis-py/blob/master/CHANGES)
>- We downloaded the Apache RAT tool from a Maven repository, and that
>  repo has been very unstable recently.
>- There are a number of follow-up PRs (already merged or building on
>  Travis now) that will resolve those problems and prevent them in the
>  future.
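
For anyone wondering about the Redis item above: in Python 3, a class
that defines __eq__ without also defining __hash__ gets its __hash__ set
to None, so its instances can no longer be used as dict keys or set
members. A minimal sketch of that language behaviour (Plain, WithEq and
is_hashable are hypothetical names for illustration only, not redis-py's
actual code):

```python
class Plain:
    """No __eq__ override: inherits identity-based __eq__/__hash__ from object."""

    def __init__(self, host):
        self.host = host


class WithEq:
    """Defines __eq__ only; Python 3 then implicitly sets __hash__ to None."""

    def __init__(self, host):
        self.host = host

    def __eq__(self, other):
        return isinstance(other, WithEq) and self.host == other.host


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print("Plain hashable: ", is_hashable(Plain("localhost")))
    print("WithEq hashable:", is_hashable(WithEq("localhost")))
```

A class that needs both custom equality and hashability has to define
__hash__ explicitly alongside __eq__.
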
>
>J.
>
>
>On Thu, Jan 30, 2020 at 11:16 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> Spent a little bit of time looking at this and it seems it was (super)
>> flaky tests -- I've managed to get 1 commit back on master passing by
>> just retrying the one failed job.
>>
>> Looking at the latest commit now.
>>
>> On Jan 30 2020, at 7:54 am, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> > It looks like we have a failing master - it seems that yesterday's
>> > super-slow Travis queue and a number of PRs that were merged without
>> > rebasing caused master to break.
>> >
>> > I will not be at my PC for a couple of hours at least, so maybe some
>> > other committers can take a look in the meantime.
>> >
>> > J.
>> >
>> > --
>> > Jarek Potiuk
>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> >
>> > M: +48 660 796 129 <+48660796129>
>> >
>>
>>
>
>-- 
>
>Jarek Potiuk
>Polidea <https://www.polidea.com/> | Principal Software Engineer
>
>M: +48 660 796 129 <+48660796129>
