Great work Jarek!
On 2 February 2020 09:18:52 GMT, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>Ok. The master is fixed now (finally!), so please rebase all of your
>open PRs to master.
>
>In the end we had a number of different problems, some of them
>coinciding, which is why it was so hectic and difficult to diagnose:
>
>- The Travis queue was stalled (at some point we had some 20 builds
>  waiting in the queue), so to save time we did not rebase some PRs and
>  merged them from old masters.
>- Some of the master merges were cancelled, so we could not see which
>  commit broke the build. That made us come up with different
>  hypotheses for the problem.
>- Our optimisations for CI builds (skip Kubernetes builds when there
>  are no kubernetes-related changes) caused the contrib/example_dags
>  move to slip under the radar of PR CI checks.
>- Even without the optimisations, Kubernetes Git Sync uses master of
>  Airflow, so we would not have detected that by a PR failure (only
>  after merge).
>- We had a number of “false positives” and a lack of detailed logs for
>  Kubernetes.
>- We had a mysterious hang on kerberos tests, but it was likely caused
>  by a Travis environment change (it’s gone now).
>- We had Redis test failures caused by the 3.4 release of the redis-py
>  library, which contained a change (the Redis class became un-hashable
>  by adding an __eq__ hook). Luckily they reverted it two hours ago
>  (https://github.com/andymccurdy/redis-py/blob/master/CHANGES).
>- We downloaded the Apache RAT tool from a maven repository, and this
>  maven repo has been very unstable recently.
>- There are a number of follow-up PRs (already merged or building on
>  Travis now) that will resolve those problems and prevent them in the
>  future.
>
>J.
>
>On Thu, Jan 30, 2020 at 11:16 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> Spent a little bit of time looking at this and it seems it was (super)
>> flaky tests -- I've managed to get 1 commit back on master passing by
>> just retrying the one failed job.
>>
>> Looking at the latest commit now.
>>
>> On Jan 30 2020, at 7:54 am, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> > It looks like we have a failing master - it seems that yesterday's
>> > super-slow Travis queue and a number of PRs that were merged without
>> > rebasing caused master to break.
>> >
>> > I will not be at my PC for a couple of hours at least, so maybe some
>> > other committers can take a look in the meantime.
>> >
>> > J.
>> >
>> > --
>> > Jarek Potiuk
>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> >
>> > M: +48 660 796 129 <+48660796129>
>
>--
>
>Jarek Potiuk
>Polidea <https://www.polidea.com/> | Principal Software Engineer
>
>M: +48 660 796 129 <+48660796129>
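
For anyone wondering about the redis-py item in the quoted message: in Python 3, a class that defines __eq__ without also defining __hash__ gets its __hash__ set to None, so its instances can no longer be used as dict keys or set members. Below is a minimal sketch of that general language behaviour. The class names and the is_hashable helper are made up for illustration; this is not redis-py's actual code, and I have not checked how the real change was written.

```python
class Plain:
    """No __eq__ defined: inherits object's identity-based __eq__ and __hash__."""

    def __init__(self, host):
        self.host = host


class WithEq:
    """Defines __eq__ only: Python 3 implicitly sets __hash__ to None."""

    def __init__(self, host):
        self.host = host

    def __eq__(self, other):
        return isinstance(other, WithEq) and self.host == other.host


class WithEqAndHash(WithEq):
    """Restores hashability by defining __hash__ consistently with __eq__."""

    def __hash__(self):
        return hash(self.host)


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    for cls in (Plain, WithEq, WithEqAndHash):
        print(cls.__name__, is_hashable(cls("localhost")))
```

Any code that puts such an object in a set or uses it as a dict key fails with a TypeError once __eq__ is added on its own, which would be consistent with the kind of test failure described above.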