Hey everyone, I know we had quite a long period of flaky tests, during which we accepted merging PRs with some tests failing because of the flakiness.
However, I think over the last couple of months or so we have invested heavily in fixing it. A number of people tracked down and fixed a large number of flaky tests, and what we have now is mostly "green". Yes, sometimes we merge by mistake a change that causes a "main" failure (for example because our test harness is not perfect), but we should fix those cases quickly (mostly by reverting the offending commit and redoing it).

But I think we (and I am talking about committers) should stop merging "failed" PRs unless we are absolutely sure that the failure is already fixed in a PR (or being fixed). We had some changes merged recently (and I was as guilty as others) where we merged a "real" failure without properly investigating the root cause. The result is the "broken window" effect: once such a PR gets merged, it fails other PRs (until fixed), and it makes people impatient to merge PRs with the failure because this is "normal".

It should be normal to only merge "green" PRs. I propose that we change our approach, and whenever we see a "red" build every committer's approach should be:

* investigate the root cause
* if the failure comes from main, attempt to fix it in main first before merging (could be by reverting the offending commit)
* or discuss it in #development / devlist if the root cause is not easy to find
* and generally only merge a failed PR if you are absolutely sure the failure has already been fixed (or you know someone is working on fixing it), and ALWAYS comment about it in the PR explaining why you are merging a failed PR

This is a proposal, happy to discuss it if others think differently. WDYT?

J.
