On Wed, Jul 24, 2019 at 10:01:33AM +0200, Tim Düsterhus wrote:
> On 24.07.19 at 05:55, Willy Tarreau wrote:
> > I also noticed the build failure but couldn't find any link to the build
> > history to figure when it started to fail. How did you figure that the
> > commit above was the first one ?
>
> While I did it as Ilya did by scrolling through GitHub's commit list,
That was the least natural way for me to do it. Thank Ilya for the
screenshot by the way. I clicked on the red cross, then the freebsd link
reporting the failure, and searched the history there but couldn't find it.

> there is also:
>
> Travis: https://travis-ci.com/haproxy/haproxy/builds
> Cirrus: https://cirrus-ci.com/github/haproxy/haproxy

Ah yes, this one is more useful, that's what I was looking for. I just
cannot figure out how to reach it when I'm on the build status page :-/

> Keep in mind for both that only the current head after a push is being
> built, so larger pushes might hide issues to CI.

Of course! But the goal is not to build every single commit either, but to
detect early that something went wrong instead of discovering it after a
version is released, as we used to in the past.

> In this specific case the offending patch was pushed together with
> 7764a57d3292b6b4f1e488b ("BUG/MEDIUM: threads: cpu-map designating a
> single") and only the latter was tested.

Yep!

> > Ideally we'd need a level of failure in CI builds. Some should be just of
> > level "info" and not cause a build error because we'd know they are likely
> > to fail but are still interested in the logs. But I don't think we can do
> > this.
>
> I'm not sure this is possible either, but I also don't think it's a good
> idea, because then you get used to this kind of issue and ignore it. For
> example this one would probably have been written off as "ah, it's just
> flaky" instead of actually investigating what's wrong:
> https://github.com/haproxy/haproxy/issues/118

It's true. But what is also true is that the tests are not meant to be run
in the CI build environment but on developers' machines first. Being able
to run in the CI env is a bonus. As a side effect of some technical
constraints imposed by such environments (slow VMs with flaky timings,
hosts enforcing at least a little bit of security, etc.), we do expect
that some tests will randomly fail.
These could be tagged as such, and we could just report a counter of
failures among the more or less expected ones. When you're used to seeing
4 to 6 tests usually fail and suddenly 13 have failed, you may want to
have a look there, even if it's possibly just to restart them to confirm.
And these tests should not fail at all in more controlled environments.
There's nothing really problematic here in the end; it just constantly
reminds us that not all tests can be automated.

By the way, maybe we could have some form of exclusion for tags instead of
deciding that a test only belongs to one type, because the reality is that
we do *not* want to run certain tests. The most common ones we don't want
to run locally are "slow" and "bug", which are already mutually exclusive.
But by tagging tests with multiple labels we could then decide to exclude
some labels during the build. In this case we could tag some tests as
"flaky-on-cirrus", "flaky-on-travis", "flaky-in-vm", "flaky-in-container",
"flaky-firewall" etc. and ignore them in such environments.

Cheers,
Willy
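Just to illustrate the label-exclusion idea, here is a minimal sketch. It
assumes a hypothetical `#REGTEST_TAGS:` comment convention in the .vtc
files (space-separated labels; nothing like this exists today) and an
`EXCLUDE` variable listing the labels a given CI environment wants to skip:

```shell
#!/bin/sh
# Hypothetical sketch, not an existing feature: each .vtc file would
# declare its labels on a "#REGTEST_TAGS:" comment line, and this filter
# would print only the test files carrying none of the excluded labels.

EXCLUDE="${EXCLUDE:-flaky-in-vm flaky-on-cirrus}"

filter_tests() {
    for f in "$@"; do
        # extract the space-separated label list, if any
        tags=$(sed -n 's/^#REGTEST_TAGS:[[:space:]]*//p' "$f")
        skip=0
        for t in $tags; do
            for e in $EXCLUDE; do
                [ "$t" = "$e" ] && skip=1
            done
        done
        [ "$skip" -eq 0 ] && echo "$f"
    done
}

# Example: a test tagged both "slow" and "flaky-in-vm" is skipped in this
# environment, while one tagged only "slow" is kept.
printf '#REGTEST_TAGS: slow flaky-in-vm\n' > /tmp/a.vtc
printf '#REGTEST_TAGS: slow\n' > /tmp/b.vtc
filter_tests /tmp/a.vtc /tmp/b.vtc     # prints only /tmp/b.vtc
```

The point is only that multiple labels per test plus per-environment
exclusion lists compose naturally, unlike a single exclusive type.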

