Re: FreeBSD CI builds fail
On Tue, Jul 23, 2019 at 08:37:37PM +0200, Jerome Magnin wrote:
> On Tue, Jul 23, 2019 at 07:09:57PM +0200, Tim Düsterhus wrote:
> > Jérôme,
> > Ilya,
> >
> > I noticed that FreeBSD CI fails since
> > https://github.com/haproxy/haproxy/commit/885f64fb6da0a349dd3182d21d337b528225c517.
> >
> > One example is here: https://github.com/haproxy/haproxy/runs/169980019
> >
> > It should be investigated whether the reg-test is valid for FreeBSD and
> > either be fixed or disabled.
> >
> > Best regards
> > Tim Düsterhus
>
> Thanks Tim and Ilya,
>
> This one fails because there's an L4 timeout. I can probably update the regex
> to take that into account; the interesting part is the failure and the step
> at which it fails, but for now we expect a connection failure and not a
> timeout.
>
> I'm a bit more concerned about the other one reported by Ilya, where the
> backend server started by VTest won't accept connections. I'll look into this
> one further.

We have decided to exclude this test on non-Linux systems for the time being,
as it triggers a race condition in VTest.
https://github.com/haproxy/haproxy/commit/0d00b544c3bdc9dc1796aca28bad46b3c1867184

Jérôme
Re: FreeBSD CI builds fail
On Wed, Jul 24, 2019 at 10:01:33AM +0200, Tim Düsterhus wrote:
> On 24.07.19 at 05:55, Willy Tarreau wrote:
> > I also noticed the build failure but couldn't find any link to the build
> > history to figure when it started to fail. How did you figure that the
> > commit above was the first one ?
>
> While I did it as Ilya did by scrolling through GitHub's commit list,

That was the least natural way for me to do it. Thanks to Ilya for the
screenshot, by the way. I clicked on the red cross, then on the FreeBSD link
reporting the failure, and searched the history there, but couldn't find it.

> there is also:
>
> Travis: https://travis-ci.com/haproxy/haproxy/builds
> Cirrus: https://cirrus-ci.com/github/haproxy/haproxy

Ah yes, this one is more useful; that's what I was looking for. I just cannot
figure out how to reach it when I'm on the build status page :-/

> Keep in mind for both that only the current head after a push is being
> built, so larger pushes might hide issues from CI.

Of course! But the goal is not to build every single commit either, but to
detect early that something went wrong instead of discovering it after a
version is released, as we used to in the past.

> In this specific case the offending patch was pushed together with
> 7764a57d3292b6b4f1e488b ("BUG/MEDIUM: threads: cpu-map designating a
> single") and only the latter was tested.

Yep!

> > Ideally we'd need a level of failure in CI builds. Some should be just of
> > level "info" and not cause a build error because we'd know they are likely
> > to fail but are still interested in the logs. But I don't think we can do
> > this.
>
> I'm not sure this is possible either, but I also don't think it's a good
> idea, because then you get used to this kind of issue and ignore it. For
> example this one would probably have been written off as "ah, it's just
> flaky" instead of actually investigating what's wrong:
> https://github.com/haproxy/haproxy/issues/118

It's true.
But what is also true is that the tests are not meant to be run in the CI
build environment but on developers' machines first. Being able to run in the
CI env is a bonus. As a side effect of some technical constraints imposed by
such environments (slow VMs with flaky timings, hosts enforcing at least a
little bit of security, etc.), we do expect that some tests will randomly
fail. These ones could be tagged as such and just report a counter of
failures among the more or less expected ones. When you're used to seeing
that 4 to 6 tests usually fail and suddenly you find 13 that have failed, you
can be interested in having a look there, even if it's possibly just to start
it again to confirm. And these ones should not fail at all in more controlled
environments. There's nothing really problematic here in the end; this just
constantly reminds us that not all tests can be automated.

By the way, maybe we could have some form of exclusion for tags instead of
deciding that a test only belongs to one type, because the reality is that we
do *not* want to run certain tests. The most common ones we don't want to run
locally are "slow" and "bug", which are already exclusive to each other. But
by tagging tests with multiple labels we could then decide to exclude some
labels during the build. And in this case we could tag some tests as
"flaky-on-cirrus", "flaky-on-travis", "flaky-in-vm", "flaky-in-container",
"flaky-firewall", etc. and ignore them in such environments.

Cheers,
Willy
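The multi-label exclusion idea above could be sketched roughly as follows. Note that the "#LABELS:" comment convention and the REGTESTS_EXCLUDE variable are hypothetical illustrations, not haproxy's actual reg-test mechanism (which assigns a single type per test):

```shell
#!/bin/sh
# Sketch of the multi-label exclusion idea discussed above. The "#LABELS:"
# comment convention and the REGTESTS_EXCLUDE variable are hypothetical.

REGTESTS_EXCLUDE="${REGTESTS_EXCLUDE:-flaky-on-cirrus,flaky-in-vm}"

# Return 0 (skip) if the .vtc file given as $1 carries any excluded label.
should_skip() {
    labels=$(sed -n 's/^#LABELS://p' "$1")
    for excl in $(printf '%s' "$REGTESTS_EXCLUDE" | tr ',' ' '); do
        for lab in $labels; do
            [ "$lab" = "$excl" ] && return 0
        done
    done
    return 1
}

# Demo with two throwaway test files.
mkdir -p /tmp/regtests-demo
printf '#LABELS: slow flaky-on-cirrus\n' > /tmp/regtests-demo/h00000.vtc
printf '#LABELS: default\n' > /tmp/regtests-demo/h00001.vtc

for vtc in /tmp/regtests-demo/*.vtc; do
    if should_skip "$vtc"; then
        echo "SKIP $vtc"
    else
        echo "RUN  $vtc"   # here the runner would invoke VTest on $vtc
    fi
done
```

With the defaults above, the first demo file is skipped (it carries "flaky-on-cirrus") and the second one runs, so the same test file can carry several environment-specific labels at once.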
Re: FreeBSD CI builds fail
Willy,

On 24.07.19 at 05:55, Willy Tarreau wrote:
> I also noticed the build failure but couldn't find any link to the build
> history to figure when it started to fail. How did you figure that the
> commit above was the first one ?

While I did it as Ilya did, by scrolling through GitHub's commit list, there
is also:

Travis: https://travis-ci.com/haproxy/haproxy/builds
Cirrus: https://cirrus-ci.com/github/haproxy/haproxy

Keep in mind for both that only the current head after a push is being built,
so larger pushes might hide issues from CI. In this specific case the
offending patch was pushed together with 7764a57d3292b6b4f1e488b
("BUG/MEDIUM: threads: cpu-map designating a single") and only the latter was
tested.

> > This one fails because there's a L4 timeout, I can probably update the
> > regex to take that into account, the interesting part is the failure and
> > the step at which it fails, but for now we expect a connection failure
> > and not a timeout.
>
> There's always the possibility (especially in CI environments) that some
> rules are in place on the system to prevent connections to unexpected ports.
>
> Ideally we'd need a level of failure in CI builds. Some should be just of
> level "info" and not cause a build error because we'd know they are likely
> to fail but are still interested in the logs. But I don't think we can do
> this.

I'm not sure this is possible either, but I also don't think it's a good
idea, because then you get used to this kind of issue and ignore it. For
example this one would probably have been written off as "ah, it's just
flaky" instead of actually investigating what's wrong:
https://github.com/haproxy/haproxy/issues/118

Best regards
Tim Düsterhus
Re: FreeBSD CI builds fail
On Wed, Jul 24, 2019 at 08:55, Willy Tarreau wrote:
> Hi guys,
>
> On Tue, Jul 23, 2019 at 08:37:37PM +0200, Jerome Magnin wrote:
> > On Tue, Jul 23, 2019 at 07:09:57PM +0200, Tim Düsterhus wrote:
> > > Jérôme,
> > > Ilya,
> > >
> > > I noticed that FreeBSD CI fails since
> > > https://github.com/haproxy/haproxy/commit/885f64fb6da0a349dd3182d21d337b528225c517.
> > >
> > > One example is here: https://github.com/haproxy/haproxy/runs/169980019
>
> I also noticed the build failure but couldn't find any link to the build
> history to figure when it started to fail. How did you figure that the
> commit above was the first one ?

[image: Screenshot from 2019-07-24 11-43-30.png]

> > This one fails because there's a L4 timeout, I can probably update the
> > regex to take that into account, the interesting part is the failure and
> > the step at which it fails, but for now we expect a connection failure
> > and not a timeout.
>
> There's always the possibility (especially in CI environments) that some
> rules are in place on the system to prevent connections to unexpected
> ports.
>
> Ideally we'd need a level of failure in CI builds. Some should be just of
> level "info" and not cause a build error because we'd know they are likely
> to fail but are still interested in the logs. But I don't think we can do
> this.
>
> Willy
Re: FreeBSD CI builds fail
Hi guys,

On Tue, Jul 23, 2019 at 08:37:37PM +0200, Jerome Magnin wrote:
> On Tue, Jul 23, 2019 at 07:09:57PM +0200, Tim Düsterhus wrote:
> > Jérôme,
> > Ilya,
> >
> > I noticed that FreeBSD CI fails since
> > https://github.com/haproxy/haproxy/commit/885f64fb6da0a349dd3182d21d337b528225c517.
> >
> > One example is here: https://github.com/haproxy/haproxy/runs/169980019

I also noticed the build failure but couldn't find any link to the build
history to figure when it started to fail. How did you figure that the
commit above was the first one ?

> This one fails because there's a L4 timeout, I can probably update the regex
> to take that into account, the interesting part is the failure and the step
> at which it fails, but for now we expect a connection failure and not a
> timeout.

There's always the possibility (especially in CI environments) that some
rules are in place on the system to prevent connections to unexpected ports.

Ideally we'd need a level of failure in CI builds. Some should be just of
level "info" and not cause a build error because we'd know they are likely to
fail but are still interested in the logs. But I don't think we can do this.

Willy
Re: FreeBSD CI builds fail
On Tue, Jul 23, 2019 at 07:09:57PM +0200, Tim Düsterhus wrote:
> Jérôme,
> Ilya,
>
> I noticed that FreeBSD CI fails since
> https://github.com/haproxy/haproxy/commit/885f64fb6da0a349dd3182d21d337b528225c517.
>
> One example is here: https://github.com/haproxy/haproxy/runs/169980019
>
> It should be investigated whether the reg-test is valid for FreeBSD and
> either be fixed or disabled.
>
> Best regards
> Tim Düsterhus

Thanks Tim and Ilya,

This one fails because there's an L4 timeout. I can probably update the regex
to take that into account; the interesting part is the failure and the step
at which it fails, but for now we expect a connection failure and not a
timeout.

I'm a bit more concerned about the other one reported by Ilya, where the
backend server started by VTest won't accept connections. I'll look into this
one further.

Jérôme
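The regex widening described above could look like the following sketch: an alternation that accepts either a refused connection or a timeout. The log strings here are hypothetical placeholders, not the reg-test's real expected output:

```shell
#!/bin/sh
# Hypothetical log lines standing in for what the reg-test captures;
# the exact strings checked by the real .vtc file may differ.
refused="connect(): Connection refused"
timedout="connect(): Operation timed out"

# The original expression only matched a refused connection; widening it
# with an alternation also accepts the timeout seen on FreeBSD CI.
pattern='Connection refused|timed out'

echo "$refused"  | grep -Eq "$pattern" && echo "refusal matches"
echo "$timedout" | grep -Eq "$pattern" && echo "timeout matches"
```

The trade-off is the one Jérôme notes: the test still confirms the connection attempt fails at the expected step, but no longer distinguishes *how* it fails.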
FreeBSD CI builds fail
Jérôme,
Ilya,

I noticed that FreeBSD CI fails since
https://github.com/haproxy/haproxy/commit/885f64fb6da0a349dd3182d21d337b528225c517.

One example is here: https://github.com/haproxy/haproxy/runs/169980019

It should be investigated whether the reg-test is valid for FreeBSD and
either be fixed or disabled.

Best regards
Tim Düsterhus