On Wed, Jul 24, 2019 at 10:01:33AM +0200, Tim Düsterhus wrote:
> Am 24.07.19 um 05:55 schrieb Willy Tarreau:
> > I also noticed the build failure but couldn't find any link to the build
> > history to figure when it started to fail. How did you figure that the
> > commit above was the first one ?
> 
> While I did it as Ilya did by scrolling through GitHub's commit list,

That was the least natural way for me to do it. Thanks to Ilya for the
screenshot, by the way. I clicked on the red cross, then on the freebsd
link reporting the failure, and searched the history there but couldn't
find it.

> there is also:
> 
> Travis: https://travis-ci.com/haproxy/haproxy/builds
> Cirrus: https://cirrus-ci.com/github/haproxy/haproxy

Ah yes, this one is more useful, that's what I was looking for. I just
cannot figure out how to reach it from the build status page :-/

> Keep in mind for both that only the current head after a push is being
> built, so larger pushes might hide issues to CI.

Of course! But the goal is not to build every single commit either; it's
to detect early that something went wrong instead of discovering it only
after a version is released, as we used to in the past.

> In this specific case
> the offending patch was pushed together with 7764a57d3292b6b4f1e488b
> ("BUG/MEDIUM: threads: cpu-map designating a single") and only the
> latter was tested.

Yep!

> > Ideally we'd need a level of failure in CI builds. Some should be just of
> > level "info" and not cause a build error because we'd know they are likely
> > to fail but are still interested in the logs. But I don't think we can do
> > this.
> > 
> 
> I'm not sure this is possible either, but I also don't think it's a good
> idea, because then you get used to this kind of issue and ignore it. For
> example this one would probably have been written off as "ah, it's just
> flaky" instead of actually investigating what's wrong:
> https://github.com/haproxy/haproxy/issues/118

It's true. But what is also true is that the tests are not primarily meant
to be run in the CI build environment but on developers' machines first.
Being able to run them in the CI env is a bonus. As a side effect of some
technical constraints imposed by such environments (slow VMs with flaky
timings, a host enforcing at least a little bit of security, etc.), we do
expect that some tests will randomly fail. These could be tagged as such
and only reported as a counter of failures among the more or less expected
ones. When you're used to seeing 4 to 6 tests usually fail and suddenly you
find 13 that have failed, you may be interested in having a look there,
even if it's possibly just to start the run again to confirm. And these
tests should not fail at all in more controlled environments.
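
Just to illustrate the idea (a rough sketch only; the result format, tag
names and threshold are all made up for the example, this is not our
actual reg-tests runner), the wrapper could do something like this:

# Hypothetical tags and band of "expected" flaky failures.
FLAKY_TAGS = {"flaky-on-cirrus", "flaky-on-travis",
              "flaky-in-vm", "flaky-in-container"}
USUAL_FLAKY_FAILURES = 6  # the "4 to 6" band we're used to seeing

# Each entry: (test name, set of tags, passed?); invented for the example.
results = [
    ("some-fast-test",   set(),               True),
    ("some-timing-test", {"flaky-in-vm"},     False),
    ("some-other-test",  {"flaky-on-cirrus"}, False),
]

hard_failures  = [n for n, tags, ok in results if not ok and not (tags & FLAKY_TAGS)]
flaky_failures = [n for n, tags, ok in results if not ok and (tags & FLAKY_TAGS)]

if hard_failures:
    print("FAIL: %d unexpected failure(s): %s"
          % (len(hard_failures), ", ".join(hard_failures)))
elif len(flaky_failures) > USUAL_FLAKY_FAILURES:
    print("WARN: %d flaky failures, above the usual band, worth a look or a re-run"
          % len(flaky_failures))
else:
    print("OK: %d failure(s), all among the expected flaky ones"
          % len(flaky_failures))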

There's nothing really problematic here in the end; it just constantly
reminds us that not all tests can be automated.

By the way, maybe we could have some form of exclusion by tags instead
of deciding that a test belongs to only one type. The reality is that
there are certain tests we do *not* want to run. The most common ones we
don't want to run locally are "slow" and "bug", which are currently
exclusive to each other. But by tagging tests with multiple labels we
could then decide to exclude some labels during the build. In this
case we could tag some tests as "flaky-on-cirrus", "flaky-on-travis",
"flaky-in-vm", "flaky-in-container", "flaky-firewall", etc., and ignore
them in such environments.
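
To make it concrete, a quick sketch of how such an exclusion could look
(file names, labels and the EXCLUDE_LABELS variable are all invented for
the example; on e.g. Cirrus one would export something like
EXCLUDE_LABELS="slow,flaky-on-cirrus,flaky-in-vm"):

import os

# Hypothetical tests, each carrying a set of labels.
tests = {
    "reg-tests/a.vtc": {"slow"},
    "reg-tests/b.vtc": {"bug", "flaky-on-cirrus"},
    "reg-tests/c.vtc": set(),
}

# The environment decides which labels to exclude.
excluded = set(filter(None, os.environ.get("EXCLUDE_LABELS", "").split(",")))

for name, labels in tests.items():
    if labels & excluded:
        print("skipping %s (%s)" % (name, ", ".join(sorted(labels & excluded))))
    else:
        print("running %s" % name)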

Cheers,
Willy
