On 9 February 2018 at 20:39, Thiago Macieira <thiago.macie...@intel.com> wrote:
> On Friday, 9 February 2018 08:32:20 PST Ville Voutilainen wrote:
>> On 9 February 2018 at 18:17, Thiago Macieira <thiago.macie...@intel.com> wrote:
>> > We do have BLACKLISTs this time and I complain every time I see one being
>> > added without even an attempt at figuring out what's wrong with the test,
>> > or when the match is overly aggressive ("it fails on Ubuntu in the CI, so
>> > it must
>> It gives me no end of heartburn that we prefer having integrations be blocked
>> for days to doing over-aggressive blacklisting. Having flaky tests is
>> indistinguishable from having no tests at all, so it boggles my mind why some
>> of us are so worried about potentially over-done blacklists.
> I'm not asking someone to spend days figuring out what's wrong. I know it
> takes time.
> But I am asking to do a minimal investigation. In most cases of blacklisting,
> the test has been failing for days, if not months. Spending an hour or two to
> understand why it's failing and whether it's something that only happens in
> the CI should be the norm.
The problem is that a large number of tests have been failing for weeks. In some
cases, months. In some cases, for a year. That results in a restage storm, and
plants a doubt in every submitter's mind about whether CI failures are something
to really bother about beyond hitting the restage button. I doubt either of us
thinks that to be optimal.
> One of the consequences of blacklisting is that "out of sight is out of mind".
> We'll never remove those blacklists again and that makes us have a false sense
> of security, that we have tested, when in fact we're just ignoring the
> failures. That is just like the Qt CI back in 2006-2009, which is what I said
> I don't want to get back to.
Currently the people working on blacklist patches create a bug report for every
flaky test that they have to blacklist. Whether we believe those bugs to be
is another matter, but having every contributor restage every patch more than
half a dozen times because there are flaky test failures in tests *COMPLETELY*
unrelated to anything the submitted change does is unacceptable, and it's high
time we rectify
that problem, even if a small percentage of our tests gets
Development mailing list