Hi David,

I agree it would be great if we could distinguish between "flakyFailure" and "alwaysFailure". I also agree we should still show the "flakyFailure" tests, because they could point to real bugs.
Thanks for working on it.

Luke

On Tue, Feb 13, 2024 at 12:57 AM Ismael Juma <m...@ismaeljuma.com> wrote:

> Sounds good. I am supportive of this change.
>
> Ismael
>
> On Mon, Feb 12, 2024 at 7:43 AM David Jacot <dja...@confluent.io.invalid>
> wrote:
>
> > Hi Bruno,
> >
> > Yes, you're right. Sorry for the typo.
> >
> > Hi Ismael,
> >
> > You're right. Jenkins does not support the flakyFailure element and
> > hence the information is not at all in the Jenkins report. I am still
> > experimenting with printing the flaky tests somewhere. I will update this
> > thread if I get something working. In the meantime, I wanted to gauge
> > whether there is support for it.
> >
> > Cheers,
> > David
> >
> > On Mon, Feb 12, 2024 at 3:59 PM Ismael Juma <m...@ismaeljuma.com> wrote:
> >
> > > Hi David,
> > >
> > > Your message didn't make this clear, but you are saying that Jenkins does
> > > _not_ support the flakyFailure element and hence this information will be
> > > completely missing from the Jenkins report. Have we considered including
> > > the flakyFailure information ourselves? I have seen that being done and it
> > > seems strictly better than totally ignoring it.
> > >
> > > Ismael
> > >
> > > On Mon, Feb 12, 2024 at 12:11 AM David Jacot <dja...@confluent.io.invalid>
> > > wrote:
> > >
> > > > Hi folks,
> > > >
> > > > I have been playing with the `reports.junitXml.mergeReruns` setting in
> > > > Gradle [1]. From the Gradle doc:
> > > >
> > > > > When mergeReruns is enabled, if a test fails but is then retried and
> > > > > succeeds, its failures will be recorded as <flakyFailure> instead of
> > > > > <failure>, within one <testcase>. This is effectively the reporting
> > > > > produced by the surefire plugin of Apache Maven™ when enabling reruns.
> > > > > If your CI server understands this format, it will indicate that the
> > > > > test was flaky. If it does not, it will indicate that the test
> > > > > succeeded as it will ignore the <flakyFailure> information. If the
> > > > > test does not succeed (i.e. it fails for every retry), it will be
> > > > > indicated as having failed whether your tool understands this format
> > > > > or not.
> > > >
> > > > With this, we get really close to having green builds [2] all the time.
> > > > There are only a few tests which are too flaky. We should address or
> > > > disable those.
> > > >
> > > > I think that this would help us a lot because it would reduce the noise
> > > > that we get in pull requests. At the moment, there are just too many
> > > > failed tests reported so it is really hard to know whether a pull
> > > > request is actually fine or not.
> > > >
> > > > [1] applies it to both unit and integration tests. Following the
> > > > discussion in the `github build queue` thread, it may be better to only
> > > > apply it to the integration tests. Being stricter with unit tests would
> > > > make sense.
> > > >
> > > > This does not mean that we should stop our effort to reduce the number
> > > > of flaky tests. For this, I propose to keep using Gradle Enterprise. It
> > > > provides a nice report for them that we can leverage.
> > > >
> > > > Thoughts?
> > > >
> > > > Best,
> > > > David
> > > >
> > > > [1] https://github.com/apache/kafka/pull/14862
> > > > [2] https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14862/19/tests
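
For reference, the idea raised in the thread of printing the flaky tests ourselves, since Jenkins ignores the <flakyFailure> element, could look roughly like the sketch below. It is only an illustration and not the approach taken in the pull request: the sample report, the test names, and the `classify_testcases` helper are all hypothetical, and it assumes the merged JUnit XML has the shape described in the Gradle doc quoted above (a <flakyFailure> child inside a <testcase> that eventually passed, a <failure> or <error> child inside one that never did).

```python
import xml.etree.ElementTree as ET

# Hypothetical merged report, shaped as the quoted Gradle doc describes:
# a retried-then-passed test carries <flakyFailure>, a test that failed
# on every attempt carries <failure>.
SAMPLE_REPORT = """<testsuite name="ExampleTest" tests="3">
  <testcase name="testStable" classname="ExampleTest"/>
  <testcase name="testFlaky" classname="ExampleTest">
    <flakyFailure message="timed out" type="java.lang.AssertionError">trace</flakyFailure>
  </testcase>
  <testcase name="testBroken" classname="ExampleTest">
    <failure message="expected 1 but was 2" type="java.lang.AssertionError">trace</failure>
  </testcase>
</testsuite>"""


def classify_testcases(xml_text):
    """Return (flaky, failed) lists of 'classname.name' strings."""
    root = ET.fromstring(xml_text)
    flaky, failed = [], []
    # iter() also matches <testcase> elements nested under a <testsuites> root.
    for case in root.iter("testcase"):
        name = f'{case.get("classname")}.{case.get("name")}'
        if case.find("failure") is not None or case.find("error") is not None:
            # Failed on every attempt: a real ("always") failure.
            failed.append(name)
        elif case.find("flakyFailure") is not None:
            # Failed at least once, then passed on a retry.
            flaky.append(name)
    return flaky, failed


if __name__ == "__main__":
    flaky, failed = classify_testcases(SAMPLE_REPORT)
    for name in flaky:
        print(f"FLAKY  {name}")
    for name in failed:
        print(f"FAILED {name}")
```

Keeping the two lists separate matches the distinction asked for at the top of the thread: flaky tests stay visible without marking the build as failed.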