Re: Improve flaky test reporting (KAFKA-12216)
Hi David,

I agree it'll be great if we could distinguish between "flakyFailure" and "alwaysFailure". And I also agree we should still show the "flakyFailure" tests because there could be some potential bugs there.

Thanks for working on it.

Luke

On Tue, Feb 13, 2024 at 12:57 AM Ismael Juma wrote:

> Sounds good. I am supportive of this change.
>
> Ismael
Re: Improve flaky test reporting (KAFKA-12216)
Sounds good. I am supportive of this change.

Ismael

On Mon, Feb 12, 2024 at 7:43 AM David Jacot wrote:

> Hi Bruno,
>
> Yes, you're right. Sorry for the typo.
>
> Hi Ismael,
>
> You're right. Jenkins does not support the flakyFailure element and hence the information is not in the Jenkins report at all. I am still experimenting with printing the flaky tests somewhere. I will update this thread if I get something working. In the meantime, I wanted to gauge whether there is support for it.
>
> Cheers,
> David
Re: Improve flaky test reporting (KAFKA-12216)
Hi Bruno,

Yes, you're right. Sorry for the typo.

Hi Ismael,

You're right. Jenkins does not support the flakyFailure element and hence the information is not in the Jenkins report at all. I am still experimenting with printing the flaky tests somewhere. I will update this thread if I get something working. In the meantime, I wanted to gauge whether there is support for it.

Cheers,
David

On Mon, Feb 12, 2024 at 3:59 PM Ismael Juma wrote:

> Hi David,
>
> Your message didn't make this clear, but you are saying that Jenkins does _not_ support the flakyFailure element and hence this information will be completely missing from the Jenkins report. Have we considered including the flakyFailure information ourselves? I have seen that being done and it seems strictly better than totally ignoring it.
>
> Ismael
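One possible way to "print the flaky tests somewhere" is a small post-processing step over the JUnit XML files that the build writes. The Python sketch below is only an illustration under assumptions: the helper names `find_flaky_tests` and `format_summary` are hypothetical, the report directory is passed in by the caller, and the only thing taken from this thread is that a test which passed after a retry carries a `<flakyFailure>` child inside its `<testcase>`.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def find_flaky_tests(report_dir):
    """Return sorted 'Class.method' names of testcases with a <flakyFailure> child."""
    flaky = set()
    for path in Path(report_dir).rglob("*.xml"):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        for testcase in root.iter("testcase"):
            if testcase.find("flakyFailure") is not None:
                flaky.add(f"{testcase.get('classname')}.{testcase.get('name')}")
    return sorted(flaky)


def format_summary(flaky):
    """Build a plain-text summary that could be echoed into the build log."""
    if not flaky:
        return "No flaky tests detected."
    lines = [f"{len(flaky)} flaky test(s) passed only after a retry:"]
    lines += [f"  - {name}" for name in flaky]
    return "\n".join(lines)


if __name__ == "__main__":
    # The directory to scan is an assumption; pass the real report location.
    report_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    print(format_summary(find_flaky_tests(report_dir)))
```

Run against the directory holding the XML reports, the script prints one line per flaky test, which a CI job could surface in its console output even when the report viewer ignores the element.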
Re: Improve flaky test reporting (KAFKA-12216)
Hi David,

Your message didn't make this clear, but you are saying that Jenkins does _not_ support the flakyFailure element and hence this information will be completely missing from the Jenkins report. Have we considered including the flakyFailure information ourselves? I have seen that being done and it seems strictly better than totally ignoring it.

Ismael

On Mon, Feb 12, 2024 at 12:11 AM David Jacot wrote:

> [...]
> If your CI server understands this format, it will indicate that the test was flaky. If it does not, it will indicate that the test succeeded as it will ignore the `<flakyFailure>` information.
> [...]
Re: Improve flaky test reporting (KAFKA-12216)
Hi David,

I guess you meant to say "This does not mean that we should NOT continue our effort to reduce the number of flaky tests."

I totally agree with what you wrote. I am also +1 on considering all failures for unit tests.

Best,
Bruno

On 2/12/24 9:11 AM, David Jacot wrote:

> [...]
> [1] applies it to both unit and integration tests. Following the discussion in the `github build queue` thread, it may be better to only apply it to the integration tests. Being stricter with unit tests would make sense.
>
> This does not mean that we should continue our effort to reduce the number of flaky tests. For this, I propose to keep using Gradle Enterprise. It provides a nice report for them that we can leverage.
> [...]
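The "stricter for unit tests" idea can be pictured as a policy applied to a merged report: unit tests count every failure, including one that later passed on retry, while integration tests only count failures that persisted. The Python sketch below is an illustration only, not how the Kafka build does it. The function name `build_should_fail` and its `strict` flag are hypothetical, and in a real Gradle build the same effect may be achieved by simply not retrying unit tests.

```python
import xml.etree.ElementTree as ET


def build_should_fail(xml_text, strict):
    """Decide whether a JUnit XML report should fail the build.

    strict=True  (e.g. unit tests): flaky and hard failures both count.
    strict=False (e.g. integration tests): only hard failures count.
    """
    root = ET.fromstring(xml_text)
    for testcase in root.iter("testcase"):
        if testcase.find("failure") is not None:
            return True  # failed on every attempt
        if strict and testcase.find("flakyFailure") is not None:
            return True  # failed at least once, then passed on retry
    return False


if __name__ == "__main__":
    # Made-up report with a single flaky test.
    report = (
        '<testsuite><testcase name="testFlaky" classname="ExampleTest">'
        '<flakyFailure message="timed out"/></testcase></testsuite>'
    )
    print("unit policy fails build:", build_should_fail(report, strict=True))
    print("integration policy fails build:", build_should_fail(report, strict=False))
```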
Improve flaky test reporting (KAFKA-12216)
Hi folks,

I have been playing with the `reports.junitXml.mergeReruns` setting in Gradle [1]. From the Gradle doc:

> When mergeReruns is enabled, if a test fails but is then retried and succeeds, its failures will be recorded as `<flakyFailure>` instead of `<failure>`, within one `<testcase>`. This is effectively the reporting produced by the surefire plugin of Apache Maven™ when enabling reruns. If your CI server understands this format, it will indicate that the test was flaky. If it does not, it will indicate that the test succeeded as it will ignore the `<flakyFailure>` information. If the test does not succeed (i.e. it fails for every retry), it will be indicated as having failed whether your tool understands this format or not.

With this, we get really close to having green builds [2] all the time. There are only a few tests which are too flaky. We should address or disable those.

I think that this would help us a lot because it would reduce the noise that we get in pull requests. At the moment, there are just too many failed tests reported so it is really hard to know whether a pull request is actually fine or not.

[1] applies it to both unit and integration tests. Following the discussion in the `github build queue` thread, it may be better to only apply it to the integration tests. Being stricter with unit tests would make sense.

This does not mean that we should continue our effort to reduce the number of flaky tests. For this, I propose to keep using Gradle Enterprise. It provides a nice report for them that we can leverage.

Thoughts?

Best,
David

[1] https://github.com/apache/kafka/pull/14862
[2] https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14862/19/tests
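To make the report format quoted from the Gradle doc concrete, here is a minimal Python sketch that classifies each `<testcase>` in a merged report as passed, flaky, or failed. The sample XML, the test names, and the `classify` and `classify_suite` helpers are assumptions made up for illustration. They are not taken from the Kafka build or from [1].

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment; test names and messages are made up.
SAMPLE = """<testsuite name="ExampleTest" tests="3">
  <testcase name="testStable" classname="ExampleTest"/>
  <testcase name="testFlaky" classname="ExampleTest">
    <flakyFailure message="timed out" type="java.lang.AssertionError">trace</flakyFailure>
  </testcase>
  <testcase name="testBroken" classname="ExampleTest">
    <failure message="expected 1 but was 2" type="java.lang.AssertionError">trace</failure>
  </testcase>
</testsuite>"""


def classify(testcase):
    """Return 'failed', 'flaky', or 'passed' for one <testcase> element."""
    if testcase.find("failure") is not None:
        return "failed"  # failed on every attempt
    if testcase.find("flakyFailure") is not None:
        return "flaky"   # failed at least once, then succeeded on retry
    return "passed"


def classify_suite(xml_text):
    """Map each test name in the report to its classification."""
    root = ET.fromstring(xml_text)
    return {tc.get("name"): classify(tc) for tc in root.iter("testcase")}


if __name__ == "__main__":
    for name, status in classify_suite(SAMPLE).items():
        print(f"{name}: {status}")
```

A tool that does not know about `<flakyFailure>` would skip the second branch, so it would report `testFlaky` as passed. That matches the behavior the doc describes for CI servers that ignore the element.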