Hi David,

Thanks for bringing this up and also for creating the revert PRs. I think there has been an effort in the community to fix a lot of flaky tests (like the ones around MirrorMaker). I also agree that we shouldn't merge PRs without green builds and that we should look to ignore known flaky tests. For example, I did a quick search for some of the common (and sometimes longstanding) flaky tests, and this is a brief list. Some of them have JIRAs associated with them as well:

1) org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated => https://issues.apache.org/jira/browse/KAFKA-8115
2) kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize => https://issues.apache.org/jira/browse/KAFKA-15421
3) org.apache.kafka.controller.QuorumControllerTest.testFenceMultipleBrokers (couldn't find a JIRA)
4) integration.kafka.server.FetchFromFollowerIntegrationTest.testRackAwareRangeAssignor => https://issues.apache.org/jira/browse/KAFKA-15020
5) EosIntegrationTest => https://issues.apache.org/jira/browse/KAFKA-15690

The only open question is where we draw the line between disabling a genuinely flaky test and looking to fix it. I feel that could get confusing at times, especially if the flaky test involved is in an unrelated part of the code (like a flaky Connect test failing on a Group Coordinator or Streams PR).

Thanks!
Sagar.

On Sat, Nov 11, 2023 at 3:31 PM David Jacot <dja...@confluent.io.invalid> wrote:

> Hi all,
>
> The state of our CI worries me a lot. Just this week, we merged two PRs
> with compilation errors and one PR introducing persistent failures. This
> really hurts the quality and the velocity of the project and it basically
> defeats the purpose of having a CI because we tend to ignore it nowadays.
>
> Should we continue to merge without a green build? No! We should not so I
> propose to prevent merging a pull request without a green build. This is a
> really simple and bold move that will prevent us from introducing
> regressions and will improve the overall health of the project. At the same
> time, I think that we should disable all the known flaky tests, raise jiras
> for them, find an owner for each of them, and fix them.
>
> What do you think?
>
> Best,
> David
>