Sophie,

> are tests automatically sorted into these buckets or do we have to manually move them

This part hasn't changed -- we still need to manually mark tests as @Flaky
when they are flaky. The recent change has just separated out the flaky tests
into a parallel job rather than a sequential step. It also added the "new"
tests as a separate job rather than baking them into the now-removed
quarantinedTest step.

New tests are determined based on the test-catalog branch
https://github.com/apache/kafka/tree/test-catalog

On Tue, Feb 25, 2025 at 9:41 PM Sophie Blee-Goldman <sop...@responsive.dev> wrote:

> Thanks David! This is awesome, really glad to see this effort to reduce
> test flakiness.
>
> One question -- are tests automatically sorted into these buckets or do we
> have to manually move them? And if so, how does that work (eg a test in
> "main" becomes flaky)
>
> On Tue, Feb 25, 2025 at 3:20 PM David Arthur <mum...@gmail.com> wrote:
>
> > > Can we merge the PR if only flaky or new tests fail?
> >
> > I agree with Ismael that new tests must be solid -- no flakiness should be
> > expected when adding a test. Obviously, we will miss things, so we have to
> > tolerate them on trunk (along with environmental flaky factors).
> >
> > If there are existing *unrelated* tests that are flaky on a PR, that is
> > fine. Ideally, each failing or flaky test on a PR should be
> > investigated.
> >
> > For PRs:
> > "new" tests -- no flakiness
> > "flaky" tests -- expect some flakiness, still look at these failures to
> > make sure the PR didn't make it worse
> > "main" tests -- normal amounts of flakiness, still look at these failures
> > to make sure the PR didn't make it worse (and file a Jira to report the
> > flaky test, if applicable)
> >
> > ---
> >
> > BTW If your PRs have failed build scans, try merging in trunk.
> >
> > -David A
> >
> > On Mon, Feb 24, 2025 at 7:16 PM Ismael Juma <m...@ismaeljuma.com> wrote:
> >
> >> > Can we merge the PR if only flaky or new tests fail?
> >>
> >> We certainly cannot merge if new tests fail - the goal is to ensure new
> >> tests are solid.
> >>
> >> For the flaky ones, I'm not sure how we intend to use these. I would prefer
> >> if we only merge when the PR status is green. Otherwise, we often end up
> >> merging things we shouldn't (by accident).
> >>
> >> Ismael
> >>
> >> On Mon, Feb 24, 2025 at 2:50 PM Chia-Ping Tsai <chia7...@gmail.com> wrote:
> >>
> >> > hi David
> >> >
> >> > Thanks for all your improvements. I do love the new test suites!
> >> >
> >> > one small question:
> >> > Can we merge the PR if only flaky or new tests fail? Sometimes, I list
> >> > tickets for flaky (or unrelated) tests before merging. However, since we
> >> > now have a separate test suite for stable tests (non-flaky, non-new), I
> >> > assume the new condition is that "stable tests must pass"?
> >> >
> >> > Best,
> >> > Chia-Ping
> >> >
> >> > On Tue, Feb 25, 2025 at 6:24 AM Ismael Juma <m...@ismaeljuma.com> wrote:
> >> >
> >> > > Thanks David - this is another important improvement to our CI pipeline
> >> > > and is super helpful for the project and community.
> >> > >
> >> > > Ismael
> >> > >
> >> > > On Mon, Feb 24, 2025 at 2:15 PM David Arthur <mum...@gmail.com> wrote:
> >> > >
> >> > > > Hey everyone, just wanted to inform you all that we just merged
> >> > > > KAFKA-18748
> >> > > >
> >> > > > https://github.com/apache/kafka/pull/18770
> >> > > >
> >> > > > This splits our CI workflow into more parallel jobs which run subsets of
> >> > > > the tests with different settings. The JUnit tests are now split into
> >> > > > "new", "flaky", and the remainder.
> >> > > >
> >> > > > "New" tests are what we previously called auto-quarantined tests.
> >> > > >
> >> > > > On PR builds, "new" tests are anything that does not exist on trunk. They
> >> > > > are run with zero tolerance for flakiness.
> >> > > >
> >> > > > On trunk builds, "new" tests are anything added in the last 7 days. They
> >> > > > are run with some tolerance for flakiness.
> >> > > >
> >> > > > The point of this is to discourage flaky tests from being added to trunk.
> >> > > >
> >> > > > Please update your PRs with trunk and let me know if you see any weirdness.
> >> > > > Feel free to tag me in the PR, reply to this thread, or email me directly
> >> > > > with questions.
> >> > > >
> >> > > > Thanks!
> >> > > > David A

--
David Arthur
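
P.S. For anyone skimming the thread, the bucketing rules discussed above can be sketched roughly as follows. This is only an illustration in Python, not the actual CI code: the function name `classify_test`, the dict standing in for the test-catalog branch, and the test-name format are all made up here. It also assumes that "new" takes precedence over @Flaky, which the thread does not state.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sketch, not Kafka's real CI logic. It only mirrors the rules
# described in this thread.
NEW_TEST_WINDOW = timedelta(days=7)  # trunk builds: "added in the last 7 days"


def classify_test(name, marked_flaky, trunk_catalog, is_pr_build, now):
    """Return "new", "flaky", or "main" for a single test.

    name          -- test identifier (the format used below is an assumption)
    marked_flaky  -- set of test names manually marked with @Flaky
    trunk_catalog -- dict: test name -> datetime it first appeared on trunk
                     (a stand-in for the test-catalog branch)
    is_pr_build   -- True for PR builds, False for trunk builds
    now           -- current time, passed in to keep the function deterministic
    """
    first_seen = trunk_catalog.get(name)
    if is_pr_build:
        # PR builds: "new" means the test does not exist on trunk at all.
        is_new = first_seen is None
    else:
        # Trunk builds: "new" means added within the last 7 days.
        is_new = first_seen is None or now - first_seen <= NEW_TEST_WINDOW
    if is_new:
        return "new"  # assumption: "new" wins over a manual @Flaky mark
    if name in marked_flaky:
        return "flaky"
    return "main"


# Example usage with made-up test names.
now = datetime(2025, 2, 25, tzinfo=timezone.utc)
catalog = {
    "FooTest#testOld": now - timedelta(days=30),
    "FooTest#testRecent": now - timedelta(days=2),
    "BarTest#testUnstable": now - timedelta(days=30),
}
flaky = {"BarTest#testUnstable"}

for test in sorted(catalog) + ["BazTest#testAddedInPr"]:
    print(test, "PR:", classify_test(test, flaky, catalog, True, now),
          "trunk:", classify_test(test, flaky, catalog, False, now))
```

Note that nothing in this sketch moves a test into "flaky" automatically: as stated at the top of this message, that step is still a manual @Flaky annotation.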