Sophie,

> are tests automatically sorted into these buckets or do we have to
> manually move them

This part hasn't changed -- we still need to manually mark tests as @Flaky
when they are flaky. The recent change just moved the flaky tests into a
parallel job rather than a sequential step. It also made the "new" tests a
separate job rather than baking them into the now-removed quarantinedTest
step.

New tests are determined from the test-catalog branch:
https://github.com/apache/kafka/tree/test-catalog
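
As a rough sketch of how the three buckets relate on a PR build (this is not
the actual build logic; the function name, the test-ID format, and the choice
to check "new" before "flaky" are assumptions made only for illustration):

```python
def bucket_for(test_id, trunk_catalog, flaky_marked):
    """Return "new", "flaky", or "main" for one test on a PR build.

    trunk_catalog: set of test IDs known on trunk (hypothetical stand-in
                   for what the test-catalog branch records).
    flaky_marked:  set of test IDs that someone manually marked @Flaky.
    """
    # Assumption: a test missing from the trunk catalog counts as "new",
    # regardless of any other marking.
    if test_id not in trunk_catalog:
        return "new"
    # Flaky status is never inferred; it comes only from the manual mark.
    if test_id in flaky_marked:
        return "flaky"
    return "main"


# Hypothetical test IDs, for illustration only.
trunk_catalog = {"FooTest#testA", "FooTest#testB"}
flaky_marked = {"FooTest#testB"}

for test_id in ("FooTest#testA", "FooTest#testB", "FooTest#testC"):
    print(test_id, "->", bucket_for(test_id, trunk_catalog, flaky_marked))
```

The point is only that "new" comes from comparing against the catalog, while
"flaky" comes from the manual mark.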



On Tue, Feb 25, 2025 at 9:41 PM Sophie Blee-Goldman <sop...@responsive.dev>
wrote:

> Thanks David! This is awesome, really glad to see this effort to reduce
> test flakiness.
>
> One question -- are tests automatically sorted into these buckets or do we
> have to manually move them? And if so, how does that work (eg a test in
> "main" becomes flaky)
>
> On Tue, Feb 25, 2025 at 3:20 PM David Arthur <mum...@gmail.com> wrote:
>
> > > Can we merge the PR if only flaky or new tests fail?
> >
> > I agree with Ismael that new tests must be solid -- no flakiness should
> > be expected when adding a test. Obviously, we will miss things, so we
> > have to tolerate them on trunk (along with environmental flaky factors).
> >
> > If there are existing *unrelated* tests that are flaky on a PR, that is
> > fine. Ideally, each failing or flaky test on a PR should be
> > investigated.
> >
> > For PRs:
> > "new" tests -- no flakiness
> > "flaky" tests -- expect some flakiness, still look at these failures to
> > make sure the PR didn't make it worse
> > "main" tests -- normal amounts of flakiness, still look at these failures
> > to make sure the PR didn't make it worse (and file a Jira to report
> > flaky, if applicable)
> >
> > ---
> >
> >
> > BTW, if your PRs have failed build scans, try merging in trunk.
> >
> > [image: image.png]
> >
> > -David A
> >
> >
> > On Mon, Feb 24, 2025 at 7:16 PM Ismael Juma <m...@ismaeljuma.com> wrote:
> >
> >> >
> >> > Can we merge the PR if only flaky or new tests fail?
> >>
> >>
> >> We certainly cannot merge if new tests fail - the goal is to ensure new
> >> tests are solid.
> >>
> >> For the flaky ones, I'm not sure how we intend to use these. I would
> >> prefer if we only merge when the PR status is green. Otherwise, we often
> >> end up merging things we shouldn't (by accident).
> >>
> >> Ismael
> >>
> >> On Mon, Feb 24, 2025 at 2:50 PM Chia-Ping Tsai <chia7...@gmail.com>
> >> wrote:
> >>
> >> > Hi David,
> >> >
> >> > Thanks for all your improvements. I do love the new test suites!
> >> >
> >> > One small question:
> >> > Can we merge the PR if only flaky or new tests fail? Sometimes, I list
> >> > tickets for flaky (or unrelated) tests before merging. However, since
> >> > we now have a separate test suite for stable tests (non-flaky,
> >> > non-new), I assume the new condition is that "stable tests must pass"?
> >> >
> >> > Best,
> >> > Chia-Ping
> >> >
> >> >
> >> >
> >> >
> >> > On Tue, Feb 25, 2025 at 6:24 AM Ismael Juma <m...@ismaeljuma.com> wrote:
> >> >
> >> > > Thanks David - this is another important improvement to our CI
> >> > > pipeline and is super helpful for the project and community.
> >> > >
> >> > > Ismael
> >> > >
> >> > > On Mon, Feb 24, 2025 at 2:15 PM David Arthur <mum...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hey everyone, just wanted to inform you all that we just merged
> >> > > > KAFKA-18748
> >> > > >
> >> > > > https://github.com/apache/kafka/pull/18770
> >> > > >
> >> > > > This splits our CI workflow into more parallel jobs which run
> >> > > > subsets of the tests with different settings. The JUnit tests are
> >> > > > now split into "new", "flaky", and the remainder.
> >> > > >
> >> > > > "New" tests are what we previously called auto-quarantined tests.
> >> > > >
> >> > > > On PR builds, "new" tests are anything that does not exist on
> >> > > > trunk. They are run with zero tolerance for flakiness.
> >> > > >
> >> > > > On trunk builds, "new" tests are anything added in the last 7
> >> > > > days. They are run with some tolerance for flakiness.
> >> > > >
> >> > > > The point of this is to discourage flaky tests from being added to
> >> > > > trunk.
> >> > > >
> >> > > > Please update your PRs with trunk and let me know if you see any
> >> > > > weirdness. Feel free to tag me in the PR, reply to this thread, or
> >> > > > email me directly with questions.
> >> > > >
> >> > > > Thanks!
> >> > > > David A
> >> > > >
> >> > >
> >> >
> >>
> >
> >
> > --
> > David Arthur
> >
>


-- 
David Arthur
