> Please, do it.
> We can use specific labels to effectively filter those tickets.

We already have a label and a way to discover flaky tests. They are tagged
with the label "flaky-test" [1]. There is also a label "newbie" [2] meant
for folks who are new to the Apache Kafka code base.
My suggestion is to send a broader email to the community (since many will
miss the details in this thread) and call on committers to volunteer as
"shepherds" for these tickets. I can send one out once we have some
consensus on next steps in this thread.


[1]
https://issues.apache.org/jira/browse/KAFKA-13421?jql=project%20%3D%20KAFKA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22)%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20flaky-test%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC


[2] https://kafka.apache.org/contributing -> Finding a project to work on
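
For readers who want to adapt the filter in [1], its URL-encoded query decodes to a plain JQL string. The following is only a minimal sketch: the JQL text is taken from link [1], while the helper names and the `/issues/?jql=` search path are assumptions for illustration (link [1] itself uses a `browse/KAFKA-13421?jql=` form).

```python
from urllib.parse import quote

# Host taken from link [1].
JIRA_BASE = "https://issues.apache.org/jira"


def open_tickets_jql(label: str = "flaky-test") -> str:
    """Return the JQL from link [1], with the label made configurable."""
    return (
        "project = KAFKA "
        'AND status in (Open, "In Progress", Reopened, "Patch Available") '
        "AND resolution = Unresolved "
        f"AND labels = {label} "
        "ORDER BY priority DESC, updated DESC"
    )


def search_url(jql: str) -> str:
    """Build a search URL for the given JQL.

    Assumption: the issue navigator accepts /issues/?jql=<encoded JQL>.
    Parentheses are left unescaped to match the encoding used in link [1].
    """
    return f"{JIRA_BASE}/issues/?jql={quote(jql, safe='()')}"


if __name__ == "__main__":
    print(search_url(open_tickets_jql()))
```

Passing a different label, for example `open_tickets_jql("newbie")`, gives the same query restricted to that label instead.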


Divij Vaidya



On Mon, Nov 13, 2023 at 4:24 PM Николай Ижиков <nizhi...@apache.org> wrote:

>
> > To kickstart this effort, we can publish a list of such tickets in the
> community and assign one or more committers the role of a "shepherd" for
> each ticket.
>
> Please, do it.
> We can use a specific label to effectively filter those tickets.
>
> > On Nov 13, 2023, at 15:16, Divij Vaidya <divijvaidy...@gmail.com>
> wrote:
> >
> > Thanks for bringing this up David.
> >
> > My primary concern revolves around the possibility that the currently
> > disabled tests may remain inactive indefinitely. We currently have
> > unresolved JIRA tickets for flaky tests that have been pending for an
> > extended period. I am inclined to support the idea of disabling these
> tests
> > temporarily and merging changes only when the build is successful,
> provided
> > there is a clear plan for re-enabling them in the future.
> >
> > To address this issue, I propose the following measures:
> >
> > 1\ Foster a supportive environment for new contributors within the
> > community, encouraging them to take on tickets associated with flaky
> tests.
> > This initiative would require individuals familiar with the relevant code
> > to offer guidance to those undertaking these tasks. Committers should
> > prioritize reviewing and addressing these tickets within their available
> > bandwidth. To kickstart this effort, we can publish a list of such
> tickets
> > in the community and assign one or more committers the role of a
> "shepherd"
> > for each ticket.
> >
> > 2\ Implement a policy to block minor version releases until the Release
> > Manager (RM) is satisfied that the disabled tests do not result in gaps
> in
> > our testing coverage. The RM may rely on Subject Matter Experts (SMEs) in
> > the specific code areas to provide assurance before giving the green
> light
> > for a release.
> >
> > 3\ Set a community-wide goal for 2024 to achieve a stable Continuous
> > Integration (CI) system. This goal should encompass projects such as
> > refining our test suite to eliminate flakiness and addressing
> > infrastructure issues if necessary. By publishing this goal, we create a
> > shared vision for the community in 2024, fostering alignment on our
> > objectives. This alignment will aid in prioritizing tasks for community
> > members and guide reviewers in allocating their bandwidth effectively.
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Sun, Nov 12, 2023 at 2:58 AM Justine Olshan
> <jols...@confluent.io.invalid>
> > wrote:
> >
> >> I will say that I have also seen tests that are flaky only
> >> intermittently. They may be fine for some time, and then the CI is
> >> suddenly overloaded and we see issues.
> >> I have also seen the CI struggling with running out of space recently,
> so I
> >> wonder if we can also try to improve things on that front.
> >>
> >> FWIW, I noticed, filed, or commented on several flaky test JIRAs last
> week.
> >> I'm happy to try to get to green builds, but everyone needs to be on
> board.
> >>
> >> https://issues.apache.org/jira/browse/KAFKA-15529
> >> https://issues.apache.org/jira/browse/KAFKA-14806
> >> https://issues.apache.org/jira/browse/KAFKA-14249
> >> https://issues.apache.org/jira/browse/KAFKA-15798
> >> https://issues.apache.org/jira/browse/KAFKA-15797
> >> https://issues.apache.org/jira/browse/KAFKA-15690
> >> https://issues.apache.org/jira/browse/KAFKA-15699
> >> https://issues.apache.org/jira/browse/KAFKA-15772
> >> https://issues.apache.org/jira/browse/KAFKA-15759
> >> https://issues.apache.org/jira/browse/KAFKA-15760
> >> https://issues.apache.org/jira/browse/KAFKA-15700
> >>
> >> I've also seen that the KRaft transaction tests are often flaky: the
> >> producer ID is not allocated and the test times out.
> >> I can file a JIRA for that too.
> >>
> >> Hopefully this is a place we can start from.
> >>
> >> Justine
> >>
> >> On Sat, Nov 11, 2023 at 11:35 AM Ismael Juma <m...@ismaeljuma.com> wrote:
> >>
> >>> On Sat, Nov 11, 2023 at 10:32 AM John Roesler <vvcep...@apache.org>
> >> wrote:
> >>>
> >>>> In other words, I’m biased to think that new flakiness indicates
> >>>> non-deterministic bugs more often than it indicates a bad test.
> >>>>
> >>>
> >>> My experience is exactly the opposite. As someone who has tracked many
> of
> >>> the flaky fixes, the vast majority of the time they are an issue with
> the
> >>> test.
> >>>
> >>> Ismael
> >>>
> >>
>
>
