Re: [DISCUSS] AI-Assisted Contributions and AI Tooling Support in Apache Flink

vaquar khan Fri, 13 Mar 2026 22:13:30 -0700

Hi Martijn, Zakelly, and everyone,

+1 to adding AGENTS.md. It's a great first step, and several other Apache
projects already follow the same approach.


I saw this thread and thought I'd chime in because I'm actually working on
a draft KIP on this exact topic right now.

To Zakelly's point about AI falling short on architecture: AGENTS.md is a
great guide, but it’s ultimately a "soft control." In my experience, LLMs
tend to ignore Markdown instructions once their context windows fill up or
the prompts drift.

To really stop the review fatigue, my KIP draft proposes adding a
deterministic "hard control" hooked directly into the build system. It uses
local AST parsing to automatically block PRs that are mostly empty
scaffolding/docstrings (low logic density) or violate core architectural
patterns. It catches the "AI slop" before a human ever has to look at it.

If the community is interested, I’d be happy to share my draft KIP. It
might be a helpful reference if we want to explore a similar Maven-based
gate for Flink.

Regards,

Vaquar Khan

On Thu, Mar 12, 2026 at 9:57 PM Zakelly Lan <[email protected]> wrote:

> Hi Martijn,
>
> Thanks for bringing this up. I'd +1 on this proposal.
>
> In the guidelines, I'd like to emphasize that contributors and reviewers
> should pay particular attention to architecture, performance, and code
> reusability. Based on my experience working with AI, code agents often fall
> short in these.
>
> Furthermore, I suggest we introduce mechanisms to ensure a smooth
> review process for AI-generated code, such as adding GitHub labels and a
> special reminder for reviewers from Flink's GitHub bot.
>
>
> Best,
> Zakelly
>
>
> On Fri, Mar 13, 2026 at 10:09 AM Rion Williams <[email protected]>
> wrote:
>
> > Hi Martijn,
> >
> > I think this is a great idea and definitely an effort worth pursuing.
> > It’s actually something I’ve been considering experimenting with myself.
> > A clear +1 from me, and I’d be happy to help as the effort develops.
> >
> > On the reviewer side, we already have a pretty solid set of guardrails
> > and review processes in place, which is great. That said, it’s still easy
> > to become inundated by a large, random PR with little or no context
> > (sometimes clearly AI-driven). Establishing some guidelines specifically
> > around AI usage, both for providing development context and for helping
> > with the review/audit process, would be fantastic, even if we start small
> > and gradually evolve things over time.
> >
> > Thanks for kicking this off. Looking forward to hearing what others
> > think.
> >
> > Cheers,
> >
> > Rion
> >
> >
> > > On Mar 12, 2026, at 8:50 PM, Leonard Xu <[email protected]> wrote:
> > >
> > > Hi Martijn,
> > >
> > > Thanks for kicking off this discussion. I've been thinking along
> > > similar lines recently, so you have a +1 from me on this proposal.
> > >
> > > I also have a suggestion regarding activity on the users' mailing list.
> > > Could we consider introducing an AI agent to help answer users'
> > > questions? I've noticed that many inquiries on user@flink currently go
> > > unanswered, yet most of them could be effectively addressed by an agent.
> > >
> > >
> > > Best,
> > > Leonard
> > >
> > >> On Mar 13, 2026, at 05:03, Martijn Visser <[email protected]> wrote:
> > >>
> > >> Hi all,
> > >>
> > >> I'd like to start a discussion about how the Flink community should
> > >> handle AI-assisted contributions and how we can make the Flink codebase
> > >> more accessible to AI tooling.
> > >>
> > >> The ASF has published guidance on generative AI tooling [1], and
> > >> several Apache projects have already adopted project-specific
> > >> guidelines on top of that. I think Flink should too.
> > >>
> > >> The most comprehensive example I've seen is Apache Airflow. They've
> > >> added an AGENTS.md [2] with instructions for AI coding agents, including
> > >> PR templates with an AI disclosure checkbox, a self-review checklist,
> > >> and the Generated-by: commit message token that the ASF guidance
> > >> recommends. Apache Iceberg recently adopted AI contribution guidelines
> > >> [3] focused on contributor accountability: you must be able to debug,
> > >> explain, and own the changes. Other projects like Paimon [4], Mahout
> > >> [5], and Ozone [6] have adopted similar policies.
> > >>
> > >> I'd like to propose the following for Flink:
> > >>
> > >> 1. Adopt contribution guidelines for AI-assisted PRs. Contributors must
> > >> disclose when AI tooling was used (using Generated-by: <Tool Name and
> > >> Version> in the commit message), and must be able to explain and take
> > >> ownership of all changes. AI-generated code is held to the same review
> > >> standards as human-written code.
> > >> 2. Add AGENTS.md files to the Flink repository. AGENTS.md [7] is a
> > >> convention for giving AI coding agents project-specific context. It can
> > >> contain information like build instructions, test commands, coding
> > >> conventions, and commit message format. I think we should add one at
> > >> the root of apache/flink.
> > >> 3. Add module-level context for AI tooling. This is where I think we can
> > >> take a step forward. Each Flink module (e.g. flink-streaming-java,
> > >> flink-table-planner, flink-clients) would benefit from its own AGENTS.md
> > >> explaining the module's role, key abstractions, testing patterns, and
> > >> common pitfalls. This also serves as architectural documentation that
> > >> helps human contributors.
> > >>
> > >> I'm looking forward to hearing what others think about this.
> > >>
> > >> Best regards,
> > >>
> > >> Martijn
> > >>
> > >> [1] https://www.apache.org/legal/generative-tooling.html
> > >> [2] https://github.com/apache/airflow/blob/main/AGENTS.md
> > >> [3] https://iceberg.apache.org/contribute/#guidelines-for-ai-assisted-contributions
> > >> [4] https://github.com/apache/paimon/blob/master/.github/PULL_REQUEST_TEMPLATE.md?plain=1#L22
> > >> [5] https://github.com/apache/mahout/blob/main/docs/community/pr-policy-and-review-guidelines.md
> > >> [6] https://github.com/apache/ozone-site/blob/master/src/pages/release-notes/2.0.0.md?plain=1#L408
> > >> [7] https://agents.md/
> > >
> >
>
