Hi Martijn, Zakelly, and everyone,

+1 to adding AGENTS.md. It's a great first step, and several other Apache projects already follow the same approach.
I saw this thread and thought I'd chime in because I'm working on a draft KIP (Kafka Improvement Proposal) on this exact topic right now.

To Zakelly's point about AI falling short on architecture: AGENTS.md is a great guide, but it's ultimately a "soft control." In my experience, LLMs probabilistically ignore markdown instructions when their context windows fill up or prompts drift.

To really stop the review fatigue, my KIP draft proposes adding a deterministic "hard control" hooked directly into the build system. It uses local AST parsing to automatically block PRs that are mostly empty scaffolding/docstrings (low logic density) or that violate core architectural patterns. It catches the "AI slop" before a human ever has to look at it.

If the community is interested, I'd be happy to share my draft KIP. It might be a helpful reference if we want to explore a similar Maven-based gate for Flink.

Regards,
Vaquar Khan

On Thu, Mar 12, 2026 at 9:57 PM Zakelly Lan <[email protected]> wrote:

> Hi, Martijn,
>
> Thanks for bringing this up. I'd +1 on this proposal.
>
> In the guidelines, I'd like to emphasize that contributors and reviewers
> should pay particular attention to architecture, performance, and code
> reusability. Based on my experience working with AI, code agents often fall
> short in these.
>
> And furthermore, I suggest we introduce mechanisms to ensure a smooth
> review process for AI-generated code, such as adding GitHub labels and a
> special reminder for reviewers from Flink's GitHub bot.
>
>
> Best,
> Zakelly
>
>
> On Fri, Mar 13, 2026 at 10:09 AM Rion Williams <[email protected]>
> wrote:
>
> > Hi Martijn,
> >
> > I think this is a great idea and definitely an effort worth pursuing;
> > it's actually something I've been considering experimenting with myself.
> > A clear +1 from me, and I'd be happy to help as the effort develops.
> >
> > On the reviewer side, we already have a pretty solid set of guardrails
> > and review processes in place, which is great.
> > That said, it's still easy to become inundated by a large, random PR with
> > little or no context (sometimes clearly AI-driven). Establishing some
> > guidelines specifically around AI usage, both for providing development
> > context and for helping with the review/audit process, would be fantastic,
> > even if we start small and gradually evolve things over time.
> >
> > Thanks for kicking this off. Looking forward to hearing what others think.
> >
> > Cheers,
> >
> > Rion
> >
> > > On Mar 12, 2026, at 8:50 PM, Leonard Xu <[email protected]> wrote:
> > >
> > > Hi Martijn,
> > >
> > > Thanks for kicking off this discussion. I've been thinking along similar
> > > lines recently, so you have a +1 from me on this proposal.
> > >
> > > I also have a suggestion regarding activity on the users' mailing list.
> > > Could we consider introducing an AI agent to help answer users' questions?
> > > I've noticed that many inquiries on user@flink currently go unanswered,
> > > yet most of them could be effectively addressed by an agent.
> > >
> > >
> > > Best,
> > > Leonard
> > >
> > >> On Mar 13, 2026, at 05:03, Martijn Visser <[email protected]> wrote:
> > >>
> > >> Hi all,
> > >>
> > >> I'd like to start a discussion about how the Flink community should
> > >> handle AI-assisted contributions and how we can make the Flink codebase
> > >> more accessible to AI tooling.
> > >>
> > >> The ASF has published guidance on generative AI tooling [1], and several
> > >> Apache projects have already adopted project-specific guidelines on top
> > >> of that. I think Flink should too.
> > >>
> > >> The most comprehensive example I've seen is Apache Airflow. They've
> > >> added an AGENTS.md [2] with instructions for AI coding agents, including
> > >> PR templates with an AI disclosure checkbox, a self-review checklist, and
> > >> the Generated-by: commit message token that the ASF guidance recommends.
> > >> Apache Iceberg recently adopted AI contribution guidelines [3] focused on
> > >> contributor accountability: you must be able to debug, explain, and own
> > >> the changes. Other projects like Paimon [4], Mahout [5], and Ozone [6]
> > >> have adopted similar policies.
> > >>
> > >> I'd like to propose the following for Flink:
> > >>
> > >> 1. Adopt contribution guidelines for AI-assisted PRs. Contributors must
> > >> disclose when AI tooling was used (using Generated-by: <Tool Name and
> > >> Version> in the commit message), and must be able to explain and take
> > >> ownership of all changes. AI-generated code is held to the same review
> > >> standards as human-written code.
> > >> 2. Add AGENTS.md files to the Flink repository. AGENTS.md [7] is a
> > >> convention for giving AI coding agents project-specific context. It can
> > >> contain information like build instructions, test commands, coding
> > >> conventions, commit message format. I think we should add one at the
> > >> root of apache/flink.
> > >> 3. Add module-level context for AI tooling. This is where I think we can
> > >> take a step forward. Each Flink module (e.g. flink-streaming-java,
> > >> flink-table-planner, flink-clients) would benefit from its own AGENTS.md
> > >> explaining the module's role, key abstractions, testing patterns, and
> > >> common pitfalls. This also serves as architectural documentation that
> > >> helps human contributors.
> > >>
> > >> I'm looking forward to hearing what others think about this.
> > >>
> > >> Best regards,
> > >>
> > >> Martijn
> > >>
> > >> [1] https://www.apache.org/legal/generative-tooling.html
> > >> [2] https://github.com/apache/airflow/blob/main/AGENTS.md
> > >> [3] https://iceberg.apache.org/contribute/#guidelines-for-ai-assisted-contributions
> > >> [4] https://github.com/apache/paimon/blob/master/.github/PULL_REQUEST_TEMPLATE.md?plain=1#L22
> > >> [5] https://github.com/apache/mahout/blob/main/docs/community/pr-policy-and-review-guidelines.md
> > >> [6] https://github.com/apache/ozone-site/blob/master/src/pages/release-notes/2.0.0.md?plain=1#L408
> > >> [7] https://agents.md/
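
For reference, the disclosure rule in item 1 of the quoted proposal (a Generated-by: <Tool Name and Version> line in the commit message) is simple enough to check mechanically. Below is a minimal sketch of such a check, not anything the thread or the ASF guidance prescribes: the helper name generated_by_tools, the tool name "ExampleTool 1.0", and the assumption that the token sits on its own line are all hypothetical.

```python
import re
from typing import List

# Assumption: the token appears at the start of its own line, followed by
# a non-empty tool description. The thread does not define a stricter grammar.
GENERATED_BY = re.compile(r"^Generated-by:[ \t]*(\S.*)$", re.MULTILINE)


def generated_by_tools(commit_message: str) -> List[str]:
    """Return the tool descriptions declared via Generated-by lines.

    Hypothetical helper for illustration; an empty list means the commit
    message carries no AI disclosure.
    """
    return [m.group(1).strip() for m in GENERATED_BY.finditer(commit_message)]


if __name__ == "__main__":
    message = "Fix typo in docs\n\nGenerated-by: ExampleTool 1.0\n"
    print(generated_by_tools(message))
```

A check like this can only confirm that a disclosure is present and non-empty. It says nothing about whether the contributor can explain or own the change, which the proposal leaves to human review.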
