If there are no more comments, I'll start a vote later this week.

On Mon, Mar 16, 2026 at 1:22 PM Martijn Visser <[email protected]> wrote:
> Hi all,
>
> Thanks for all the feedback and support. I've opened a draft PR [1] that
> covers points 1 and 2 from the original proposal.
>
> What's in the PR:
>
> 1. The PR includes an AGENTS.md at the repository root with prerequisites,
> build/test commands, repository structure, architecture boundaries, common
> change patterns, coding standards, testing standards, commit conventions,
> and boundaries. It also updates the PR template with a dedicated AI
> disclosure section (checkbox + Generated-by tag).
> 2. Module-level AGENTS.md files (point 3) are not (yet) included and can
> be added incrementally by module maintainers.
>
> I've used Claude to generate this PR, to show how these tools can also
> help us with these things.
>
> Let me also respond to the individual points raised.
>
> @Leonard: Interesting idea about an AI agent for the users' mailing list,
> but I'd think it would also be great if we could integrate it in the Slack
> workspace itself for those that are more active there. I think that's a
> separate discussion worth having, but out of scope for this proposal. Would
> you like to start a dedicated thread for that?
>
> @Zakelly: Good point about architecture, performance, and code
> reusability. The AGENTS.md includes an "Architecture Boundaries" section
> and a "Common Change Patterns" section that maps change types to the
> modules they affect, which should help steer AI agents in the right
> direction. Regarding GitHub labels and bot reminders for AI-generated PRs:
> I think that's a good idea but would be a separate follow-up. I think we
> should get the baseline guidelines in place first.
>
> @Vaquar: Thanks for sharing. I think AGENTS.md and the PR template
> disclosure are the right starting point for Flink. Deterministic
> build-system gates are an interesting idea, but I'd want to see how the
> community's experience with AI contributions evolves before adding that
> level of enforcement.
> If you'd like to propose something concrete for Flink, a FLIP would be
> the right vehicle for that.
>
> Process question:
>
> Since these are contribution guidelines rather than API or architecture
> changes, I think a vote on this thread would be sufficient. But if the
> community feels this warrants a formal FLIP, I'm happy to go that route.
> What do others think?
>
> Feedback on the PR is welcome.
>
> Thanks, Martijn
>
> [1] https://github.com/apache/flink/pull/27776
>
> On Sat, Mar 14, 2026 at 6:13 AM vaquar khan <[email protected]> wrote:
>
>> Hi Martijn, Zakelly, and everyone,
>>
>> +1 to adding AGENTS.md. It's a great first step, since other Apache
>> projects already follow the same approach.
>>
>> I saw this thread and thought I'd chime in because I'm actually working
>> on a draft KIP proposal on this exact topic right now.
>>
>> To Zakelly's point about AI falling short on architecture: AGENTS.md is a
>> great guide, but it’s ultimately a "soft control." In my experience, LLMs
>> probabilistically ignore Markdown instructions when their context windows
>> fill up or prompts drift.
>>
>> To really stop the review fatigue, my KIP draft proposes adding a
>> deterministic "hard control" hooked directly into the build system. It
>> uses local AST parsing to automatically block PRs that are mostly empty
>> scaffolding/docstrings (low logic density) or violate core architectural
>> patterns. It catches the "AI slop" before a human ever has to look at it.
>>
>> If the community is interested, I’d be happy to share my draft KIP. It
>> might be a helpful reference if we want to explore a similar Maven-based
>> gate for Flink.
>>
>> Regards,
>>
>> Vaquar Khan
>>
>> On Thu, Mar 12, 2026 at 9:57 PM Zakelly Lan <[email protected]> wrote:
>>
>> > Hi Martijn,
>> >
>> > Thanks for bringing this up. I'd +1 on this proposal.
>> >
>> > In the guidelines, I'd like to emphasize that contributors and
>> > reviewers should pay particular attention to architecture, performance,
>> > and code reusability. Based on my experience working with AI, code
>> > agents often fall short in these areas.
>> >
>> > Furthermore, I suggest we introduce mechanisms to ensure a smooth
>> > review process for AI-generated code, such as adding GitHub labels and
>> > a special reminder for reviewers from Flink's GitHub bot.
>> >
>> > Best,
>> > Zakelly
>> >
>> > On Fri, Mar 13, 2026 at 10:09 AM Rion Williams <[email protected]>
>> > wrote:
>> >
>> > > Hi Martijn,
>> > >
>> > > I think this is a great idea and definitely an effort worth pursuing.
>> > > It’s actually something I’ve been considering experimenting with
>> > > myself. A clear +1 from me, and I’d be happy to help as the effort
>> > > develops.
>> > >
>> > > On the reviewer side, we already have a pretty solid set of guardrails
>> > > and review processes in place, which is great. That said, it’s still
>> > > easy to become inundated by a large, random PR with little or no
>> > > context (sometimes clearly AI-driven). Establishing some guidelines
>> > > specifically around AI usage, both for providing development context
>> > > and for helping with the review/audit process, would be fantastic,
>> > > even if we start small and gradually evolve things over time.
>> > >
>> > > Thanks for kicking this off. Looking forward to hearing what others
>> > > think.
>> > >
>> > > Cheers,
>> > >
>> > > Rion
>> > >
>> > > > On Mar 12, 2026, at 8:50 PM, Leonard Xu <[email protected]> wrote:
>> > > >
>> > > > Hi Martijn,
>> > > >
>> > > > Thanks for kicking off this discussion. I've been thinking along
>> > > > similar lines recently, so you have a +1 from me on this proposal.
>> > > >
>> > > > I also have a suggestion regarding activity on the users' mailing
>> > > > list.
>> > > > Could we consider introducing an AI agent to help answer users'
>> > > > questions? I've noticed that many inquiries on user@flink currently
>> > > > go unanswered, yet most of them could be effectively addressed by an
>> > > > agent.
>> > > >
>> > > > Best,
>> > > > Leonard
>> > > >
>> > > >> On Mar 13, 2026, at 05:03, Martijn Visser <[email protected]> wrote:
>> > > >>
>> > > >> Hi all,
>> > > >>
>> > > >> I'd like to start a discussion about how the Flink community should
>> > > >> handle AI-assisted contributions and how we can make the Flink
>> > > >> codebase more accessible to AI tooling.
>> > > >>
>> > > >> The ASF has published guidance on generative AI tooling [1], and
>> > > >> several Apache projects have already adopted project-specific
>> > > >> guidelines on top of that. I think Flink should too.
>> > > >>
>> > > >> The most comprehensive example I've seen is Apache Airflow. They've
>> > > >> added an AGENTS.md [2] with instructions for AI coding agents,
>> > > >> including PR templates with an AI disclosure checkbox, a self-review
>> > > >> checklist, and the Generated-by: commit message token that the ASF
>> > > >> guidance recommends. Apache Iceberg recently adopted AI contribution
>> > > >> guidelines [3] focused on contributor accountability: you must be
>> > > >> able to debug, explain, and own the changes. Other projects like
>> > > >> Paimon [4], Mahout [5], and Ozone [6] have adopted similar policies.
>> > > >>
>> > > >> I'd like to propose the following for Flink:
>> > > >>
>> > > >> 1. Adopt contribution guidelines for AI-assisted PRs. Contributors
>> > > >> must disclose when AI tooling was used (using Generated-by: <Tool
>> > > >> Name and Version> in the commit message), and must be able to
>> > > >> explain and take ownership of all changes. AI-generated code is held
>> > > >> to the same review standards as human-written code.
>> > > >> 2. Add AGENTS.md files to the Flink repository. AGENTS.md [7] is a
>> > > >> convention for giving AI coding agents project-specific context. It
>> > > >> can contain information like build instructions, test commands,
>> > > >> coding conventions, and commit message format. I think we should add
>> > > >> one at the root of apache/flink.
>> > > >> 3. Add module-level context for AI tooling. This is where I think we
>> > > >> can take a step forward. Each Flink module (e.g.
>> > > >> flink-streaming-java, flink-table-planner, flink-clients) would
>> > > >> benefit from its own AGENTS.md explaining the module's role, key
>> > > >> abstractions, testing patterns, and common pitfalls. This also
>> > > >> serves as architectural documentation that helps human contributors.
>> > > >>
>> > > >> I'm looking forward to hearing what others think about this.
>> > > >>
>> > > >> Best regards,
>> > > >>
>> > > >> Martijn
>> > > >>
>> > > >> [1] https://www.apache.org/legal/generative-tooling.html
>> > > >> [2] https://github.com/apache/airflow/blob/main/AGENTS.md
>> > > >> [3] https://iceberg.apache.org/contribute/#guidelines-for-ai-assisted-contributions
>> > > >> [4] https://github.com/apache/paimon/blob/master/.github/PULL_REQUEST_TEMPLATE.md?plain=1#L22
>> > > >> [5] https://github.com/apache/mahout/blob/main/docs/community/pr-policy-and-review-guidelines.md
>> > > >> [6] https://github.com/apache/ozone-site/blob/master/src/pages/release-notes/2.0.0.md?plain=1#L408
>> > > >> [7] https://agents.md/
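
As a rough illustration of the disclosure convention from point 1 of the original proposal (a "Generated-by: <Tool Name and Version>" line in the commit message), here is a minimal Python sketch that pulls such lines out of a commit message. The function name, the matching rules, and the sample commit text are assumptions made for illustration; nothing here is taken from the PR or from any existing Flink tooling.

```python
import re

# Assumption: the disclosure is a line of the form
# "Generated-by: <Tool Name and Version>" somewhere in the commit message.
# The exact matching rules (case-sensitive key, non-empty value) are illustrative.
_GENERATED_BY = re.compile(r"^Generated-by:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def generated_by_tools(commit_message: str) -> list[str]:
    """Return the tool strings from every Generated-by line in a commit message."""
    return [match.group(1) for match in _GENERATED_BY.finditer(commit_message)]


if __name__ == "__main__":
    # Hypothetical commit message; the subject and tool name are placeholders.
    example = (
        "[FLINK-XXXXX] Hypothetical commit subject\n"
        "\n"
        "Body text describing the change.\n"
        "\n"
        "Generated-by: ExampleTool 1.0\n"
    )
    print(generated_by_tools(example))
```

A check like this could only confirm that a disclosure line is present and non-empty. It says nothing about whether the contributor can explain and own the change, which the proposal leaves to review.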
