Hi all,

Thanks for all the feedback and support. I've opened a draft PR [1] that covers points 1 and 2 from the original proposal.

What's in the PR:

1. The PR includes an AGENTS.md at the repository root with prerequisites, build/test commands, repository structure, architecture boundaries, common change patterns, coding standards, testing standards, commit conventions, and boundaries. It also updates the PR template with a dedicated AI disclosure section (checkbox + Generated-by tag).
2. Module-level AGENTS.md files (point 3) are not (yet) included and can be added incrementally by module maintainers.

I used Claude to generate this PR, to show how these tools can also help us with these things.

Let me also respond to the individual points raised.

@Leonard: Interesting idea about an AI agent for the users' mailing list. I think it would also be great to integrate it into the Slack workspace itself for those who are more active there. That's a separate discussion worth having, but it is out of scope for this proposal. Would you like to start a dedicated thread for that?

@Zakelly: Good point about architecture, performance, and code reusability. The AGENTS.md includes an "Architecture Boundaries" section and a "Common Change Patterns" section that maps change types to the modules they affect, which should help steer AI agents in the right direction. Regarding GitHub labels and bot reminders for AI-generated PRs: I think that's a good idea, but it would be a separate follow-up. We should get the baseline guidelines in place first.

@Vaquar: Thanks for sharing. I think AGENTS.md and the PR template disclosure are the right starting point for Flink. Deterministic build-system gates are an interesting idea, but I'd want to see how the community's experience with AI contributions evolves before adding that level of enforcement. If you'd like to propose something concrete for Flink, a FLIP would be the right vehicle for that.

Process question: Since these are contribution guidelines rather than API or architecture changes, I think a vote on this thread would be sufficient.
But if the community feels this warrants a formal FLIP, I'm happy to go that route. What do others think?

Feedback on the PR is welcome.

Thanks,

Martijn

[1] https://github.com/apache/flink/pull/27776

On Sat, Mar 14, 2026 at 6:13 AM vaquar khan <[email protected]> wrote:

> Hi Martijn, Zakelly, and everyone,
>
> +1 to adding AGENTS.md. It's a great first step
> as all other Apache projects follow the same approach.
>
> I saw this thread and thought I'd chime in because I'm actually working on
> a draft KIP proposal on this exact topic right now.
>
> To Zakelly's point about AI falling short on architecture: AGENTS.md is a
> great guide, but it’s ultimately a "soft control." In my experience, LLMs
> probabilistically ignore markdown instructions when their context windows
> fill up or prompts drift.
>
> To really stop the review fatigue, my KIP draft proposes adding a
> deterministic "hard control" hooked directly into the build system. It uses
> local AST parsing to automatically block PRs that are mostly empty
> scaffolding/docstrings (low logic density) or violate core architectural
> patterns. It catches the "AI slop" before a human ever has to look at it.
>
> If the community is interested, I’d be happy to share my draft KIP. It
> might be a helpful reference if we want to explore a similar Maven-based
> gate for Flink.
>
> Regards,
>
> Vaquar Khan
>
> On Thu, Mar 12, 2026 at 9:57 PM Zakelly Lan <[email protected]> wrote:
>
> > Hi, Martijn,
> >
> > Thanks for bringing this up. I'd +1 on this proposal.
> >
> > In the guidelines, I'd like to emphasize that contributors and reviewers
> > should pay particular attention to architecture, performance, and code
> > reusability. Based on my experience working with AI, code agents often
> > fall short in these.
> >
> > And furthermore, I suggest we introduce mechanisms to ensure a smooth
> > review process for AI-generated code, such as adding GitHub labels and a
> > special reminder for reviewers from Flink's GitHub bot.
> >
> >
> > Best,
> > Zakelly
> >
> >
> > On Fri, Mar 13, 2026 at 10:09 AM Rion Williams <[email protected]> wrote:
> >
> > > Hi Martijn,
> > >
> > > I think this is a great idea and definitely an effort worth pursuing;
> > > it’s actually something I’ve been considering experimenting with myself.
> > > A clear +1 from me, and I’d be happy to help as the effort develops.
> > >
> > > On the reviewer side, we already have a pretty solid set of guardrails
> > > and review processes in place, which is great. That said, it’s still
> > > easy to become inundated by a large, random PR with little or no context
> > > (sometimes clearly AI-driven). Establishing some guidelines specifically
> > > around AI usage (both for providing development context and for helping
> > > with the review/audit process) would be fantastic, even if we start
> > > small and gradually evolve things over time.
> > >
> > > Thanks for kicking this off. Looking forward to hearing what others
> > > think.
> > >
> > > Cheers,
> > >
> > > Rion
> > >
> > >
> > > > On Mar 12, 2026, at 8:50 PM, Leonard Xu <[email protected]> wrote:
> > > >
> > > > Hi Martijn,
> > > >
> > > > Thanks for kicking off this discussion. I've been thinking along
> > > > similar lines recently, so you have a +1 from me on this proposal.
> > > >
> > > > I also have a suggestion regarding activity on the users' mailing
> > > > list. Could we consider introducing an AI agent to help answer users'
> > > > questions? I've noticed that many inquiries on user@flink currently go
> > > > unanswered, yet most of them could be effectively addressed by an
> > > > agent.
> > > >
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > >> On Mar 13, 2026, at 05:03, Martijn Visser <[email protected]> wrote:
> > > >>
> > > >> Hi all,
> > > >>
> > > >> I'd like to start a discussion about how the Flink community should
> > > >> handle AI-assisted contributions and how we can make the Flink
> > > >> codebase more accessible to AI tooling.
> > > >>
> > > >> The ASF has published guidance on generative AI tooling [1], and
> > > >> several Apache projects have already adopted project-specific
> > > >> guidelines on top of that. I think Flink should too.
> > > >>
> > > >> The most comprehensive example I've seen is Apache Airflow. They've
> > > >> added an AGENTS.md [2] with instructions for AI coding agents,
> > > >> including PR templates with an AI disclosure checkbox, a self-review
> > > >> checklist, and the Generated-by: commit message token that the ASF
> > > >> guidance recommends. Apache Iceberg recently adopted AI contribution
> > > >> guidelines [3] focused on contributor accountability: you must be
> > > >> able to debug, explain, and own the changes. Other projects like
> > > >> Paimon [4], Mahout [5], and Ozone [6] have adopted similar policies.
> > > >>
> > > >> I'd like to propose the following for Flink:
> > > >>
> > > >> 1. Adopt contribution guidelines for AI-assisted PRs. Contributors
> > > >> must disclose when AI tooling was used (using Generated-by: <Tool
> > > >> Name and Version> in the commit message), and must be able to explain
> > > >> and take ownership of all changes. AI-generated code is held to the
> > > >> same review standards as human-written code.
> > > >> 2. Add AGENTS.md files to the Flink repository. AGENTS.md [7] is a
> > > >> convention for giving AI coding agents project-specific context. It
> > > >> can contain information like build instructions, test commands,
> > > >> coding conventions, commit message format. I think we should add one
> > > >> at the root of apache/flink.
> > > >> 3. Add module-level context for AI tooling. This is where I think we
> > > >> can take a step forward. Each Flink module (e.g. flink-streaming-java,
> > > >> flink-table-planner, flink-clients) would benefit from its own
> > > >> AGENTS.md explaining the module's role, key abstractions, testing
> > > >> patterns, and common pitfalls. This also serves as architectural
> > > >> documentation that helps human contributors.
> > > >>
> > > >> I'm looking forward to hearing what others think about this.
> > > >>
> > > >> Best regards,
> > > >>
> > > >> Martijn
> > > >>
> > > >> [1] https://www.apache.org/legal/generative-tooling.html
> > > >> [2] https://github.com/apache/airflow/blob/main/AGENTS.md
> > > >> [3] https://iceberg.apache.org/contribute/#guidelines-for-ai-assisted-contributions
> > > >> [4] https://github.com/apache/paimon/blob/master/.github/PULL_REQUEST_TEMPLATE.md?plain=1#L22
> > > >> [5] https://github.com/apache/mahout/blob/main/docs/community/pr-policy-and-review-guidelines.md
> > > >> [6] https://github.com/apache/ozone-site/blob/master/src/pages/release-notes/2.0.0.md?plain=1#L408
> > > >> [7] https://agents.md/
