Hi all,

I've opened https://issues.apache.org/jira/browse/FLINK-39477 given that there's consensus on getting this in. Thank you all for your feedback!

Best regards,

Martijn

On Tue, Mar 24, 2026 at 5:52 AM Samrat Deb <[email protected]> wrote:

> Hi Martijn,
>
> +1 for the initiative.
>
> I really liked the Iceberg-style guidelines [1]. AI-generated code must face the same strict review standards as human code. The author must take full ownership, explain the "why" behind the logic, and be able to debug it.
>
> One word of caution regarding Leonard's idea of a support agent for the user@flink list or Slack: let's tread very carefully here. The blast radius of a hallucinated configuration (for example, mixing up RocksDBStateBackend and HashMapStateBackend tuning) during a user's production crisis is massive and could lead to data loss.
> If we do build a support bot, it must be strictly constrained by our official docs, perhaps RAG-based initially and evolving from there, and it must carry the right disclaimer.
>
> Bests,
> Samrat
>
> [1] https://iceberg.apache.org/contribute/#how-are-proposals-adopted
>
> On Mon, Mar 23, 2026 at 7:59 PM Ramin Gharib <[email protected]> wrote:
>
> > Hi Martijn,
> >
> > +1 from me.
> >
> > Thanks for bringing this up. It makes total sense to get ahead of this and set some clear guardrails as these tools become more popular.
> >
> > I really like the AGENTS.md approach. Explicitly laying out module-level context will definitely help reduce the noise from AI-generated PRs.
> >
> > Happy to see this move forward!
> >
> > Cheers,
> >
> > Ramin
> >
> > On Mon, Mar 23, 2026 at 2:59 PM Gustavo de Morais <[email protected]> wrote:
> >
> > > Hi Martijn,
> > >
> > > Thanks for driving this and I'm +1 for the initiative so we share knowledge across the community. I'm also +1 to starting with only the root AGENTS.md. Correct and thoroughly reviewed AGENTS.md files should be a follow-up for each module.
> > > In my experience, a short, correct context file is better than longer, incorrect or outdated files, which create a bad experience when using agents.
> > >
> > > I've reviewed the PR for the things I'm aware of. It'd be nice to have other eyes from people with different expertise.
> > >
> > > Kind regards,
> > >
> > > Gustavo
> > >
> > > On Mon, 23 Mar 2026 at 12:58, Martijn Visser <[email protected]> wrote:
> > >
> > > > If there are no more comments, I'll start a vote later this week.
> > > >
> > > > On Mon, Mar 16, 2026 at 1:22 PM Martijn Visser <[email protected]> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Thanks for all the feedback and support. I've opened a draft PR [1] that covers points 1 and 2 from the original proposal.
> > > > >
> > > > > What's in the PR:
> > > > >
> > > > > 1. The PR includes an AGENTS.md at the repository root with prerequisites, build/test commands, repository structure, architecture boundaries, common change patterns, coding standards, testing standards, commit conventions, and boundaries. It also updates the PR template with a dedicated AI disclosure section (checkbox + Generated-by tag).
> > > > > 2. Module-level AGENTS.md files (point 3) are not (yet) included and can be added incrementally by module maintainers.
> > > > >
> > > > > I've used Claude to generate this PR, to show how these tools can also help us with these things.
> > > > >
> > > > > Let me also respond to the individual points raised.
> > > > >
> > > > > @Leonard: Interesting idea about an AI agent for the users' mailing list, but I'd think it would also be great if we could integrate it in the Slack workspace itself for those that are more active there.
> > > > > I think that's a separate discussion worth having, but out of scope for this proposal. Would you like to start a dedicated thread for that?
> > > > >
> > > > > @Zakelly: Good point about architecture, performance, and code reusability. The AGENTS.md includes an "Architecture Boundaries" section and a "Common Change Patterns" section that maps change types to the modules they affect, which should help steer AI agents in the right direction. Regarding GitHub labels and bot reminders for AI-generated PRs: I think that's a good idea but would be a separate follow-up. I think we should get the baseline guidelines in place first.
> > > > >
> > > > > @Vaquar: Thanks for sharing. I think AGENTS.md and the PR template disclosure are the right starting point for Flink. Deterministic build-system gates are an interesting idea, but I'd want to see how the community's experience with AI contributions evolves before adding that level of enforcement. If you'd like to propose something concrete for Flink, a FLIP would be the right vehicle for that.
> > > > >
> > > > > Process question:
> > > > >
> > > > > Since these are contribution guidelines rather than API or architecture changes, I think a vote on this thread would be sufficient. But if the community feels this warrants a formal FLIP, I'm happy to go that route. What do others think?
> > > > >
> > > > > Feedback on the PR is welcome.
> > > > >
> > > > > Thanks, Martijn
> > > > >
> > > > > [1] https://github.com/apache/flink/pull/27776
> > > > >
> > > > > On Sat, Mar 14, 2026 at 6:13 AM vaquar khan <[email protected]> wrote:
> > > > >
> > > > >> Hi Martijn, Zakelly, and everyone,
> > > > >>
> > > > >> +1 to adding AGENTS.md.
> > > > >> It's a great first step, as all other Apache projects follow the same approach.
> > > > >>
> > > > >> I saw this thread and thought I'd chime in because I'm actually working on a draft KIP proposal on this exact topic right now.
> > > > >>
> > > > >> To Zakelly's point about AI falling short on architecture: AGENTS.md is a great guide, but it’s ultimately a "soft control." In my experience, LLMs probabilistically ignore markdown instructions when their context windows fill up or prompts drift.
> > > > >>
> > > > >> To really stop the review fatigue, my KIP draft proposes adding a deterministic "hard control" hooked directly into the build system. It uses local AST parsing to automatically block PRs that are mostly empty scaffolding/docstrings (low logic density) or violate core architectural patterns. It catches the "AI slop" before a human ever has to look at it.
> > > > >>
> > > > >> If the community is interested, I’d be happy to share my draft KIP. It might be a helpful reference if we want to explore a similar Maven-based gate for Flink.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Vaquar Khan
> > > > >>
> > > > >> On Thu, Mar 12, 2026 at 9:57 PM Zakelly Lan <[email protected]> wrote:
> > > > >>
> > > > >> > Hi Martijn,
> > > > >> >
> > > > >> > Thanks for bringing this up. I'd +1 on this proposal.
> > > > >> >
> > > > >> > In the guidelines, I'd like to emphasize that contributors and reviewers should pay particular attention to architecture, performance, and code reusability. Based on my experience working with AI, code agents often fall short in these.
> > > > >> >
> > > > >> > Furthermore, I suggest we introduce mechanisms to ensure a smooth review process for AI-generated code, such as adding GitHub labels and a special reminder for reviewers from Flink's GitHub bot.
> > > > >> >
> > > > >> > Best,
> > > > >> > Zakelly
> > > > >> >
> > > > >> > On Fri, Mar 13, 2026 at 10:09 AM Rion Williams <[email protected]> wrote:
> > > > >> >
> > > > >> > > Hi Martijn,
> > > > >> > >
> > > > >> > > I think this is a great idea and definitely an effort worth pursuing; it’s actually something I’ve been considering experimenting with myself. A clear +1 from me, and I’d be happy to help as the effort develops.
> > > > >> > >
> > > > >> > > On the reviewer side, we already have a pretty solid set of guardrails and review processes in place, which is great. That said, it’s still easy to become inundated by a large, random PR with little or no context (sometimes clearly AI-driven). Establishing some guidelines specifically around AI usage (both for providing development context and for helping with the review/audit process) would be fantastic, even if we start small and gradually evolve things over time.
> > > > >> > >
> > > > >> > > Thanks for kicking this off. Looking forward to hearing what others think.
> > > > >> > >
> > > > >> > > Cheers,
> > > > >> > >
> > > > >> > > Rion
> > > > >> > >
> > > > >> > > > On Mar 12, 2026, at 8:50 PM, Leonard Xu <[email protected]> wrote:
> > > > >> > > >
> > > > >> > > > Hi Martijn,
> > > > >> > > >
> > > > >> > > > Thanks for kicking off this discussion.
> > > > >> > > > I've been thinking along similar lines recently, so you have a +1 from me on this proposal.
> > > > >> > > >
> > > > >> > > > I also have a suggestion regarding activity on the users' mailing list. Could we consider introducing an AI agent to help answer users' questions? I've noticed that many inquiries on user@flink currently go unanswered, yet most of them could be effectively addressed by an agent.
> > > > >> > > >
> > > > >> > > > Best,
> > > > >> > > > Leonard
> > > > >> > > >
> > > > >> > > >> On Mar 13, 2026, at 05:03, Martijn Visser <[email protected]> wrote:
> > > > >> > > >>
> > > > >> > > >> Hi all,
> > > > >> > > >>
> > > > >> > > >> I'd like to start a discussion about how the Flink community should handle AI-assisted contributions and how we can make the Flink codebase more accessible to AI tooling.
> > > > >> > > >>
> > > > >> > > >> The ASF has published guidance on generative AI tooling [1], and several Apache projects have already adopted project-specific guidelines on top of that. I think Flink should too.
> > > > >> > > >>
> > > > >> > > >> The most comprehensive example I've seen is Apache Airflow. They've added an AGENTS.md [2] with instructions for AI coding agents, including PR templates with an AI disclosure checkbox, a self-review checklist, and the Generated-by: commit message token that the ASF guidance recommends.
> > > > >> > > >> Apache Iceberg recently adopted AI contribution guidelines [3] focused on contributor accountability: you must be able to debug, explain, and own the changes. Other projects like Paimon [4], Mahout [5], and Ozone [6] have adopted similar policies.
> > > > >> > > >>
> > > > >> > > >> I'd like to propose the following for Flink:
> > > > >> > > >>
> > > > >> > > >> 1. Adopt contribution guidelines for AI-assisted PRs. Contributors must disclose when AI tooling was used (using Generated-by: <Tool Name and Version> in the commit message), and must be able to explain and take ownership of all changes. AI-generated code is held to the same review standards as human-written code.
> > > > >> > > >> 2. Add AGENTS.md files to the Flink repository. AGENTS.md [7] is a convention for giving AI coding agents project-specific context. It can contain information like build instructions, test commands, coding conventions, commit message format. I think we should add one at the root of apache/flink.
> > > > >> > > >> 3. Add module-level context for AI tooling. This is where I think we can take a step forward. Each Flink module (e.g. flink-streaming-java, flink-table-planner, flink-clients) would benefit from its own AGENTS.md explaining the module's role, key abstractions, testing patterns, and common pitfalls.
> > > > >> > > >> This also serves as architectural documentation that helps human contributors.
> > > > >> > > >>
> > > > >> > > >> I'm looking forward to hearing what others think about this.
> > > > >> > > >>
> > > > >> > > >> Best regards,
> > > > >> > > >>
> > > > >> > > >> Martijn
> > > > >> > > >>
> > > > >> > > >> [1] https://www.apache.org/legal/generative-tooling.html
> > > > >> > > >> [2] https://github.com/apache/airflow/blob/main/AGENTS.md
> > > > >> > > >> [3] https://iceberg.apache.org/contribute/#guidelines-for-ai-assisted-contributions
> > > > >> > > >> [4] https://github.com/apache/paimon/blob/master/.github/PULL_REQUEST_TEMPLATE.md?plain=1#L22
> > > > >> > > >> [5] https://github.com/apache/mahout/blob/main/docs/community/pr-policy-and-review-guidelines.md
> > > > >> > > >> [6] https://github.com/apache/ozone-site/blob/master/src/pages/release-notes/2.0.0.md?plain=1#L408
> > > > >> > > >> [7] https://agents.md/
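
To illustrate the disclosure convention proposed in the thread, the sketch below checks a commit message for a "Generated-by: <Tool Name and Version>" line. It is a minimal, hypothetical example and not part of the emails above: the function name find_generated_by, the regular expression, and the sample commit message (including the tool name and issue number) are assumptions made for illustration, and none of them is taken from the Flink PR or the ASF guidance.

```python
import re
from typing import Optional

# Assumed format: a line starting with "Generated-by:" followed by a
# non-empty tool description. The exact matching rules are hypothetical.
_GENERATED_BY = re.compile(
    r"^Generated-by:[ \t]*(\S.*)$", re.MULTILINE | re.IGNORECASE
)


def find_generated_by(commit_message: str) -> Optional[str]:
    """Return the tool string from the last Generated-by line, or None."""
    matches = _GENERATED_BY.findall(commit_message)
    return matches[-1].strip() if matches else None


if __name__ == "__main__":
    # Made-up commit message; the issue number and tool name are placeholders.
    sample = (
        "[FLINK-00000][docs] Add root AGENTS.md\n"
        "\n"
        "Adds repository-level context for AI coding agents.\n"
        "\n"
        "Generated-by: ExampleTool 1.0\n"
    )
    tool = find_generated_by(sample)
    print("disclosed tool:", tool if tool is not None else "(none)")
```

A check like this only detects whether a disclosure line is present. It says nothing about whether the contributor can explain and own the change, which the thread treats as the more important requirement.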
