Hi Keith,

Thank you for the feedback — this is an important point that the FIP was
missing.

You're right that conditions 2.2 and 2.3 of the ASF Generative Tooling
Guidance deserve explicit attention. The original FIP referenced the ASF
guidance but didn't specifically address copyright verification.

I've updated the FIP in two places based on your suggestion:

**1. Phase 2 (Governance) — new guideline item**

Added a "Copyright and licensing awareness" bullet to the AI Contribution
Guidelines (section 2.1). This sets the expectation that contributors
obtain reasonable certainty that ASF condition 2.2 or 2.3 is met. If their
AI tool provides similarity information, or if code scanning results are
available, they should include that information in the PR description.
This is actionable today with no tooling dependency.

**2. Phase 3 (Tooling) — new section 3.4**

Added "Copyright Compliance Tooling (Exploratory)" as a new section. This
captures your idea of an automated mechanism — such as a review agent or CI
check — that scans AI-generated code for potential similarity to
copyrighted material and appends results to the PR description, usable by
both PR authors and reviewers.

I marked this as exploratory because, as you noted, experimentation would
be needed to find a suitable mechanism. The tooling landscape for this kind
of verification is still maturing — I'm not aware of a production-ready
open-source solution yet that reliably checks ASF 2.2/2.3 conditions. But I
agree this is a direction worth pursuing, and having it in the FIP roadmap
means we'll actively track developments and experiment when suitable tools
emerge.
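
As a rough illustration only, the "append results to the PR description"
step could look something like the sketch below. Everything in it is an
assumption for discussion: the `ScanFinding` fields, the HTML comment
markers, and the section wording are hypothetical, and the scanner that
would produce the findings is exactly the part that still needs
experimentation.

```python
from dataclasses import dataclass

# Hypothetical markers so a re-run replaces the old section instead of
# appending a second copy.
MARKER_START = "<!-- copyright-scan:start -->"
MARKER_END = "<!-- copyright-scan:end -->"


@dataclass
class ScanFinding:
    """One result from a (hypothetical) similarity scanner."""
    path: str
    similarity: float  # 0.0 to 1.0, as reported by the scanner
    source_hint: str   # where the similar code appears to come from


def render_section(findings):
    """Render the findings as a block delimited by the markers."""
    lines = [MARKER_START, "### Copyright similarity scan (exploratory)"]
    if not findings:
        lines.append("No similar code reported by the scanner.")
    for f in findings:
        lines.append(f"- `{f.path}`: {f.similarity:.0%} similar to {f.source_hint}")
    lines.append(MARKER_END)
    return "\n".join(lines)


def append_scan_results(description, findings):
    """Return the PR description with the scan section added or replaced."""
    section = render_section(findings)
    start = description.find(MARKER_START)
    end = description.find(MARKER_END)
    if start != -1 and end > start:
        # A previous run already added a section: replace it in place.
        return description[:start] + section + description[end + len(MARKER_END):]
    return description.rstrip() + "\n\n" + section + "\n"
```

Because the section is delimited by markers, either the PR author or a
reviewer could re-run the check without duplicating earlier output.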

If you have specific tools or approaches in mind for the experimentation,
I'd be very interested to hear more.

Best regards,
Yang Wang


Keith Lee <[email protected]> wrote on Thu, Mar 19, 2026 at 14:09:

> Hello,
>
> Thank you, Yang, for raising this FIP. It has well-laid-out suggestions for
> the usage of generative tools.
>
> Beyond usage alone, I want to suggest that we add a mechanism (preferably)
> or a guideline specifically for addressing the risk of generating
> restrictive copyrightable material, as covered in the ASF Generative
> Tooling Guidance [1]:
>
> > A contributor obtains reasonable certainty that conditions 2.2 or 2.3 are
> > met if the AI tool itself provides sufficient information about output that
> > may be similar to training data, or from code scanning results.
>
> I believe experimentation would be needed to come up with a suitable
> mechanism to perform this check, e.g. a skill that spins up a separate
> reviewer agent which specifically checks that either 2.2 or 2.3 holds. This
> skill could then be run by either the human author or a human reviewer, with
> its output appended to the PR description.
>
> Best regards
> Keith Lee
>
> [1] https://www.apache.org/legal/generative-tooling.html
>
> On Thu, 19 Mar 2026 at 04:56, Yang Wang <[email protected]> wrote:
>
> > Hi all,
> >
> > I'd like to start a discussion on FIP-34: Making Fluss an AI-Native
> > Project.
> >
> > The full proposal is available on the wiki: FIP-34: Making Fluss an
> > AI-Native Project
> > <https://cwiki.apache.org/confluence/display/FLUSS/FIP-34%3A+Making+Fluss+an+AI-Native+Project>
> >
> > TL;DR
> >
> > This FIP proposes embracing AI in two complementary dimensions:
> >
> >    - AI-Friendly Development: making it easier for AI agents to help
> >    develop Fluss
> >    - AI-Friendly Product: making Fluss itself a product that AI agents
> >    can operate natively
> >
> > The full FIP covers a phased roadmap across both dimensions. In this
> > email, I'd like to focus the discussion on the immediate actions
> > (Dimension 1, Phase 1) that we can land quickly, while gathering
> > feedback on the broader vision.
> >
> > Background
> >
> > The AI-assisted development landscape is evolving rapidly. The ASF has
> > published guidance on generative AI tooling [1], and several Apache
> > projects have already taken concrete steps: Flink [2], Iceberg [3],
> > Airflow [4], SkyWalking, Pinot, and ShardingSphere are all actively
> > adopting AI-friendly practices.
> >
> > As a young project, Fluss has the advantage of building AI-nativeness in
> > from the start. I believe this is a competitive advantage on both fronts:
> > attracting contributors who use AI tools, and attracting users who build
> > AI-powered applications.
> >
> > Immediate Actions (Dimension 1, Phase 1)
> >
> > I'd like to propose landing the following three changes once we have
> > consensus:
> >
> > 1) Add AGENTS.md
> >
> > AGENTS.md <https://agents.md/> [5] is an open standard adopted by 60k+
> > open-source projects. It provides a predictable location for AI coding
> > agents to find project-specific context — build commands, test
> > instructions, coding conventions, and architecture overview.
> >
> > I've drafted an AGENTS.md for Fluss covering prerequisites, build/test
> > commands, repository structure (17 modules), architecture boundaries,
> > coding standards, testing conventions, commit/PR conventions, and AI
> > contribution rules.
> >
> > 2) Add CLAUDE.md
> >
> > Claude Code, one of the most popular AI coding tools among open-source
> > contributors, auto-loads CLAUDE.md at session start but does not
> > auto-load AGENTS.md. We'll add CLAUDE.md as a symbolic link to AGENTS.md
> > (ln -s AGENTS.md CLAUDE.md), following the same approach used by Apache
> > Airflow and Apache Fory. This ensures Claude Code users get project
> > context out of the box with zero content duplication.
> >
> > 3) Update PR Template with AI Disclosure
> >
> > Following the ASF Generative Tooling Guidance [1], add an AI disclosure
> > section to our PR template:
> >
> >    - An AI disclosure checkbox
> >    - A Generated-by: <Tool Name and Version> tag
> >
> > This is lightweight, non-intrusive, and aligns with what Flink, Airflow,
> > and Paimon are doing.
> >
> > Broader Vision (see FIP for details)
> >
> > The FIP also outlines longer-term plans that I'd welcome early feedback
> > on:
> >
> >    - Governance (Dimension 1, Phase 2): AI contribution guidelines,
> >    self-review checklist, drawing from Iceberg's experience with AI PR
> >    quality
> >    - Tooling (Dimension 1, Phase 3): Module-level AGENTS.md files, Maven
> >    worktree isolation extension [6], development environment ecosystem
> >    - AI-Friendly Product (Dimension 2): Agent-friendly REST API with
> >    OpenAPI spec, CLI tool with JSON/table dual output, official Skills
> >    module for AI agent marketplaces
> >
> > Dimension 2 items will have their own separate FIPs when the time comes.
> >
> > Process
> >
> > For the immediate Phase 1 changes (AGENTS.md + CLAUDE.md + PR template),
> > I think a vote on this thread would be sufficient, as these are
> > contribution guidelines and developer tooling rather than API or
> > architecture changes.
> >
> > I plan to open a PR covering Phase 1 once we have consensus.
> >
> > Looking forward to your thoughts — both on the immediate actions and the
> > broader vision.
> >
> >
> > Best regards,
> > Yang Wang
> >
> >
> > [1] https://www.apache.org/legal/generative-tooling.html
> > [2] https://lists.apache.org/thread/l0n4w86v1o5cwkqpqtf2q7lb7zdyrymf
> > [3] https://lists.apache.org/thread/129f3n8zgdck3nc5hmdf63sntzcy9tjg
> > [4] https://github.com/apache/airflow/blob/main/AGENTS.md
> > [5] https://agents.md/
> > [6] https://central.sonatype.com/artifact/io.github.platinumhamburg/maven-local-share-extension
>
