Re: [DISCUSS] FIP-34: Making Fluss an AI-Native Project

Yang Wang Thu, 19 Mar 2026 04:13:14 -0700

Hi Mehul,

Thank you for the thorough feedback and the +1 on Phase 1! These are
extremely well-researched suggestions — all zero-cost and immediately
actionable. Let me respond to each one.


1. CodeRabbit for AI-Assisted PR Review

This is a great fit for Dimension 1's Phase 3 tooling. Free Pro tier for
public repos, zero ASF runner consumption, context-aware line-by-line
review with configurable rules via `.coderabbit.yaml` — the value
proposition is compelling. We could configure Fluss-specific rules for
architecture boundaries, ASF license headers, and coding conventions, which
directly complements the AI Contribution Guidelines in Phase 2.

I'll add this as a concrete candidate in Phase 3's development ecosystem
tooling.

2. kapa.ai Doc Bot

I think this aligns well with Dimension 2's broader goal of making Fluss
more accessible to AI-powered workflows. While the FIP's Dimension 2
roadmap focuses on REST API → CLI → Skills for programmatic operation, an
AI-powered documentation experience is a valuable complementary direction
that lowers the barrier for new users. The free open source program (up to
10k questions/month) and the fact that projects like Polars, LangChain, and
Nuxt are already using it make this worth exploring.

3. GitHub AI Issue Labeler + Moderator

This fits naturally into Dimension 1's Phase 3 tooling. Being GitHub-native
with zero extra API keys needed makes it the lowest-friction option among
all the suggestions. Auto-labeling issues by component (fluss-lake-iceberg,
fluss-flink, fluss-server, etc.) would be immediately useful for triage,
and the AI moderator can help maintain community health as the project
grows. I'll add this to the Phase 3 tooling exploration.
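
To make the component-labeling idea concrete, here is a minimal sketch of a
keyword baseline. It is not how the GitHub AI labeler works (that uses the
GitHub Models inference API); it only illustrates the kind of mapping we
would want from issue text to component labels. The function name and the
`component/` label prefix are hypothetical; the module names are the ones
mentioned in this thread.

```python
# Module names come from the thread; the "component/" prefix is an assumption.
COMPONENTS = ("fluss-lake-iceberg", "fluss-flink", "fluss-server")


def component_labels(issue_text: str) -> list[str]:
    """Return one label per known module name mentioned in the issue text."""
    text = issue_text.lower()
    # Plain substring matching, so "fluss-flink-common" maps to "fluss-flink".
    return [f"component/{name}" for name in COMPONENTS if name in text]


if __name__ == "__main__":
    print(component_labels("Tiering writer fails in fluss-lake-iceberg"))
```

A baseline like this could also serve as a sanity check when evaluating
whatever labels the AI labeler proposes.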

4. OpenSSF Scorecard + Dependabot

Agreed — these are proven security best practices that many ASF projects
already use, and adopting them at zero cost is a no-brainer. While not
AI-specific, they strengthen the overall development infrastructure that
supports AI-assisted workflows. I'll track this as an actionable
improvement alongside the FIP rollout.

5. Module-Level AGENTS.md Priority

Your proposed starting point — fluss-server, fluss-lake-iceberg, and
fluss-flink-common — is spot on. These are the three most architecturally
complex modules where AI agents are most likely to make mistakes without
proper context. I'll update section 3.1 of the FIP to include this
prioritization guidance.

One open question for the community

Several of these suggestions (CodeRabbit, kapa.ai) involve integrating
third-party commercial services — free for open source, but commercial
nonetheless. I'm not sure whether the ASF has specific policies or
precedents around installing third-party GitHub Apps or embedding
third-party SaaS widgets on project websites. If anyone in the community
has experience with or knowledge of the ASF's stance on this, I'd
appreciate the input so we can move forward with clarity.

Thanks again for the detailed and actionable suggestions, Mehul. This is
exactly the kind of concrete input that makes the FIP better.

Best regards,
Yang Wang


On Thu, 19 Mar 2026 at 17:10, Mehul Batra <[email protected]> wrote:

> Hi Yang,
>
> Thank you for driving this FIP. +1 on landing Phase 1 as proposed.
>
> A few additional suggestions, all zero-cost for the project:
>
> 1. CodeRabbit for AI-Assisted PR Review
>
> CodeRabbit [1] offers its full Pro tier free forever for public
> repositories. It provides context-aware, line-by-line code review on every
> PR, PR summaries, incremental reviews on new commits, and interactive chat
> via @coderabbitai in PR comments. It runs on CodeRabbit's own
> infrastructure, so it doesn't consume ASF GitHub Actions runners.
>
> Setup is just installing the CodeRabbit GitHub App and adding a
> .coderabbit.yaml to configure Fluss-specific review rules (architecture
> boundaries, ASF license headers, naming conventions, test expectations).
>
> 2. "Ask AI" Doc Bot for the Fluss Website
>
> On Dimension 2 (AI-Friendly Product), I'd suggest exploring kapa.ai [2] for
> an AI-powered "Ask AI" widget on the Fluss documentation site. Kapa offers
> a free open source program (up to 10k questions/month) for qualifying
> projects. It ingests your docs and lets users ask natural language
> questions directly on the website, similar to what Polars, LangChain, and
> Nuxt already use.
>
> Fluss qualifies (Apache-licensed, non-commercial, publicly available). This
> would lower the barrier for new users trying to understand Fluss concepts,
> configuration, and the lakehouse integration without digging through pages
> manually.
>
> 3. GitHub AI Issue Labeler + Moderator for Triage
>
> GitHub recently released two free AI-powered Actions using the GitHub
> Models inference API [3]: an AI assessment comment labeler that
> auto-categorizes issues (bug, feature request, question, etc.) and an AI
> moderator that detects spam and AI-generated low-quality content. Both use
> the workflow's GITHUB_TOKEN with models:read permission, so no extra API
> key is needed.
>
> For Fluss this could auto-label issues by component (fluss-lake-iceberg,
> fluss-flink, fluss-server, etc.) and type, reducing manual triage overhead
> as the project grows.
>
> 4. OpenSSF Scorecard + Dependabot for Security
>
> OpenSSF Scorecard [4] is a free GitHub Action that runs automated security
> health checks on every push or on a schedule. It checks branch protection,
> dependency update tooling, SAST presence, signed releases, code review
> practices, vulnerability disclosure, and more, then surfaces findings in
> the GitHub Security tab.
>
> Combined with Dependabot (already free for public repos), this gives Fluss
> automated dependency vulnerability alerts and security posture scoring at
> zero cost. Many ASF projects already use both.
>
> 5. Module-Level AGENTS.md Content
>
> For the module-level AGENTS.md, I'd suggest starting with the three most
> architecturally complex modules: fluss-lake-iceberg (tiering writer
> architecture, shading constraints, Iceberg catalog rules),
> fluss-flink-common (Flink connector contracts, multi-version
> compatibility), and fluss-server (coordinator logic, replication, ZooKeeper
> interactions).
>
> Best regards,
> Mehul
>
> [1] https://www.coderabbit.ai/open-source
> [2] https://docs.kapa.ai/kapa-for-open-source
> [3] https://github.blog/changelog/2025-09-05-github-actions-ai-labeler-and-moderator-with-the-github-models-inference-api/
> [4] https://github.com/ossf/scorecard-action
>
