Hello,

On the topic of CLAUDE/AGENTS.md files, there was a rather interesting paper published recently about their effectiveness.
https://arxiv.org/abs/2602.11988

The TL;DR is:

* LLM-generated context files reduce success rates (0.5-2%) while increasing inference cost by 20-23%
* Developer-written files help slightly (+4%), but verbose content that duplicates existing docs is pure cost
* Codebase overviews don't improve navigation: agents find relevant files in the same number of steps regardless

Basically, try to keep the content in these files to an absolute minimum, focusing on information that cannot be inferred/discovered, one-line code patterns, disallowed behaviours, and links to guides.

Hope this helps.

Cheers,
Nathan

From: Jarek Potiuk <[email protected]>
Date: Monday, 2 March 2026 at 09:29
To: [email protected] <[email protected]>
Subject: Re: [DISCUSS] Active approach to fighting with AI slop (while keeping maintainers in the driving seat)

Also: I am thinking of more tools like that - especially one that will allow us to auto-triage issues and use an LLM to speed up issue classification for provider releases (once suggested by Shahar, I think) and many more things. The quality of good models is amazing. I am literally stunned by what Claude Code can do today - I tried it a few months ago and the difference is night and day. I literally entirely Claude-Coded the whole thing without writing a single line of code myself.

And we now have at the very least 6 months of free Claude Code Max for maintainers of big OSS projects (https://claude.com/contact-sales/claude-for-oss), as of 3 days ago (literally the day after I paid for my first month!!!). Airflow definitely qualifies, so all core maintainers can get it regardless of whether their employers already pay for it. So if you have not done it yet - apply :D.

J.

On Mon, Mar 2, 2026 at 10:22 AM Jarek Potiuk <[email protected]> wrote:

> > maybe we should use the new LLMOperator from common.ai as an option (hehe)!
> > Just joking, of course.
>
> Crossed my mind :D
>
> On Mon, Mar 2, 2026 at 10:20 AM Pavankumar Gopidesu <[email protected]> wrote:
>
>> This is really cool, Jarek. Thanks for sharing. A tool like this is definitely necessary given the current volume of AI slop and PRs being submitted without proper context.
>>
>> maybe we should use the new LLMOperator from common.ai as an option (hehe)!
>> Just joking, of course.
>>
>> Regards,
>> Pavan
>>
>> On Mon, Mar 2, 2026 at 9:17 AM Jarek Potiuk <[email protected]> wrote:
>>
>> > > I think that we could later automate at least the dry-run execution of the script, along with Slack notification for highly-suspected issues/PRs. Then, it would be easier for maintainers to react fast when needed.
>> >
>> > Yes. I would like to run it manually - ideally with several volunteer maintainers - for a while to see how it works, improve and iterate and possibly add more quality gates. When we have more confidence we could run it automatically for some parts or even the whole process eventually (especially for high-confidence/sensitive stuff), keeping the sensitive parts with Human-In-The-Loop.
>> >
>> > But also (and this is my hope) - similarly to `breeze ci upgrade` it might turn out that the process is so efficient and "nice" to follow that we could continue to trigger it manually, regularly, perhaps with a rotational maintainer handling the triage. I think comments and actions coming from a human maintainer have more value than those from a bot - even if the human action is merely confirming what an automated system or LLM proposed.
>> >
>> > J.
>> >
>> > On Mon, Mar 2, 2026 at 10:04 AM Shahar Epstein <[email protected]> wrote:
>> >
>> > > Amazing stuff Jarek!
>> > > I think that we could later automate at least the dry-run execution of the script, along with Slack notification for highly-suspected issues/PRs. Then, it would be easier for maintainers to react fast when needed.
>> > >
>> > > Looking forward to new AI-based features in breeze in particular, and Airflow in general :)
>> > >
>> > > Shahar
>> > >
>> > > On Sat, Feb 28, 2026, 04:59 Jarek Potiuk <[email protected]> wrote:
>> > >
>> > > > Hello everyone,
>> > > >
>> > > > While preparing for consensus on the assignment policy, I created PR https://github.com/apache/airflow/pull/62585. This PR adds a new command to Breeze, `breeze issues unassign`, which unassigns anyone who is not a committer or collaborator.
>> > > >
>> > > > I want this to be the first of several Breeze commands I plan to add to help manage the AI overhead and burden on maintainers.
>> > > >
>> > > > I got inspired by Hugo van Kemenade's (my friend, Python release manager) tool https://hugovk.dev/blog/2026/gh-triage/. He added the `gh` plugin that helps him manage spam coming to Python. I hope we can have a very similar set of commands and a regular process of performing cleanup with the issues/PRs we are getting.
>> > > >
>> > > > BTW. I am using Claude Code to add those commands (so this is a bit like using AI to fight AI slop). But in a smart way.
>> > > >
>> > > > In our case we have `breeze`, which maintainers are already using for `ci upgrade`, and I see no reason why we could not use our own CLI to make us far more efficient at assessing and quickly processing incoming spam.
>> > > >
>> > > > Starting with AGENTS.md that describes what we expect (and instructs agents to make good PRs) and changing our assignment process - I think we should proceed to implement step-by-step handling of the incoming traffic:
>> > > >
>> > > > a) quickly assess how well PRs implement our expectations, point out problems, and close them
>> > > >
>> > > > b) automatically tell the collaborators what is wrong with their PRs if they are incomplete (for example when tests are failing, or when they need a rebase)
>> > > >
>> > > > c) automatically respond to issues that are incomplete and need more information
>> > > >
>> > > > d) allow filtering by area (so that maintainers focusing on a particular area can periodically review only the areas they are interested in)
>> > > >
>> > > > e) all that with some AI assistance (I plan to implement integration with some modern AI LLMs so that it is seamless for those maintainers who already use some of those (including Claude Code, GH Copilot (maintainers can apply for free access there), Codex and any models someone prefers - including local models))
>> > > >
>> > > > f) all that with the maintainer in the driver's seat - we won't do those things fully automatically, but we will get reviewable action proposals in bulk that the maintainer will be able to accept, modify or reject.
>> > > >
>> > > > .... more...
>> > > >
>> > > > All that will be open to contribution, and I will be happy to lead the introduction and dissemination of those CLI options among maintainers to make sure they get incorporated into our daily work - relieving some of the burden we are all experiencing and sharing it between people.
>> > > >
>> > > > I think this is a viable approach to address our current burden proactively, rather than waiting for others to act.
>> > > >
>> > > > This is also somewhat experimental since we haven't seen it done before, so suggestions, comments, ideas and PRs that could help us become more efficient and better maintainers are most welcome.
>> > > >
>> > > > Let me know what you think.
>> > > >
>> > > > J.
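
For readers who want a concrete picture of the "unassign anyone who is not a committer or collaborator" rule and the "reviewable action proposals" idea from the thread, here is a minimal Python sketch. It is not the code from the PR and does not call Breeze or the GitHub API: the names `UnassignProposal`, `propose_unassignments` and `review`, as well as the sample issue numbers and logins, are all hypothetical and chosen only for illustration. The sketch assumes the assignees and the allowed logins have already been fetched by some other means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnassignProposal:
    """One reviewable action: remove `login` from issue `issue_number`."""

    issue_number: int
    login: str


def propose_unassignments(
    assignees_by_issue: dict[int, list[str]],
    allowed_logins: set[str],
) -> list[UnassignProposal]:
    """Propose unassigning every assignee who is not in `allowed_logins`.

    Logins are compared case-insensitively. Nothing is changed here;
    the function only returns proposals (a dry run).
    """
    allowed = {login.lower() for login in allowed_logins}
    proposals = []
    for issue_number in sorted(assignees_by_issue):
        for login in assignees_by_issue[issue_number]:
            if login.lower() not in allowed:
                proposals.append(UnassignProposal(issue_number, login))
    return proposals


def review(proposals, decide):
    """Keep only the proposals the maintainer accepts.

    `decide` is any callable that takes a proposal and returns True/False,
    for example an interactive prompt.
    """
    return [proposal for proposal in proposals if decide(proposal)]


if __name__ == "__main__":
    # Hypothetical sample data, not real Airflow issues or accounts.
    issues = {101: ["alice", "drive-by-bot"], 102: ["Bob"], 103: []}
    committers_and_collaborators = {"alice", "bob"}

    for p in propose_unassignments(issues, committers_and_collaborators):
        print(f"[dry-run] would unassign @{p.login} from #{p.issue_number}")
```

Splitting "propose" from "review" keeps the maintainer in the driver's seat: the first step can run unattended as a dry run, while nothing is applied until a human accepts each proposal.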
