"I want to be involved in a community of humans working to build software. I do not want to see LLMs producing so much output that other people need LLMs to summarise it, with no humans looking at things"
I definitely share this feeling, and for me we're already there. There is so much output to ingest (most of it crap) that I need multiple agents to answer comments and PRs for me (and I'm sure they generate more crap along the way, even though I read and double-check everything my agents produce). Nowadays it is really difficult to find the time to look at every piece in a good level of detail myself, unless we're OK with queuing up a significant number of code reviews. Automatic code reviews from Copilot and the like also make the review process longer and more complicated: every PR has dozens of comments to address and to catch up on for context.

On Tue, Jun 9, 2026 at 2:48 PM Tzu-ping Chung via dev <[email protected]> wrote:

> While I agree that banning AI-opened PRs does not fundamentally solve anything, I suspect it would actually be quite useful. Ultimately, we are not against AI-assisted, or even GENERATED PRs. What we want to prevent is low-effort submissions, and the additional hurdle would help drive away a lot of spam.
> It’s similar to a captcha. Sure, you can totally work around it with not too much effort, but most of the low-effort actors would not even care to go through that, and would instead choose to bother someone else. In the meantime, the effort is still low enough that legitimate contributions would not be bothered too much. It’s a reasonable and practical proof-of-effort IMO.
>
> I am in favour of this personally.
>
> TP
>
>
> > On 9 Jun 2026, at 20:30, Ash Berlin-Taylor <[email protected]> wrote:
> >
> > I don’t care one way or another about using AI as a tool in CI; that is secondary to my goal, which is to try and do something to make it clear what we expect from people wanting to contribute to Airflow, namely:
> >
> > 1. Human involvement.
> >
> > By submitting a PR you are saying “yes I want to be a member of the community”. Agents submitting without human interaction go against this.
> >
> > 2. Human ownership.
> >
> > It is _your responsibility_ as the PR author to follow up on it, address comments, and request reviews.
> >
> >
> > I frankly find the AI-generated triage comments verbose, a waste of time, and pure noise even without the `@` spam.
> >
> > If the user doesn’t care enough about their own PR to follow up on it: close it after some time. We don’t need to babysit them. Nor do I need yet more commit email messages to read through.
> >
> >
> > So how does it sound: It sounds like hell to me and an even bigger waste of electricity in a climate crisis.
> >
> > I want to be involved in a community of humans working to build software. I do not want to see LLMs producing so much output that other people need LLMs to summarise it, with no humans looking at things.
> >
> > -ash
> >
> >> On 9 Jun 2026, at 13:18, Jarek Potiuk <[email protected]> wrote:
> >>
> >>> Why? Because AI “instructions” cannot be trusted. And I am after a signal that people are blindly using LLMs without enough human introversion.
> >>
> >> But isn't that what you are doing? This proposal is about adding another AI instruction (just hidden in HTML) - how is that going to help?
> >>
> >>> You already updated the instructions to not `@` the reviewer here
> >>
> >> Indeed, LLMs are not deterministic by nature. But they are improvable. Through iterations of refinement and adding more guardrails we can improve it - and this is exactly why I am running it manually to make it better. This is the same as in regular breeze development in the past. Initially, there were many small issues - and I remember how you complained about them and how unnecessary they seemed - yet we have now perfected it over time. Now, it allows all contributors and maintainers to work much more efficiently and lose less time. BTW, thanks for notifying me; I must strengthen this one and see why, as there might be another improvement to implement.
> >> This is also why we are not "yet" doing CI analysis by AI - because I want to iterate on it and fix it in a way that shows which parts are deterministic.
> >>
> >>> I want to do anything and everything to reduce the drive-by contributions with no human activity. I’m happy to spend my time helping humans, but if they are just going to feed that back to an LLM and burn an egregious amount of carbon: no thank you.
> >>
> >> And again, I am not sure how the proposal to add that instruction would address this particular issue. Are you just proposing to add another instruction for the LLM (or am I wrong)? How does it solve the problem?
> >>
> >> From what I understand we have two basic proposals here that contradict each other:
> >>
> >> * Ash - do not use AI to fight with AI at all
> >> * Amogh, Shahar - use AI in CI
> >>
> >> But I think the triage I am running now shows a third way:
> >>
> >> * we use AI to try out and generate triage actions and figure out which parts are practically 100% deterministic and can help with triage (these are the stats I am gathering now)
> >> * we use AI to convert the SKILLS we have into deterministic CI code that does those triage steps (no AI used at all at runtime)
> >> * we continue perfecting the manually-triggered AI SKILLS to get more AI heuristics that we can turn into deterministic CI code
> >>
> >> This seems to fulfill, in a nice way, the seemingly contradictory expectations that different people have. I am about to produce stats from the last run and was just about to propose this approach.
> >>
> >> How does it sound, Ash, Amogh, Shahar and others?
> >>
> >> J.
> >>
> >>
> >> On Tue, Jun 9, 2026 at 12:55 PM Ash Berlin-Taylor <[email protected]> wrote:
> >>
> >>> Why? Because AI “instructions” cannot be trusted. And I am after a signal that people are blindly using LLMs without enough human introversion.
> >>>
> >>> Want a prime example?
> >>>
> >>> The pr triage skill.
> >>>
> >>> You already updated the instructions to not `@` the reviewer here
> >>> https://github.com/apache/airflow-steward/blob/76cfa5e1d2e682b88df5205e9cda396df51a66b6/skills/pr-management-triage/comment-templates.md#reviewer-mention-policy
> >>>
> >>>> When a comment's only addressee is the PR author (the request-author-confirmation, reviewer-ping author-primary, and review-nudge author-primary templates), the body references the reviewer without @-mentioning them
> >>>
> >>> And yet the LLM did it again:
> >>> https://github.com/apache/airflow/pull/66633#discussion_r3344849352
> >>>
> >>>> @korex-f - A reviewer (@ashb) has requested changes on this PR, so I've removed the ready for maintainer review label - the next step is on your side. Could you address the review comments (push a fix, or reply in-thread explaining why the feedback doesn't apply)? Once addressed, re-request review from @ashb or re-mark the PR ready and it returns to the maintainer queue. Thank you.
> >>>
> >>> And frankly I’m tired of all this shit.
> >>>
> >>> I want to do anything and everything to reduce the drive-by contributions with no human activity. I’m happy to spend my time helping humans, but if they are just going to feed that back to an LLM and burn an egregious amount of carbon: no thank you.
> >>>
> >>> -ash
> >>>
> >>>
> >>>> On 9 Jun 2026, at 10:38, Jarek Potiuk <[email protected]> wrote:
> >>>>
> >>>> Hi Ash, Amogh, and Shahar,
> >>>>
> >>>> Ash, I'm curious to learn more about how the "brown m&m test" differs from our current request for agents to identify themselves. Could you help me understand the flow and the specific benefits you see? It feels similar to me, but I'd love to hear your perspective in case I'm missing a nuance.
> >>>>
> >>>> Regarding the `gh pr create --web` approach, we included those instructions to ensure we meet ASF legal guidelines for Gen-AI headers, and to support contributors who might not have Copilot. That said, if you have ideas on how to trim the context or improve the templates, we truly appreciate PRs that improve them - and many people already have. AGENTS.md is a team effort, and we’re always looking for ways to make it better. Let's keep our collaboration positive as we refine these processes together.
> >>>>
> >>>> Amogh and Shahar, yep, the idea of a validation step in the CI for first-time contributions is something we should implement sooner or later. I have actually been gathering stats on this for the last two weeks. I’ve been preparing to see how manually triggered triage tasks can turn into automated ones - I'm gathering stats on when human judgment is needed. I shared some stats about this recently and will continue gathering them. The next step is discussing here what and how we can automate.
> >>>>
> >>>> Also, the current triage process already uses our Pull Request criteria to pre-classify the PRs and only marks them with "ready for maintainer review" if those criteria are met. So, if there are any specific criteria you’d like to see added to our "Pull request criteria," PRs are most welcome there as well.
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Jarek
> >>>
> >>>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
