As long as this doesn’t turn into a human replacement situation, I think
this is really useful and is a great idea!

On Thu, Mar 12, 2026 at 8:57 AM Jarek Potiuk <[email protected]> wrote:

> Indeed - but this does not reflect the final numbers yet. It also misses
> the fact that 200 of those are already drafts (I converted at least 100 of
> them) and each of them has detailed instructions for the authors on what
> to do. At least 10 people thanked me for providing such detailed guidance.
>
>
> Some 30-40 so far have the `ready for maintainer review` label. But I have
> not run the tool on all of the open PRs - only on mine, dev-area, and
> providers.
>
> I am still iterating on and improving the tool (with Yeongook's help - he
> also ran some of the triage, not only me).
>
> There are still some things to complete - I implemented a very strong
> security layer to completely sandbox and isolate the LLMs and prevent PR
> prompt injection attacks (those are a real threat nowadays):
> https://github.com/apache/airflow/pull/63422
>
> And I am working on a separate `review` mode, which will allow maintainers
> to review already good PRs (the ones marked with the `ready for maintainer
> review` label) equally efficiently. It takes the same approach -
> deterministic checks for speed plus very targeted LLM assistance - but
> keeps the human in the loop and maintainers in the driving seat. No
> comment, no message, no assessment is posted to the contributor without a
> conscious decision of the maintainer.
>
> I am looking at people's responses and have some small improvements on
> the way.
>
> I am also implementing some of the small workflows I see as current
> patterns in others' reviews - and I hope that by early next week I will
> have a completely working and battle-tested solution.
>
> I think that with the tool we will easily be able to handle (and I am not
> exaggerating at all) at the very least two orders of magnitude more PR
> traffic than we see right now - especially when more of us start using it
> and when we share the triage/review burden (very, very low for the triage
> part) among more maintainers.
>
> I was hoping to demo it today at the dev call - but I did not realize I am
> getting back to Warsaw from Slovakia today, so it is unlikely I will be
> able to share a demo. I might still be on and off at the call, and I might
> try, but it's not likely I will be in demoable circumstances. Instead, I
> will create a detailed description of the tool, how to use it, and the
> proposed process, and I will record a screencast (likely over the weekend)
> demoing how it works and share it with everyone.
>
> I am super optimistic that we will be able to solve the PR problem this
> way, and that we will be able to apply a similar approach to issues and
> later also to security reports. Smartly combining humans as drivers,
> deterministic (though AI-generated) code, and LLMs as additional
> 'intelligent assistants' for things that cannot be done deterministically
> seems to be working beautifully.
>
> J
>
> On Thu, Mar 12, 2026, 16:28 Vincent Beck <[email protected]> wrote:
>
> > Pretty impressive results: we were at 500+ open PRs 2 days ago and now
> > we are at ~430 open PRs. Bravo!
> >
> > On 2026/03/11 14:51:36 Kevin Yang wrote:
> > > Thanks for the feedback! I would be more than happy to implement these
> > > options and integrations. I will look into the current implementation
> > > and draft PRs by the upcoming week.
> > >
> > > Best,
> > > Kevin Yang
> > >
> > > On Wed, Mar 11, 2026 at 4:05 AM Jarek Potiuk <[email protected]> wrote:
> > >
> > > > You can absolutely add the option to use any agent or model to the
> > > > tool I created. Currently it can use copilot, Claude, and codex -
> > > > but you can open a PR to add any model; it is built for that
> > > > purpose.
> > > >
> > > > It is integrated with breeze; it actually even automatically stores
> > > > which model you use and continues using it. The interface to the
> > > > LLM is super simple. It does not even use Pydantic AI - it just
> > > > generates a prompt and parses the output. So by all means, add a
> > > > way to use any other LLM.
> > > >
> > > > 90% of the work done by the tool is deterministic; it only asks
> > > > the LLM when it is in doubt.
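For illustration, the "generate a prompt, parse the output" interface described above could look roughly like this minimal sketch. Everything here is hypothetical - the `run_agent` callable (a stand-in for invoking copilot/Claude/codex), the prompt wording, and the verdict names are not the actual breeze implementation:

```python
# Minimal sketch of a prompt-in / text-out LLM integration: build a plain-text
# prompt, hand it to any CLI-style agent, and parse the structured reply.
import json
import re
from typing import Callable

# Double braces produce literal JSON braces after .format() substitution.
TRIAGE_PROMPT = """\
You are triaging a pull request. Reply with a single JSON object:
{{"verdict": "ready" | "draft" | "close", "reason": "<one sentence>"}}

PR title: {title}
PR description: {body}
"""


def triage_pr(title: str, body: str, run_agent: Callable[[str], str]) -> dict:
    """Ask the agent for a verdict and parse the first JSON object in its reply."""
    reply = run_agent(TRIAGE_PROMPT.format(title=title, body=body))
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if not match:
        # Fall back to the conservative verdict if the agent reply is unparseable.
        return {"verdict": "draft", "reason": "unparseable agent reply"}
    return json.loads(match.group(0))
```

Because only plain text crosses the boundary, swapping in a different agent or model is just a matter of passing a different `run_agent` callable - no SDK-specific integration is needed.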
> > > >
> > > > So - by all means, PRs to use any other LLMs - whether local or
> > > > remote - are most welcome. We can also add opencode and ollama
> > > > integrations.
> > > >
> > > > J.
> > > >
> > > > On Wed, Mar 11, 2026, 03:32 Kevin Yang <[email protected]> wrote:
> > > >
> > > >> Hi Jarek,
> > > >>
> > > >> Thank you very much for all the efforts in building these
> > > >> solutions. I recently also read through the following discussions
> > > >> [1,2,3] and thought about whether there is a good approach to
> > > >> tackling the challenge.
> > > >>
> > > >> I believe integrating with an LLM is a good approach, especially
> > > >> since we can leverage its reasoning capabilities to provide better
> > > >> triage. Existing products such as Copilot Code Review can also
> > > >> provide insightful triage, as previously proposed by Kaxil.
> > > >>
> > > >> Another direction that also looks promising to me is to use a
> > > >> *small language model (SLM)* - a model with 2-4 B parameters -
> > > >> which can run on standard GitHub runners, CPU-only, to triage
> > > >> issues and PRs. I've built a GitHub Action, *SLM Triage* (
> > > >> https://github.com/marketplace/actions/slm-triage).
> > > >>
> > > >> What advantages does an SLM offer?
> > > >> * It can run on a standard GitHub runner, on CPU, and finish
> > > >> execution in around 3-5 minutes
> > > >> * There is no API cost or billing setup with an LLM service
> > > >> * It runs on GitHub events, when an issue or PR is opened, and is
> > > >> capable of triaging issues as long as there are GitHub runners
> > > >> available
> > > >> * It can simply be integrated into GitHub Actions, with no
> > > >> infrastructure or local setup.
> > > >>
> > > >> What are the current limitations?
> > > >> * It doesn't have enough domain knowledge about a specific
> > > >> codebase, so it can only triage based on high-level context and
> > > >> the relevancy between context information and code changes
> > > >> * It has limited reasoning capability
> > > >> * It has a limited context window (128k context window size; some
> > > >> models might have ~256k)
> > > >>
> > > >> Why I think it can be a potential direction:
> > > >> * I feel some issues or PRs can be triaged based on basic
> > > >> heuristics and rules
> > > >> * Even though the context window is limited, if the process is
> > > >> triggered when an issue is opened, the context window is good
> > > >> enough to capture the issue description, PR description, and even
> > > >> the code change
> > > >> * It is easier to set up for the broader open-source community and
> > > >> probably more cost efficient; it can scale based on workflow
> > > >> adoption
> > > >> * It can take action through the API - such as commenting on an
> > > >> issue, adding a label, or closing an issue or PR - based on the
> > > >> triage result.
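As a rough sketch of that last point - mapping a triage verdict to GitHub REST API calls - the following uses the public issues endpoints (`POST .../labels`, `POST .../comments`, `PATCH` on the issue). The `plan_actions`/`apply_triage` helpers and the verdict names are illustrative, not part of SLM Triage:

```python
# Sketch: turn a triage verdict into the corresponding GitHub REST API calls.
# Endpoints follow the public GitHub issues API; the verdict names, comment
# text, and overall structure are illustrative assumptions.
import json
import urllib.request

API = "https://api.github.com"


def plan_actions(verdict: str, labels: list[str]) -> list[tuple[str, str, dict]]:
    """Map a triage verdict to (method, path-suffix, payload) calls for one issue."""
    actions = [("POST", "/labels", {"labels": labels})] if labels else []
    if verdict == "close":
        actions.append(("PATCH", "", {"state": "closed"}))
    elif verdict == "needs-info":
        actions.append(("POST", "/comments",
                        {"body": "Could you add reproduction steps? (automated triage)"}))
    return actions


def apply_triage(repo: str, number: int, verdict: str,
                 labels: list[str], token: str) -> None:
    """Execute the planned calls against https://api.github.com (no error handling)."""
    for method, suffix, payload in plan_actions(verdict, labels):
        req = urllib.request.Request(
            f"{API}/repos/{repo}/issues/{number}{suffix}",
            data=json.dumps(payload).encode(),
            method=method,
            headers={"Authorization": f"Bearer {token}",
                     "Accept": "application/vnd.github+json"},
        )
        urllib.request.urlopen(req)
```

Separating `plan_actions` from `apply_triage` keeps the decision logic testable (and dry-runnable) without any network access or token.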
> > > >>
> > > >> I also attempted to triage multiple issues and PRs on the airflow
> > > >> repository and checked the actual issues/PRs (I created a script to
> > > >> dry-run and inspect the triage results and reasoning). The results
> > > >> look promising, but sometimes I found them "a bit strict", and the
> > > >> prompting needs some improvements.
> > > >>
> > > >> I wonder if this is a valid idea - it would be great if it could
> > > >> potentially help.
> > > >>
> > > >> Thanks,
> > > >> Kevin Yang
> > > >>
> > > >> [1] https://github.com/orgs/community/discussions/185387
> > > >> [2] https://github.com/ossf/wg-vulnerability-disclosures/issues/178
> > > >> [3]
> > > >> https://www.reddit.com/r/opensource/comments/1q3f89b/open_source_is_being_ddosed_by_ai_slop_and_github/#:~:text=FunBrilliant5713-,Open%20source%20is%20being%20DDoSed%20by%20AI%20slop%20and%20GitHub,which%20submissions%20came%20from%20Copilot
> > > >>
> > > >> On Tue, Mar 10, 2026 at 9:13 PM Jarek Potiuk <[email protected]> wrote:
> > > >>
> > > >> > Just to update everyone: I've auto-triaged a bunch of PRs - the
> > > >> > tool works very well IMHO, but we will know once the authors see
> > > >> > them and respond.
> > > >> >
> > > >> > Some stats (I will gather more in the next days as I am adding
> > > >> > timing and further improvements):
> > > >> >
> > > >> > * I triaged about 100 PRs in under an hour of elapsed time (I
> > > >> > also corrected, improved, and noted some fixes along the way, so
> > > >> > it will get faster)
> > > >> > * I converted 30 of those into Drafts and closed a few
> > > >> > * I have not marked any as ready to review yet, but I will do
> > > >> > that tomorrow
> > > >> > * The LLM (Claude) assessment is quite fast - faster than I
> > > >> > thought. Parallelizing it also helps. LLM assessment takes
> > > >> > between 20 s and 2 minutes (elapsed), but usually only a few
> > > >> > pull requests (15% or less) in a batch are LLM-assessed, so this
> > > >> > is not a bottleneck. I will also modify the tool to start
> > > >> > reviewing deterministic things before the LLMs complete, which
> > > >> > should speed up the whole process even more
> > > >> > * The LLM assessments are pretty good - but a few were
> > > >> > significantly wrong and I would not post them. It's good we have
> > > >> > a Human-In-The-Loop, in the driver's seat.
> > > >> >
> > > >> > Overall, I think the tool is doing what I wanted very well. But
> > > >> > let's see the improvements over the next few days, observe how
> > > >> > the authors react, and determine if it can actually help
> > > >> > maintainers.
> > > >> >
> > > >> > I added a few PRs as improvements; looking forward to reviews:
> > > >> >
> > > >> > * https://github.com/apache/airflow/pull/63318
> > > >> > * https://github.com/apache/airflow/pull/63317
> > > >> > * https://github.com/apache/airflow/pull/63315
> > > >> > * https://github.com/apache/airflow/pull/63319
> > > >> > * https://github.com/apache/airflow/pull/63320
> > > >> >
> > > >> > J.
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Tue, Mar 10, 2026 at 10:18 PM Jarek Potiuk <[email protected]> wrote:
> > > >> >
> > > >> > > Lazy consensus reached. I will try it out tonight. I added
> > > >> > > more signals (unresolved review comments) and filtering
> > > >> > > options (https://github.com/apache/airflow/pull/63300) that
> > > >> > > will be useful during this phase.
> > > >> > >
> > > >> > > On Fri, Mar 6, 2026 at 9:08 PM Jarek Potiuk <[email protected]> wrote:
> > > >> > >
> > > >> > >> Hello here,
> > > >> > >>
> > > >> > >> I am asking for a lazy consensus on the approach proposed in
> > > >> > >> https://lists.apache.org/thread/ly6lrm2gc4p7p54vomr8621nmb1pvlsk
> > > >> > >> regarding our approach to triaging PRs.
> > > >> > >>
> > > >> > >> The lazy consensus will last until Tuesday 10 pm CEST (
> > > >> > >> https://www.timeanddate.com/countdown/generic?iso=20260310T22&p0=262&font=cursive
> > > >> > >> )
> > > >> > >>
> > > >> > >> Summary of the proposal
> > > >> > >>
> > > >> > >> This is the proposed update to the PR contributing guidelines:
> > > >> > >>
> > > >> > >> > Start with **Draft**: Until you are sure that your PR
> > > >> > >> passes all the quality checks and tests, keep it in **Draft**
> > > >> > >> status. This will signal to maintainers that the PR is not
> > > >> > >> yet ready for review, and it will prevent maintainers from
> > > >> > >> accidentally merging it before it's ready. Once you are sure
> > > >> > >> that your PR is ready for review, you can mark it as "Ready
> > > >> > >> for review" in the GitHub UI. Our regular check will convert
> > > >> > >> all PRs from non-collaborators that do not pass our quality
> > > >> > >> gates to Draft status, so if you see that your PR is in Draft
> > > >> > >> status and you haven't set it to Draft yourself, check the
> > > >> > >> comments to see what needs to be fixed.
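The "regular check" in the proposal - convert non-collaborator PRs that fail the quality gates back to Draft - could be sketched roughly as below. The decision function, its field names, and the thresholds are illustrative assumptions; `convertPullRequestToDraft` is GitHub's actual GraphQL mutation (the REST API cannot toggle draft status):

```python
# Sketch of the "convert failing non-collaborator PRs to Draft" gate.
# The pr dict shape is an assumption for illustration; author_association
# values (MEMBER, COLLABORATOR, OWNER, ...) match GitHub's REST API.

def should_convert_to_draft(pr: dict) -> bool:
    """A PR goes back to Draft if its author is not a collaborator
    and any quality gate failed."""
    if pr["draft"]:
        return False  # already a draft, nothing to do
    if pr["author_association"] in ("MEMBER", "COLLABORATOR", "OWNER"):
        return False  # collaborators manage their own PR state
    return not pr["checks_passed"]


# Draft status can only be changed via GraphQL, not REST.
CONVERT_MUTATION = """
mutation($id: ID!) {
  convertPullRequestToDraft(input: {pullRequestId: $id}) {
    pullRequest { isDraft }
  }
}
"""
```

Keeping the gate as a pure function over PR metadata makes the policy easy to unit-test and to dry-run before any mutation is sent.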
> > > >> > >>
> > > >> > >> That's a "broad" description of the process; details will be
> > > >> > >> worked out while testing the solution.
> > > >> > >>
> > > >> > >> The PR: https://github.com/apache/airflow/pull/62682
> > > >> > >>
> > > >> > >> My testing approach is to start with individual areas, update
> > > >> > >> and perfect the tool, gradually increase its reach, and
> > > >> > >> engage others - then we might think about a more regular
> > > >> > >> process involving more maintainers.
> > > >> > >>
> > > >> > >> J.
> > > >> > >>
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> >
>
