You can absolutely add the option to use any agent or model to the tool I
created. Currently it can use Copilot, Claude, and Codex - but you can submit
a PR to add any model - it is built for that purpose.

This is integrated with breeze; it actually even automatically stores which
model you use and continues using it. The interface to the LLM is super
simple. It does not even use Pydantic AI - it just generates a prompt and
parses the output. So by all means - adding a way to use any other LLM
should be easy.

90% of the work done by the tool is deterministic; it only asks the LLM
when it is in doubt.

So - by all means, PRs to use any other LLMs - whether local or remote -
are most welcome. We could also add opencode and Ollama integration.
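
For example, a local Ollama backend could be a very small adapter. This is a
hypothetical sketch (stdlib only), assuming an Ollama server on its default
localhost port and its documented non-streaming /api/generate endpoint; it
just takes a prompt string and returns the model's text, so it can drop into
the same generate-prompt / parse-output flow.

```python
# Hypothetical Ollama adapter: a local model becomes just another callable
# that takes a prompt and returns the completion text.
import json
import urllib.request


def build_ollama_payload(prompt: str, model: str = "llama3") -> dict:
    """Request body for Ollama's non-streaming /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def ollama_ask(prompt: str, model: str = "llama3",
               host: str = "http://localhost:11434") -> str:
    """Send a prompt to a local Ollama server and return the raw completion."""
    data = json.dumps(build_ollama_payload(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With "stream": False, the full completion arrives in "response".
        return json.loads(resp.read())["response"]
```

With an adapter like this, switching backends is just a matter of passing a
different callable; the triage logic itself stays unchanged.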


J.

On Wed, Mar 11, 2026, 03:32 Kevin Yang <[email protected]> wrote:

> Hi Jarek,
>
> Thank you very much for all the effort in building these solutions. I
> recently read through the following discussions [1,2,3] and have been
> thinking about whether there is a good approach to tackling the challenge.
>
> I believe integrating with an LLM is a good approach, especially since it
> can leverage its reasoning capabilities to provide better triage. Existing
> products such as Copilot Code Review can also provide insightful triage, as
> previously proposed by Kaxil.
>
> Another direction that looks promising to me is to use a *small
> language model (SLM)*, a model with 2-4B parameters, which can run on
> standard GitHub runners, CPU-only, to triage issues and PRs. I've
> built a GitHub Action, *SLM Triage* (
> https://github.com/marketplace/actions/slm-triage).
>
> What advantages does an SLM offer?
> * It can run on a standard GitHub runner, on CPU, and finish execution
> in around 3-5 minutes
> * There is no API cost or billing setup with an LLM service
> * It runs on GitHub events, when an issue or PR is opened, and can
> triage issues as long as GitHub runners are available
> * It can be integrated into GitHub Actions without any infrastructure
> or local setup
>
> What are the current limitations?
> * It doesn't have enough domain knowledge about a specific codebase, so it
> can only triage based on high-level context and the relevance between
> context information and code changes
> * It has limited reasoning capability
> * It has a limited context window (128k tokens; some models have ~256k)
>
> Why I think it can be a promising direction:
> * Some issues or PRs can be triaged based on basic heuristics
> and rules
> * Even though the context window is limited, if the process is triggered
> when an issue is opened, the window is large enough to capture the issue
> description, the PR description, and even the code change
> * It is easier to set up for the broader open-source community, and
> probably more cost-efficient; it can scale with workflow adoption
> * It can take action through the API, such as commenting on an issue,
> adding a label, closing an issue or PR, etc., based on the triage result
>
> I also attempted to triage multiple issues and PRs on the airflow
> repository and checked the actual issues/PRs (I created a script to dry-run
> and inspect the triage results and reasoning). The results look promising,
> but sometimes I found them "a bit strict", and some improvements in
> prompting are needed.
>
> I wonder if this is a valid idea; it would be great if it can help.
>
> Thanks,
> Kevin Yang
>
> [1] https://github.com/orgs/community/discussions/185387
> [2] https://github.com/ossf/wg-vulnerability-disclosures/issues/178
> [3]
> https://www.reddit.com/r/opensource/comments/1q3f89b/open_source_is_being_ddosed_by_ai_slop_and_github/#:~:text=FunBrilliant5713-,Open%20source%20is%20being%20DDoSed%20by%20AI%20slop%20and%20GitHub,which%20submissions%20came%20from%20Copilot
>
> On Tue, Mar 10, 2026 at 9:13 PM Jarek Potiuk <[email protected]> wrote:
>
> > Just to update everyone: I've auto-triaged a bunch of PRs. The tool works
> > very well IMHO, but we will know for sure after the authors see and
> > review them.
> >
> > Some stats (I will gather more in the next days as I am adding timing and
> > further improvements):
> >
> > * I triaged about 100 PRs in under an hour of elapsed time (I
> > also corrected, improved, and noted some fixes, so it will be faster)
> > * I converted 30 of those into Drafts and closed a few
> > * I have not marked any as ready to review yet, but I will do that
> > tomorrow
> > * The LLM (Claude) assessment is quite fast - faster than I thought.
> > Parallelizing it also helps. LLM assessment takes between 20 s and 2
> > minutes (elapsed), but usually only a few pull requests (15% or less) in
> > a batch are LLM-assessed, so this is not a bottleneck. I will also modify
> > the tool to start reviewing deterministic things before the LLM
> > completes, which should speed up the whole process even more
> > * The LLM assessments are pretty good, but a few were significantly wrong
> > and I would not post them. It's good we have a Human-In-The-Loop, and in
> > the driver's seat.
> >
> > Overall - I think the tool is doing what I wanted very well. But let's
> > see the improvements over the next few days, observe how authors react,
> > and determine if it can actually help maintainers.
> >
> > I added a few PRs as improvements; looking forward to reviews:
> >
> > * https://github.com/apache/airflow/pull/63318
> > * https://github.com/apache/airflow/pull/63317
> > * https://github.com/apache/airflow/pull/63315
> > * https://github.com/apache/airflow/pull/63319
> > * https://github.com/apache/airflow/pull/63320
> >
> > J.
> >
> >
> >
> > On Tue, Mar 10, 2026 at 10:18 PM Jarek Potiuk <[email protected]> wrote:
> >
> > > Lazy consensus reached. I will try it out tonight. I added more signals
> > > (unresolved review comments) and filtering options (
> > > https://github.com/apache/airflow/pull/63300) that will be useful
> > > during this phase.
> > >
> > > On Fri, Mar 6, 2026 at 9:08 PM Jarek Potiuk <[email protected]> wrote:
> > >
> > >> Hello here,
> > >>
> > >> I am asking for lazy consensus on the approach proposed in
> > >> https://lists.apache.org/thread/ly6lrm2gc4p7p54vomr8621nmb1pvlsk
> > >> regarding our approach to triaging PRs.
> > >>
> > >> The lazy consensus will last until Tuesday, 10 pm CEST (
> > >> https://www.timeanddate.com/countdown/generic?iso=20260310T22&p0=262&font=cursive
> > >> )
> > >>
> > >> Summary of the proposal
> > >>
> > >> This is the proposed update to the PR contributing guidelines:
> > >>
> > >> > Start with **Draft**: Until you are sure that your PR passes all the
> > >> quality checks and tests, keep it in **Draft** status. This will
> > >> signal to maintainers that the PR is not yet ready for review, and it
> > >> will prevent maintainers from accidentally merging it before it's
> > >> ready. Once you are sure that your PR is ready for review, you can
> > >> mark it as "Ready for review" in the GitHub UI. Our regular check will
> > >> convert all PRs from non-collaborators that do not pass our quality
> > >> gates to Draft status, so if you see that your PR is in Draft status
> > >> and you haven't set it to Draft yourself, check the comments to see
> > >> what needs to be fixed.
> > >>
> > >> That's a "broad" description of the process; details will be worked
> > >> out while testing the solution.
> > >>
> > >> The PR: https://github.com/apache/airflow/pull/62682
> > >>
> > >> My testing approach is to start with individual areas, update and
> > >> perfect the tool, gradually increase its reach, and engage others -
> > >> then we might think about a more regular process involving more
> > >> maintainers.
> > >>
> > >> J.
> > >>
> > >
> >
>
