Re: Re: [DISCUSS] PR auto-triage recent stats (how do we do more reviews)

Jarek Potiuk Fri, 22 May 2026 02:51:20 -0700

Good point. The idea here was to show not only the failures (which are in
CI) but also to give the author very clear instructions on where to look
for documentation explaining what to do and what the next steps are. It's
not only for human authors but also for agents - if somebody tells their
agent (if they still care), "fix #PR," the link to documentation explaining
the issue and ways to fix it narrows the context for the agent, making it
more likely that the issues will be fixed.


And I think some of the comments you pointed out can indeed be made quite
small, and the others shortened just a little. The biggest change I see
that can be done quickly is to stop listing the failed jobs (as you said,
it's all there already). Examples:

----

@Lougarou Converting to draft — this PR doesn't yet meet our Pull Request
quality criteria.

❌ Kubernetes tests — Failing: Kubernetes tests / K8S
System:LocalExecutor-3.10-v1.30.13-false, Kubernetes tests / K8S
System:KubernetesExecutor-3.10-v1.30.13-false, Kubernetes tests / K8S
System:CeleryExecutor-3.10-v1.30.13-true, Kubernetes tests / K8S
System:LocalExecutor-3.10-v1.30.13-true, Kubernetes tests / K8S
System:CeleryExecutor-3.10-v1.30.13-false (+1 more). See docs.
❌ Provider tests — Failing: Low dep tests: providers /
All-prov:LowestDeps:14:3.10:cncf.kubernetes, Low dep tests: providers /
All-prov:LowestDeps:14:3.10:amazon...apache.flink, Low dep tests: providers
/ All-prov:LowestDeps:14:3.10:google, Non-DB tests: providers /
Non-DB-prov::3.10:amazon...google, provider distributions tests / Compat
3.2.1:P3.10: (+4 more). See docs.
❌ Other failing CI checks — Failing: Sqlite tests: core /
DB-core:Sqlite:3.10:Always, Postgres tests: core /
DB-core:Postgres:14:3.10:Always, MySQL tests: core /
DB-core:MySQL:8.0:3.10:Always, MyPy providers checks. See docs.
❌ Pre-commit / static checks — Failing: CI image checks / Static checks.
See docs.
Note: Your branch is 207 commits behind main. Please rebase and push again
to get up-to-date CI results.

See the linked criteria for how to fix each item, then mark the PR "Ready
for review". This is not a rejection — just an invitation to bring the PR
up to standard. No rush.

Note: This comment was drafted by an AI-assisted triage tool and may
contain mistakes. Once you have addressed the points above, an Apache
Airflow maintainer — a real person — will take the next look at your PR. We
use this two-stage triage process so that our maintainers' limited time is
spent where it matters most: the conversation with you.

Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting
-----

That one will be **way** smaller if we do not list which jobs failed.

---

@Sanskar121543 A few things need addressing before review — see our Pull
Request quality criteria.

Issues found:

❌ Pre-commit / static checks: CI image checks / Static checks is failing.
Run prek run --from-ref main --stage pre-commit locally and fix anything
that flags. See the static-checks docs.
What to do next:

Push a fix for the static-check failure.
No rush — take your time. We appreciate your contribution and are happy to
wait for updates. If you have questions, feel free to ask on the Airflow
Slack.

Note: This comment was drafted by an AI-assisted triage tool and may
contain mistakes. Once you have addressed the points above, an Apache
Airflow maintainer — a real person — will take the next look at your PR. We
use this two-stage triage process so that our maintainers' limited time is
spent where it matters most: the conversation with you.

---


I think the rest of the message is quite important: a link to our criteria,
a link to the appropriate docs for each failed check type, a clear
expectation of what to do next, a note about the AI contribution mentioning
that a maintainer will take a look, and a link to our process.

While it is repetitive with other comments, we have to remember that for
many authors this is the first time they see it.

I am going to implement this "do not list failed jobs" change now - but if
there are other ideas on what else we could shorten, I am all ears :)
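
To make the idea concrete, here is a minimal sketch of a summary that
reports one line per failing category with a count instead of the job
names. It is only an illustration: `FailedCategory`, `render_summary`, and
the exact wording are hypothetical and are not taken from the actual triage
tool.

```python
from dataclasses import dataclass


@dataclass
class FailedCategory:
    """One category of failing checks (hypothetical structure)."""

    name: str               # e.g. "Kubernetes tests"
    failed_jobs: list[str]  # individual CI job names; deliberately not rendered
    docs_hint: str          # short pointer such as "See docs."


def render_summary(author: str, categories: list[FailedCategory]) -> str:
    """Render one line per failing category, omitting individual job names."""
    lines = [f"@{author} A few things need addressing before review."]
    for cat in categories:
        count = len(cat.failed_jobs)
        noun = "job" if count == 1 else "jobs"
        lines.append(f"❌ {cat.name}: {count} {noun} failing. {cat.docs_hint}")
    return "\n".join(lines)
```

The individual job names stay available in the CI checks, so the comment
only needs the category, the count, and the pointer to the docs.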

J.




On Fri, May 22, 2026 at 11:29 AM Omkar P <[email protected]> wrote:

> Not all, specific patterns repeating in some PRs:
>
> 1. Failing text can be reduced here (since individual failures are
> visible in gh ci checks):
> https://github.com/apache/airflow/pull/67074#issuecomment-4483645580
>
> 2. If there's just one issue, it could be a one-liner text:
> https://github.com/apache/airflow/pull/66648#issuecomment-4476156002
>
> 3. Tagging can be reduced I guess, both people are tagged twice:
> https://github.com/apache/airflow/pull/66141#issuecomment-4381023580
>
> 4. Two autotriage comments back to back here; maybe the 1st comment can
> hint at the date by which there'll be a follow-up by autotriage?
> https://github.com/apache/airflow/pull/65983#issuecomment-4476155513
>
> 5. Note comment with separate inline comments here, see if the text in
> the note comment could be reduced (maybe just saying "inline notes
> below"?):
> https://github.com/apache/airflow/pull/64966#pullrequestreview-4177130099
>
> 6. Some standard stuff like clicking resolve conversation could be part
> of Boring Cyborg's comment or link to our docs:
> https://github.com/apache/airflow/pull/64900#issuecomment-4300992855
>
> 7. Note comment here (my PR where I saw this first), most content is
> available in associated inline comments so this note comment could be
> leaner? Similar to #5 above but more duplicate text between note and
> inline comments:
> https://github.com/apache/airflow/pull/65423#pullrequestreview-4214977109
>
> Some of above are older PRs, autotriage skills may have updated after
> those comments so kindly ignore the ones where you feel things are
> already cleaned up or discussed. Hope this is useful, thanks.
>
> Regards, Omkar
>
> On Thu, May 21, 2026 at 10:16 PM Jarek Potiuk <[email protected]> wrote:
>
> > All of them or some specific ones? Some example links?
> >
> > On Thu, May 21, 2026 at 11:56 AM Omkar P <[email protected]> wrote:
> >
> > > Jarek, would it be possible to make autotriage AI PR comments less
> > > repetitive (and less verbose)?
> > >
> > > It adds inline comments and then there's also a huge summary comment.
> > > Some of the content is repetitive and I think the huge summary comment
> > > can be made more concise. Code suggestions can be retained but text
> > > around it could be reduced. If it's even slightly less verbose, it'll
> > > be easier to read and to make the necessary changes quickly (in my
> > > opinion).
> > >
> > > Not sure if you noticed this already, thought to let you know. Thanks.
> > >
> > > Regards,
> > > Omkar
> > >
> > > On Tue, May 19, 2026 at 11:18 PM Jarek Potiuk <[email protected]> wrote:
> > >
> > > > Now, all is good.
> > > >
> > > > On Wed, May 20, 2026 at 12:11 AM Sameer Mesiah <[email protected]> wrote:
> > > >
> > > > > How does this look now? I was creating new emails before. Now, I am
> > > > > replying in the same thread.
> > > > >
> > > > >
> > > > > On Wed, 20 May 2026 at 00:02, Jarek Potiuk <[email protected]> wrote:
> > > > >
> > > > > > Nope. Separate thread :)
> > > > > >
> > > > > > On Wed, May 20, 2026 at 12:00 AM Sameer Mesiah <[email protected]> wrote:
> > > > > >
> > > > > > > Okay. That is perfectly fair.
> > > > > > >
> > > > > > > Also, does this email look fine to you? I believe those previous
> > > > > > > emails may have looked wrong because I manually copied the thread
> > > > > > > title and sent the emails. This time I used the reply button so I
> > > > > > > believe it should be fine as I can see the previous replies now.
> > > > > > >
> > > > > > > On 2026/05/19 22:42:16 Jarek Potiuk wrote:
> > > > > > > > NOTE. Sameer, there is **something** wrong with your responses
> > > > > > > > (a few recent emails) regarding the mail setup: the responses
> > > > > > > > are not ending up in the same thread in Gmail (they do in
> > > > > > > > Ponymail). Likely the message id / thread id is **lost
> > > > > > > > somewhere** - not sure what setup you have, but I **guess** the
> > > > > > > > email address you are subscribed to the devlist with forwards
> > > > > > > > messages, losing the thread id from Gmail (which seems
> > > > > > > > interesting because you also use Gmail). So maybe you can take
> > > > > > > > a look at any non-standard setting you have ;).
> > > > > > > >
> > > > > > > > In the meantime I am copying your message here (minus praises -
> > > > > > > > they are very nice but it's about the merit):
> > > > > > > >
> > > > > > > > > That being said, I’ve noticed that some PRs end up in a “needs
> > > > > > > > maintainer consensus / architectural decision” state rather than
> > > > > > > > having concrete author-actionable issues.
> > > > > > > >
> > > > > > > > > In those cases, the auto-triage agent can repeatedly surface
> > > > > > > > secondary issues while missing the real blocker, which creates a
> > > > > > > > slightly misleading signal for contributors. I hit this on one of
> > > > > > > > my Kubernetes PRs where the underlying issue was really maintainer
> > > > > > > > alignment rather than unresolved implementation problems.
> > > > > > > >
> > > > > > > > > Maybe it would help to introduce a category like “pending
> > > > > > > > maintainer consensus” (or a more general “misc” category) so the
> > > > > > > > tooling can distinguish between contributor follow-up and PRs
> > > > > > > > that are effectively waiting on reviewer direction.
> > > > > > > >
> > > > > > > > > I understand that with the volume of PRs nowadays, there is
> > > > > > > > only so much that can be done and perhaps this has already been
> > > > > > > > brought up before. But the main pain point (or at least what I
> > > > > > > > have personally experienced) is false negatives. This is more of
> > > > > > > > an annoyance than a major blocker but I was just curious if
> > > > > > > > something could be done on the tooling side to alleviate this
> > > > > > > > issue.
> > > > > > > >
> > > > > > > > Nope - nobody raised it yet, but I think it's great feedback,
> > > > > > > > and I think it can be easily addressed. Generally the triage
> > > > > > > > process does not touch "Ready for maintainer review" PRs, unless
> > > > > > > > they start failing (conflicts, rebases etc.), in which case the
> > > > > > > > "ready for maintainer review" label is removed. But the fix is
> > > > > > > > simple: the label should not be removed if a discussion has
> > > > > > > > started on the merit of that PR rather than on mechanical
> > > > > > > > failures.
> > > > > > > >
> > > > > > > > Fix here:
> > > > > > > >
> > > > > > > > https://github.com/apache/airflow-steward/pull/232
> > > > > > > >
> > > > > > > > We will review it in "Magpie", merge it, and upgrade
> > > > > > > > to the latest version before the next triage.
> > > > > > > >
> > > > > > > > J.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, May 19, 2026 at 11:19 AM Jarek Potiuk <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > I have completed two PR triage sessions using the latest version
> > > > > > > > > of "Magpie," which includes improved stats and charts: PR Stats
> > > > > > > > > Dashboard (
> > > > > > > > > https://htmlpreview.github.io/?https://gist.githubusercontent.com/potiuk/d593b7773847e5d2f8638ad59d355842/raw/7125cc996a05e135e93dc26012816b83db1fad51/pr-stats-dashboard.html
> > > > > > > > > ).
> > > > > > > > >
> > > > > > > > > Observations:
> > > > > > > > >
> > > > > > > > > - AI Triage: The process is effective; "drive-by" PRs have
> > > > > > > > > decreased, and we now see a ~50% author response rate.
> > > > > > > > > Open/closed PR volume has stabilized at approximately 40 per day.
> > > > > > > > > - Review Queue: We have 154 "ready for review" PRs, over half of
> > > > > > > > > which have no maintainer comments. This queue is growing quickly
> > > > > > > > > despite automated "unlabeling" of PRs with conflicts or failing
> > > > > > > > > tests.
> > > > > > > > > - Gaps: The "providers" and "task-sdk" areas lack the most
> > > > > > > > > coverage.
> > > > > > > > > Takeaways & Discussion Points:
> > > > > > > > >
> > > > > > > > > 1. AI triage successfully filters low-quality PRs, but we need
> > > > > > > > > more maintainers to conduct periodic reviews in their specific
> > > > > > > > > areas.
> > > > > > > > > 2. Reviews can be done manually via the "ready for review" label
> > > > > > > > > or assisted by the agent using /setup-steward and
> > > > > > > > > /pr-management-code-review.
> > > > > > > > > 3. We need to revamp CODEOWNERS to clarify whether listing
> > > > > > > > > implies observation or a commitment to review and to cover
> > > > > > > > > unassigned areas.
> > > > > > > > >
> > > > > > > > > I look forward to your thoughts on how we can improve these
> > > > > > > > > processes.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Jarek Potiuk
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
