Re: Re: [DISCUSS] PR auto-triage recent stats (how do we do more reviews)

Omkar P Fri, 22 May 2026 08:47:09 -0700

Thanks for incorporating the feedback, Jarek.

And good luck with the Magpie initiative! :)


Regards,
Omkar

On Fri, May 22, 2026 at 11:15 AM Jarek Potiuk <[email protected]> wrote:

> https://github.com/apache/airflow/pull/67322 created for airflow for this
> one - I am going to upstream it to Magpie once we test it live.
>
> On Fri, May 22, 2026 at 11:49 AM Jarek Potiuk <[email protected]> wrote:
>
> > Good point. The idea here was to show not only the failures (which are in
> > CI) but also to give the author very clear instructions on where to look
> > for documentation explaining what to do and what the next steps are. It's
> > not only for human authors but also for agents - if somebody tells their
> > agent (if they still care), "fix #PR," the link to documentation
> > explaining the issue and ways to fix it narrows the context for the agent,
> > making it more likely that the issues will be fixed.
> >
> > And I think some of the issues you pointed out might indeed be quite
> > small, but the others just a little. I think the biggest change that can
> > be done quickly is to stop listing the failed jobs (as you said, it's all
> > there already). Examples:
> >
> > ----
> >
> > @Lougarou Converting to draft — this PR doesn't yet meet our Pull Request
> > quality criteria.
> >
> > ❌ Kubernetes tests — Failing: Kubernetes tests / K8S
> > System:LocalExecutor-3.10-v1.30.13-false, Kubernetes tests / K8S
> > System:KubernetesExecutor-3.10-v1.30.13-false, Kubernetes tests / K8S
> > System:CeleryExecutor-3.10-v1.30.13-true, Kubernetes tests / K8S
> > System:LocalExecutor-3.10-v1.30.13-true, Kubernetes tests / K8S
> > System:CeleryExecutor-3.10-v1.30.13-false (+1 more). See docs.
> > ❌ Provider tests — Failing: Low dep tests: providers /
> > All-prov:LowestDeps:14:3.10:cncf.kubernetes, Low dep tests: providers /
> > All-prov:LowestDeps:14:3.10:amazon...apache.flink, Low dep tests:
> > providers / All-prov:LowestDeps:14:3.10:google, Non-DB tests: providers /
> > Non-DB-prov::3.10:amazon...google, provider distributions tests / Compat
> > 3.2.1:P3.10: (+4 more). See docs.
> > ❌ Other failing CI checks — Failing: Sqlite tests: core /
> > DB-core:Sqlite:3.10:Always, Postgres tests: core /
> > DB-core:Postgres:14:3.10:Always, MySQL tests: core /
> > DB-core:MySQL:8.0:3.10:Always, MyPy providers checks. See docs.
> > ❌ Pre-commit / static checks — Failing: CI image checks / Static checks.
> > See docs.
> > Note: Your branch is 207 commits behind main. Please rebase and push
> > again to get up-to-date CI results.
> >
> > See the linked criteria for how to fix each item, then mark the PR "Ready
> > for review". This is not a rejection — just an invitation to bring the PR
> > up to standard. No rush.
> >
> > Note: This comment was drafted by an AI-assisted triage tool and may
> > contain mistakes. Once you have addressed the points above, an Apache
> > Airflow maintainer — a real person — will take the next look at your PR.
> > We use this two-stage triage process so that our maintainers' limited
> > time is spent where it matters most: the conversation with you.
> >
> > Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting
> > -----
> >
> > That one will be **way** smaller if we do not list which jobs failed.
> >
> > ---
> >
> > @Sanskar121543 A few things need addressing before review — see our Pull
> > Request quality criteria.
> >
> > Issues found:
> >
> > ❌ Pre-commit / static checks: CI image checks / Static checks is failing.
> > Run prek run --from-ref main --stage pre-commit locally and fix anything
> > that flags. See the static-checks docs.
> > What to do next:
> >
> > Push a fix for the static-check failure.
> > No rush — take your time. We appreciate your contribution and are happy
> > to wait for updates. If you have questions, feel free to ask on the Airflow
> > Slack.
> >
> > Note: This comment was drafted by an AI-assisted triage tool and may
> > contain mistakes. Once you have addressed the points above, an Apache
> > Airflow maintainer — a real person — will take the next look at your PR.
> > We use this two-stage triage process so that our maintainers' limited
> > time is spent where it matters most: the conversation with you.
> >
> > ---
> >
> >
> > I think the rest of the message is quite important: a link to our criteria,
> > a link to the appropriate docs for each failed check type, a clear
> > expectation of what to do next, a note about the AI contribution that
> > mentions a maintainer will take a look, and a link to our process.
> >
> > While it is repetitive with other comments, we have to remember that for
> > many authors this is the first time they see it.
> >
> > I am going to implement this "do not list failed jobs" change now - but if
> > there are other ideas about what else we could shorten, I am all ears :)
> >
> > J.
> >
> >
> >
> >
> > On Fri, May 22, 2026 at 11:29 AM Omkar P <[email protected]> wrote:
> >
> >> Not all, specific patterns repeating in some PRs:
> >>
> >> 1. Failing text can be reduced here (since individual failures are
> >> visible in gh ci checks):
> >> https://github.com/apache/airflow/pull/67074#issuecomment-4483645580
> >>
> >> 2. If there's just one issue, it could be a one-liner text:
> >> https://github.com/apache/airflow/pull/66648#issuecomment-4476156002
> >>
> >> 3. Tagging can be reduced I guess, both people are tagged twice:
> >> https://github.com/apache/airflow/pull/66141#issuecomment-4381023580
> >>
> >> 4. Two autotriage comments back to back here; maybe the first comment
> >> could hint at the date by which autotriage will follow up?
> >> https://github.com/apache/airflow/pull/65983#issuecomment-4476155513
> >>
> >> 5. Note comment with separate inline comments here; see if the text in
> >> the note comment could be reduced (maybe just saying "inline notes
> >> below"?):
> >> https://github.com/apache/airflow/pull/64966#pullrequestreview-4177130099
> >>
> >> 6. Some standard stuff like clicking resolve conversation could be part
> >> of Boring Cyborg's comment or link to our docs:
> >> https://github.com/apache/airflow/pull/64900#issuecomment-4300992855
> >>
> >> 7. Note comment here (my PR where I saw this first), most content is
> >> available in associated inline comments so this note comment could be
> >> leaner? Similar to #5 above but more duplicate text between note and
> >> inline comments:
> >>
> >> https://github.com/apache/airflow/pull/65423#pullrequestreview-4214977109
> >>
> >> Some of the above are older PRs; the autotriage skills may have been
> >> updated since those comments, so kindly ignore the ones where you feel
> >> things are already cleaned up or discussed. Hope this is useful, thanks.
> >>
> >> Regards, Omkar
> >>
> >> On Thu, May 21, 2026 at 10:16 PM Jarek Potiuk <[email protected]> wrote:
> >>
> >> > All of them or some specific ones? Some example links?
> >> >
> >> > On Thu, May 21, 2026 at 11:56 AM Omkar P <[email protected]> wrote:
> >> >
> >> > > Jarek, would it be possible to make autotriage AI PR comments less
> >> > > repetitive (and less verbose)?
> >> > >
> >> > > It adds inline comments and then there's also a huge summary comment.
> >> > > Some of the content is repetitive and I think the huge summary comment
> >> > > can be made more concise. Code suggestions can be retained but the text
> >> > > around them could be reduced. If it's even slightly less verbose, it'll
> >> > > be easier to read and quicker to make the necessary changes (in my
> >> > > opinion).
> >> > >
> >> > > Not sure if you noticed this already, thought to let you know. Thanks.
> >> > >
> >> > > Regards,
> >> > > Omkar
> >> > >
> >> > > On Tue, May 19, 2026 at 11:18 PM Jarek Potiuk <[email protected]> wrote:
> >> > >
> >> > > > Now, all is good.
> >> > > >
> >> > > > On Wed, May 20, 2026 at 12:11 AM Sameer Mesiah <[email protected]> wrote:
> >> > > >
> >> > > > > How does this look now? I was creating new emails before. Now, I am
> >> > > > > replying in the same thread.
> >> > > > >
> >> > > > >
> >> > > > > On Wed, 20 May 2026 at 00:02, Jarek Potiuk <[email protected]> wrote:
> >> > > > >
> >> > > > > > Nope. Separate thread :)
> >> > > > > >
> >> > > > > > On Wed, May 20, 2026 at 12:00 AM Sameer Mesiah <[email protected]> wrote:
> >> > > > > >
> >> > > > > > > Okay. That is perfectly fair.
> >> > > > > > >
> >> > > > > > > Also, does this email look fine to you? I believe those previous
> >> > > > > > > emails may have looked wrong because I manually copied the thread
> >> > > > > > > title and sent the emails. This time I used the reply button so I
> >> > > > > > > believe it should be fine as I can see the previous replies now.
> >> > > > > > >
> >> > > > > > > On 2026/05/19 22:42:16 Jarek Potiuk wrote:
> >> > > > > > > > NOTE. Sameer, there is **something** wrong with your responses
> >> > > > > > > > (a few recent emails) regarding the mail setup: the responses are
> >> > > > > > > > not ending up in the same thread in Gmail (they do in Ponymail).
> >> > > > > > > > Likely the message id / thread id is **lost somewhere** - not sure
> >> > > > > > > > what setup you have, but I **guess** the email you are subscribed
> >> > > > > > > > to the devlist with forwards messages, losing the thread id from
> >> > > > > > > > Gmail (which seems interesting because you also use Gmail). So
> >> > > > > > > > maybe you can take a look at any non-standard setting you have ;).
> >> > > > > > > >
> >> > > > > > > > In the meantime I am copying your message here (minus praises -
> >> > > > > > > > they are very nice but it's about the merit):
> >> > > > > > > >
> >> > > > > > > > > That being said, I’ve noticed that some PRs end up in a “needs
> >> > > > > > > > maintainer consensus / architectural decision” state rather than having
> >> > > > > > > > concrete author-actionable issues.
> >> > > > > > > >
> >> > > > > > > > > In those cases, the auto-triage agent can repeatedly surface secondary
> >> > > > > > > > issues while missing the real blocker, which creates a slightly misleading
> >> > > > > > > > signal for contributors. I hit this on one of my Kubernetes PRs where the
> >> > > > > > > > underlying issue was really maintainer alignment rather than unresolved
> >> > > > > > > > implementation problems.
> >> > > > > > > >
> >> > > > > > > > > Maybe it would help to introduce a category like “pending maintainer
> >> > > > > > > > consensus” (or a more general 'misc' category) so the tooling can
> >> > > > > > > > distinguish between contributor follow-up and PRs that are effectively
> >> > > > > > > > waiting on reviewer direction.
> >> > > > > > > >
> >> > > > > > > > > I understand that with the volume of PRs nowadays, there is only so much
> >> > > > > > > > that can be done and perhaps this has already been brought up before. But
> >> > > > > > > > the main pain point (or at least what I have personally experienced) is
> >> > > > > > > > false negatives. This is more of an annoyance than a major blocker but I
> >> > > > > > > > was just curious if something could be done on the tooling side to
> >> > > > > > > > alleviate this issue.
> >> > > > > > > >
> >> > > > > > > > Nope - nobody raised it yet, but I think it's great feedback, and I
> >> > > > > > > > think it can be easily addressed. Generally the triage process does not
> >> > > > > > > > touch "Ready for maintainer review" PRs unless they start failing
> >> > > > > > > > (conflicts, rebases etc.), in which case the "ready for maintainer
> >> > > > > > > > review" label is removed. But the fix is simple: the label should not
> >> > > > > > > > be removed if a discussion has started on the merit of that PR (as
> >> > > > > > > > opposed to mechanical failures).
> >> > > > > > > >
> >> > > > > > > > Fix here:
> >> > > > > > > >
> >> > > > > > > > https://github.com/apache/airflow-steward/pull/232
> >> > > > > > > >
> >> > > > > > > > We will review it in "Magpie", merge it, and upgrade
> >> > > > > > > > to the latest version before the next triage.
> >> > > > > > > >
> >> > > > > > > > J.
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Tue, May 19, 2026 at 11:19 AM Jarek Potiuk <[email protected]> wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi all,
> >> > > > > > > > >
> >> > > > > > > > > I have completed two PR triage sessions using the latest version of
> >> > > > > > > > > "Magpie," which includes improved stats and charts: PR Stats Dashboard
> >> > > > > > > > > (https://htmlpreview.github.io/?https://gist.githubusercontent.com/potiuk/d593b7773847e5d2f8638ad59d355842/raw/7125cc996a05e135e93dc26012816b83db1fad51/pr-stats-dashboard.html).
> >> > > > > > > > >
> >> > > > > > > > > Observations:
> >> > > > > > > > >
> >> > > > > > > > > - AI Triage: The process is effective; "drive-by" PRs have decreased,
> >> > > > > > > > > and we now see a ~50% author response rate. Open/closed PR volume has
> >> > > > > > > > > stabilized at approximately 40 per day.
> >> > > > > > > > > - Review Queue: We have 154 "ready for review" PRs, over half of which
> >> > > > > > > > > have no maintainer comments. This queue is growing quickly despite
> >> > > > > > > > > automated "unlabeling" of PRs with conflicts or failing tests.
> >> > > > > > > > > - Gaps: The "providers" and "task-sdk" areas lack the most coverage.
> >> > > > > > > > >
> >> > > > > > > > > Takeaways & Discussion Points:
> >> > > > > > > > >
> >> > > > > > > > > 1. AI triage successfully filters low-quality PRs, but we need more
> >> > > > > > > > > maintainers to conduct periodic reviews in their specific areas.
> >> > > > > > > > > 2. Reviews can be done manually via the "ready for review" label or
> >> > > > > > > > > assisted by the agent using /setup-steward and
> >> > > > > > > > > /pr-management-code-review.
> >> > > > > > > > > 3. We need to revamp CODEOWNERS to clarify whether listing implies
> >> > > > > > > > > observation or a commitment to review and to cover unassigned areas.
> >> > > > > > > > >
> >> > > > > > > > > I look forward to your thoughts on how we can improve these
> >> > > > > > > > > processes.
> >> > > > > > > > >
> >> > > > > > > > > Thanks,
> >> > > > > > > > > Jarek Potiuk
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
