Re: Re: [DISCUSS] PR auto-triage recent stats (how do we do more reviews)

Jarek Potiuk Fri, 22 May 2026 04:15:39 -0700

I created https://github.com/apache/airflow/pull/67322 in airflow for this
one - I am going to upstream it to Magpie once we test it live.
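
For illustration only, here is a minimal sketch of the shortening idea from
the quoted message below: report each failing check category with a count
instead of listing every failed job. The function name, the input shape and
the category names are hypothetical and are not taken from the PR or from
Magpie.

```python
def summarize_failures(failed_jobs: dict[str, list[str]]) -> list[str]:
    """Return one short line per failing category, omitting job names.

    `failed_jobs` maps a (hypothetical) check category to the names of its
    failing CI jobs. The job names are only counted, never printed, since
    they are already visible in the CI checks of the PR.
    """
    lines = []
    for category, jobs in failed_jobs.items():
        if not jobs:
            # Nothing failing in this category, so do not mention it.
            continue
        noun = "job" if len(jobs) == 1 else "jobs"
        lines.append(f"❌ {category}: {len(jobs)} failing {noun}. See docs.")
    return lines
```

Calling it with a mapping such as {"Kubernetes tests": [...six job names...]}
would give a single line for that category, however many jobs failed.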


On Fri, May 22, 2026 at 11:49 AM Jarek Potiuk <[email protected]> wrote:

> Good point. The idea here was to show not only the failures (which are in
> CI) but also to give the author very clear instructions on where to look
> for documentation explaining what to do and what the next steps are. It's
> not only for human authors but also for agents - if somebody tells their
> agent (if they still care), "fix #PR," the link to documentation explaining
> the issue and ways to fix it narrows the context for the agent, making it
> more likely that the issues will be fixed.
>
> And I think some of the issues you pointed out might indeed be quite
> small, but the others just a little. I think the biggest change that can
> be done quickly is to stop listing the failed jobs (as you said, it's all
> there already). Examples:
>
> ----
>
> @Lougarou Converting to draft — this PR doesn't yet meet our Pull Request
> quality criteria.
>
> ❌ Kubernetes tests — Failing: Kubernetes tests / K8S
> System:LocalExecutor-3.10-v1.30.13-false, Kubernetes tests / K8S
> System:KubernetesExecutor-3.10-v1.30.13-false, Kubernetes tests / K8S
> System:CeleryExecutor-3.10-v1.30.13-true, Kubernetes tests / K8S
> System:LocalExecutor-3.10-v1.30.13-true, Kubernetes tests / K8S
> System:CeleryExecutor-3.10-v1.30.13-false (+1 more). See docs.
> ❌ Provider tests — Failing: Low dep tests: providers /
> All-prov:LowestDeps:14:3.10:cncf.kubernetes, Low dep tests: providers /
> All-prov:LowestDeps:14:3.10:amazon...apache.flink, Low dep tests: providers
> / All-prov:LowestDeps:14:3.10:google, Non-DB tests: providers /
> Non-DB-prov::3.10:amazon...google, provider distributions tests / Compat
> 3.2.1:P3.10: (+4 more). See docs.
> ❌ Other failing CI checks — Failing: Sqlite tests: core /
> DB-core:Sqlite:3.10:Always, Postgres tests: core /
> DB-core:Postgres:14:3.10:Always, MySQL tests: core /
> DB-core:MySQL:8.0:3.10:Always, MyPy providers checks. See docs.
> ❌ Pre-commit / static checks — Failing: CI image checks / Static checks.
> See docs.
> Note: Your branch is 207 commits behind main. Please rebase and push again
> to get up-to-date CI results.
>
> See the linked criteria for how to fix each item, then mark the PR "Ready
> for review". This is not a rejection — just an invitation to bring the PR
> up to standard. No rush.
>
> Note: This comment was drafted by an AI-assisted triage tool and may
> contain mistakes. Once you have addressed the points above, an Apache
> Airflow maintainer — a real person — will take the next look at your PR. We
> use this two-stage triage process so that our maintainers' limited time is
> spent where it matters most: the conversation with you.
>
> Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting
> -----
>
> That one will be **way** smaller if we do not list which jobs failed.
>
> ---
>
> @Sanskar121543 A few things need addressing before review — see our Pull
> Request quality criteria.
>
> Issues found:
>
> ❌ Pre-commit / static checks: CI image checks / Static checks is failing.
> Run prek run --from-ref main --stage pre-commit locally and fix anything
> that flags. See the static-checks docs.
> What to do next:
>
> Push a fix for the static-check failure.
> No rush — take your time. We appreciate your contribution and are happy to
> wait for updates. If you have questions, feel free to ask on the Airflow
> Slack.
>
> Note: This comment was drafted by an AI-assisted triage tool and may
> contain mistakes. Once you have addressed the points above, an Apache
> Airflow maintainer — a real person — will take the next look at your PR. We
> use this two-stage triage process so that our maintainers' limited time is
> spent where it matters most: the conversation with you.
>
> ---
>
>
> I think the rest of the message is quite important: a link to our
> criteria, a link to the appropriate docs for each failed check type, a
> clear expectation of what to do next, a note about the AI contribution, a
> mention that a maintainer will take a look, and a link to our process.
>
> While it is repetitive with other comments, we have to remember that for
> many authors this is the first time they see it.
>
> I am going to implement this "no jobs failed" thing now - but if there are
> other ideas about what else we could shorten - I am all ears :)
>
> J.
>
>
>
>
> On Fri, May 22, 2026 at 11:29 AM Omkar P <[email protected]> wrote:
>
>> Not all, specific patterns repeating in some PRs:
>>
>> 1. Failing text can be reduced here (since individual failures are
>> visible in gh ci checks):
>> https://github.com/apache/airflow/pull/67074#issuecomment-4483645580
>>
>> 2. If there's just one issue, it could be a one-liner text:
>> https://github.com/apache/airflow/pull/66648#issuecomment-4476156002
>>
>> 3. Tagging can be reduced I guess, both people are tagged twice:
>> https://github.com/apache/airflow/pull/66141#issuecomment-4381023580
>>
>> 4. Two autotriage comments back to back here; maybe the first comment
>> can hint at the next date by which there'll be a follow-up by autotriage?
>> https://github.com/apache/airflow/pull/65983#issuecomment-4476155513
>>
>> 5. Note comment with separate inline comments here; see if the text in
>> the note comment could be reduced (maybe just saying "inline notes
>> below"?):
>> https://github.com/apache/airflow/pull/64966#pullrequestreview-4177130099
>>
>> 6. Some standard stuff like clicking resolve conversation could be part
>> of Boring Cyborg's comment or link to our docs:
>> https://github.com/apache/airflow/pull/64900#issuecomment-4300992855
>>
>> 7. Note comment here (my PR where I saw this first), most content is
>> available in associated inline comments so this note comment could be
>> leaner? Similar to #5 above but more duplicate text between note and
>> inline comments:
>> https://github.com/apache/airflow/pull/65423#pullrequestreview-4214977109
>>
>> Some of the above are older PRs; autotriage skills may have been updated
>> after those comments, so kindly ignore the ones where you feel things are
>> already cleaned up or discussed. Hope this is useful, thanks.
>>
>> Regards, Omkar
>>
>> On Thu, May 21, 2026 at 10:16 PM Jarek Potiuk <[email protected]> wrote:
>>
>> > All of them or some specific ones? Some example links?
>> >
>> > On Thu, May 21, 2026 at 11:56 AM Omkar P <[email protected]> wrote:
>> >
>> > > Jarek, would it be possible to make autotriage AI PR comments less
>> > > repetitive (and less verbose)?
>> > >
>> > > It adds inline comments and then there's also a huge summary comment.
>> > > Some of the content is repetitive and I think the huge summary comment
>> > > can be made more concise. Code suggestions can be retained but text
>> > > around them could be reduced. If it's even slightly less verbose it'll
>> > > be easier to read and make necessary changes quickly (in my opinion).
>> > >
>> > > Not sure if you noticed this already, thought to let you know. Thanks.
>> > >
>> > > Regards,
>> > > Omkar
>> > >
>> > > On Tue, May 19, 2026 at 11:18 PM Jarek Potiuk <[email protected]> wrote:
>> > >
>> > > > Now, all is good.
>> > > >
>> > > > On Wed, May 20, 2026 at 12:11 AM Sameer Mesiah <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > How does this look now? I was creating new emails before. Now, I am
>> > > > > replying in the same thread.
>> > > > >
>> > > > >
>> > > > > On Wed, 20 May 2026 at 00:02, Jarek Potiuk <[email protected]> wrote:
>> > > > >
>> > > > > > Nope. Separate thread :)
>> > > > > >
>> > > > > > On Wed, May 20, 2026 at 12:00 AM Sameer Mesiah <[email protected]>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Okay. That is perfectly fair.
>> > > > > > >
>> > > > > > > Also, does this email look fine to you? I believe those previous
>> > > > > > > emails may have looked wrong because I manually copied the thread
>> > > > > > > title and sent the emails. This time I used the reply button so I
>> > > > > > > believe it should be fine as I can see the previous replies now.
>> > > > > > >
>> > > > > > > On 2026/05/19 22:42:16 Jarek Potiuk wrote:
>> > > > > > > > NOTE. Sameer, there is **something** wrong with your responses
>> > > > > > > > (a few recent emails) regarding the mail setup: the responses
>> > > > > > > > are not ending up in the same thread in Gmail (they do in
>> > > > > > > > Ponymail). Likely the message id / thread id is **lost
>> > > > > > > > somewhere** - not sure what setup you have, but I **guess** the
>> > > > > > > > email address you are subscribed to the devlist with forwards
>> > > > > > > > messages, losing the thread id from Gmail (which seems
>> > > > > > > > interesting because you also use Gmail). So maybe you can take
>> > > > > > > > a look at any non-standard setting you have ;).
>> > > > > > > >
>> > > > > > > > In the meantime I am copying your message here (minus praises -
>> > > > > > > > they are very nice but it's about the merit):
>> > > > > > > >
>> > > > > > > > > That being said, I’ve noticed that some PRs end up in a “needs
>> > > > > > > > maintainer consensus / architectural decision” state rather than
>> > > > > > > > having concrete author-actionable issues.
>> > > > > > > >
>> > > > > > > > > In those cases, the auto-triage agent can repeatedly surface
>> > > > > > > > secondary issues while missing the real blocker, which creates a
>> > > > > > > > slightly misleading signal for contributors. I hit this on one of
>> > > > > > > > my Kubernetes PRs where the underlying issue was really maintainer
>> > > > > > > > alignment rather than unresolved implementation problems.
>> > > > > > > >
>> > > > > > > > > Maybe it would help to introduce a category like “pending
>> > > > > > > > maintainer consensus” (or a more general “misc” category) so the
>> > > > > > > > tooling can distinguish between contributor follow-up and PRs that
>> > > > > > > > are effectively waiting on reviewer direction.
>> > > > > > > >
>> > > > > > > > > I understand that with the volume of PRs nowadays, there is only
>> > > > > > > > so much that can be done and perhaps this has already been brought
>> > > > > > > > up before. But the main pain point (or at least what I have
>> > > > > > > > personally experienced) is false negatives. This is more of an
>> > > > > > > > annoyance than a major blocker but I was just curious if something
>> > > > > > > > could be done on the tooling side to alleviate this issue.
>> > > > > > > >
>> > > > > > > > Nope - nobody raised it yet, but I think it's great feedback, and
>> > > > > > > > I think it can be easily addressed. Generally the triage process
>> > > > > > > > does not touch "Ready for maintainer review" PRs, unless they
>> > > > > > > > start failing (conflicts, rebases etc.), in which case the "ready
>> > > > > > > > for maintainer review" label is removed. But the fix is simple:
>> > > > > > > > the label should not be removed if a discussion has started on the
>> > > > > > > > merit of that PR - not on mechanical failures.
>> > > > > > > >
>> > > > > > > > Fix here:
>> > > > > > > >
>> > > > > > > > https://github.com/apache/airflow-steward/pull/232
>> > > > > > > >
>> > > > > > > > We will review it in "Magpie", merge it, and upgrade
>> > > > > > > > to the latest version before the next triage.
>> > > > > > > >
>> > > > > > > > J.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Tue, May 19, 2026 at 11:19 AM Jarek Potiuk <[email protected]> wrote:
>> > > > > > > >
>> > > > > > > > > Hi all,
>> > > > > > > > >
>> > > > > > > > > I have completed two PR triage sessions using the latest version
>> > > > > > > > > of "Magpie," which includes improved stats and charts: PR Stats
>> > > > > > > > > Dashboard (
>> > > > > > > > > https://htmlpreview.github.io/?https://gist.githubusercontent.com/potiuk/d593b7773847e5d2f8638ad59d355842/raw/7125cc996a05e135e93dc26012816b83db1fad51/pr-stats-dashboard.html
>> > > > > > > > > ).
>> > > > > > > > >
>> > > > > > > > > Observations:
>> > > > > > > > >
>> > > > > > > > > - AI Triage: The process is effective; "drive-by" PRs have
>> > > > > > > > > decreased, and we now see a ~50% author response rate.
>> > > > > > > > > Open/closed PR volume has stabilized at approximately 40 per day.
>> > > > > > > > > - Review Queue: We have 154 "ready for review" PRs, over half of
>> > > > > > > > > which have no maintainer comments. This queue is growing quickly
>> > > > > > > > > despite automated "unlabeling" of PRs with conflicts or failing
>> > > > > > > > > tests.
>> > > > > > > > > - Gaps: The "providers" and "task-sdk" areas lack the most
>> > > > > > > > > coverage.
>> > > > > > > > >
>> > > > > > > > > Takeaways & Discussion Points:
>> > > > > > > > >
>> > > > > > > > > 1. AI triage successfully filters low-quality PRs, but we need
>> > > > > > > > > more maintainers to conduct periodic reviews in their specific
>> > > > > > > > > areas.
>> > > > > > > > > 2. Reviews can be done manually via the "ready for review" label
>> > > > > > > > > or assisted by the agent using /setup-steward and
>> > > > > > > > > /pr-management-code-review.
>> > > > > > > > > 3. We need to revamp CODEOWNERS to clarify whether listing
>> > > > > > > > > implies observation or a commitment to review and to cover
>> > > > > > > > > unassigned areas.
>> > > > > > > > >
>> > > > > > > > > I look forward to your thoughts on how we can improve these
>> > > > > > > > > processes.
>> > > > > > > > >
>> > > > > > > > > Thanks,
>> > > > > > > > > Jarek Potiuk
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
