(airflow-steward) branch main updated: feat(pr-management-triage): add F6 pre-filter for maintainer co-drafted draft PRs (#90)

potiuk Thu, 07 May 2026 10:43:17 -0700

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-steward.git



The following commit(s) were added to refs/heads/main by this push:
     new b454e9b  feat(pr-management-triage): add F6 pre-filter for maintainer 
co-drafted draft PRs (#90)
b454e9b is described below

commit b454e9b254bc8acb69f4ebd3b2d009bd0a79f559
Author: André Ahlert <[email protected]>
AuthorDate: Thu May 7 14:42:46 2026 -0300

    feat(pr-management-triage): add F6 pre-filter for maintainer co-drafted 
draft PRs (#90)
    
    * feat(pr-management-triage): add F6 pre-filter for maintainer co-drafted 
PRs
    
    When a maintainer other than the viewer has substantively engaged with
    a draft PR via review or substantive comment and the author has not
    yet pushed in response, the classifier was duplicating the maintainer's
    work by proposing the same `draft` action a second time. F5a expires
    after 72 hours and only fires when the most recent comment is by a
    collaborator; F5b targets maintainer-to-maintainer pings; neither
    covers the "maintainer drafted with a critique" shape.
    
    F6 closes that gap. The signal is anchored at
    `commits(last:1).committedDate` so a new author push reopens the PR
    for re-evaluation, instead of relying on a wall-clock TTL that would
    re-introduce the duplicate-proposal problem on long-lived drafts.
    Drafts only — promotion to ready-for-review flips the semantics and
    F4 already covers the no-regression case there.
    
    Push-by-maintainer and convert_to_draft timeline-event signals are
    called out as follow-ups in `rationale.md`; both require new GraphQL
    fields and the historical motivating examples already surface via the
    review-or-comment signals F6 inspects.
    
    Closes apache/airflow-steward#79
    
    * fix(pr-management-triage): address F6 review feedback
    
    Three fixes from the code review on the F6 pre-filter:
    
    1. Stale-sweep carve-out. F6 now mirrors F3's explicit guarantee
       that stale-sweep classifications can still pull a PR back in;
       without this, an abandoned draft where the maintainer left a
       critique and the author never returned would be invisible to
       Sweep 1a/1b forever.
    
    2. Empty-body review guard. A review submitted with `state =
       COMMENTED` and an empty top-level body is the GitHub UI's way
       of attaching only inline thread comments; that signal is
       already handled by row 14/15 (unresolved threads) and should
       not also trip F6.
    
    3. Required GraphQL fields table now lists F6 alongside F5a/F5b,
       so a future editor cross-checking the table when adding a
       pre-filter sees the dependency.
    
    Also adds an empirical-validation section to rationale.md that
    walks the three motivating PRs from #79 against the F6 predicate.
    Two of the three (`<upstream>#63260`, `<upstream>#64906`) fire on
    the comment branch; the third (`<upstream>#58149`) does not,
    because the only post-commit collaborator engagement is the
    viewer's own comments — which is `already_triaged` territory
    (rows 3-5), not F6's. Documenting this honestly avoids over-
    claiming and points the next contributor at the right gap.
---
 .../pr-management-triage/classify-and-act.md       |  14 ++-
 .claude/skills/pr-management-triage/rationale.md   | 112 +++++++++++++++++++++
 2 files changed, 121 insertions(+), 5 deletions(-)

diff --git a/.claude/skills/pr-management-triage/classify-and-act.md 
b/.claude/skills/pr-management-triage/classify-and-act.md
index e6a4cb3..12a8499 100644
--- a/.claude/skills/pr-management-triage/classify-and-act.md
+++ b/.claude/skills/pr-management-triage/classify-and-act.md
@@ -47,11 +47,15 @@ filter is skipped silently from the main triage flow.
 | F4 | Already marked ready, no regression | `labels` contains `ready for 
maintainer review` AND CI green AND `mergeable != CONFLICTING` AND no 
unresolved threads. Regression (any of: CI red, new conflict, new unresolved 
thread *after* the label was applied) bypasses this filter — surface as a 
regressed-passing entry so the maintainer can decide whether to pull the label. 
|
 | F5a | Recent collaborator comment (author cooldown) | Most recent comment is 
by a `COLLABORATOR`/`MEMBER`/`OWNER`, `createdAt < 72h` ago, AND posted after 
`commits(last:1).committedDate`. |
 | F5b | Maintainer-to-maintainer ping unanswered | Most recent collaborator 
comment `@`-mentions one or more logins other than the PR author AND none of 
those mentioned logins have posted on the PR or in `latestReviews` after that 
comment. Team mentions (e.g. `@<upstream>-committers`) are conservatively 
treated as F5b matches. |
+| F6 | Maintainer co-drafted | `isDraft == true` AND any of: (a) 
`latestReviews` has a node with `authorAssociation ∈ {OWNER, MEMBER, 
COLLABORATOR}` AND `author.login ≠ <viewer>` AND `state ∈ {COMMENTED, 
CHANGES_REQUESTED, APPROVED}` AND `submittedAt > commits(last:1).committedDate` 
AND review body is non-empty (avoids the "review with only inline thread 
comments and an empty top-level body" false positive — those are already 
counted by row 14/15 unresolved-thread logic); (b) `comments(l [...]
 
-F5a and F5b override every signal in the decision table — they
-are not weighed against conflicts, failing CI, or unresolved
-threads. See 
[`rationale.md#pre-filter-5-active-maintainer-conversation`](rationale.md#pre-filter-5-active-maintainer-conversation)
 for the
-why.
+F5a, F5b, and F6 override every signal in the decision table —
+they are not weighed against conflicts, failing CI, or unresolved
+threads. See
+[`rationale.md#pre-filter-5-active-maintainer-conversation`](rationale.md#pre-filter-5-active-maintainer-conversation)
+and
+[`rationale.md#pre-filter-6-maintainer-co-drafted`](rationale.md#pre-filter-6-maintainer-co-drafted)
+for the why.
 
 ---
 
@@ -339,7 +343,7 @@ applies — rows do not get to reach back for more data.
 
 | Decision rows / preconditions | Required fields |
 |---|---|
-| F5a, F5b, grace periods | 
`comments(last:10).nodes.{author.login,authorAssociation,bodyText,createdAt}`, 
`latestReviews.nodes.{author.login,submittedAt}`, 
`commits(last:1).nodes.commit.committedDate` |
+| F5a, F5b, F6, grace periods | 
`comments(last:10).nodes.{author.login,authorAssociation,bodyText,createdAt}`, 
`latestReviews.nodes.{state,author.login,authorAssociation,submittedAt}`, 
`commits(last:1).nodes.commit.committedDate`, viewer login |
 | Row 1 + Real-CI guard | `statusCheckRollup.state`, 
`statusCheckRollup.contexts`, `authorAssociation`, `head_sha` (REST 
`action_required` index keyed by `head_sha`) |
 | `copilot_review_stale` (row 2) | 
`reviewThreads.nodes.{isResolved,comments.nodes.{author.login,createdAt,url}}`, 
`comments(last:10).nodes.{author.login,createdAt}` |
 | `has_deterministic_signal`, `ci_failures_only`, `unresolved_threads_only`, 
`unresolved_threads_only_likely_addressed` (rows 8–17) | `mergeable`, 
`statusCheckRollup.{state,contexts}`, 
`reviewThreads.nodes.{isResolved,comments(first:5).nodes.{author.login,authorAssociation,createdAt}}`,
 `updatedAt`, 
`comments(last:10).nodes.{author.login,authorAssociation,createdAt}`, 
`commits(last:1).nodes.commit.committedDate`, `author.login` |
diff --git a/.claude/skills/pr-management-triage/rationale.md 
b/.claude/skills/pr-management-triage/rationale.md
index d91eb53..ba1ce21 100644
--- a/.claude/skills/pr-management-triage/rationale.md
+++ b/.claude/skills/pr-management-triage/rationale.md
@@ -75,6 +75,118 @@ negative cost (talking over a real maintainer call-out).
 
 ---
 
+## Pre-filter 6 (maintainer co-drafted)
+
+F6 covers a different shape of "do not talk over a maintainer"
+than F5a/F5b: a draft PR that a maintainer (other than the
+viewer running the skill) has already substantively engaged
+with — typically by leaving a review with `CHANGES_REQUESTED`,
+posting a substantive comment outlining what is wrong, or both —
+and where the author has not yet responded with a new commit.
+
+The motivating cases are PRs the maintainer would already
+demote-and-comment on by hand and which the classifier, running
+later, would propose the same `draft` action for a second time.
+A few historical examples on `<upstream>` (each manually skipped
+to avoid a duplicate proposal):
+
+- `potiuk-drafted` `<upstream>#58149`
+- `vargacypher-drafted` `<upstream>#63260`
+- `bugraoz93-drafted` `<upstream>#64906`
+
+In each case the classifier wanted to emit a `deterministic_flag
+→ draft` proposal whose substance a different maintainer had
+already covered on the draft. F5a does not cover this: F5a
+expires after 72 hours and only fires when the **most recent**
+comment is by a collaborator. F5b does not cover this either:
+the maintainer engagement here is directed at the author, not at
+other maintainers. F6 closes the gap.
+
+### Empirical check against the motivating examples
+
+Snapshotting the three PRs at the time of the issue (`<now>` =
+2026-05-07):
+
+| PR | Last author commit | Qualifying maintainer engagement after that anchor 
| F6 fires (viewer = potiuk)? |
+|---|---|---|---|
+| `<upstream>#63260` | 2026-03-22 (vargacypher) | `kaxil` comment 2026-04-15, 
body length 80 chars | yes — comment branch |
+| `<upstream>#64906` | 2026-04-11 (stephen-bracken) | `bugraoz93` comment 
2026-04-20, body length 102 chars | yes — comment branch |
+| `<upstream>#58149` | 2026-01-23 (Philip Abernethy) | only `potiuk`'s own 
triage / closure comments | **no** — viewer-self engagement is excluded |
+
+`#58149` is therefore not addressed by F6 as written. It is a
+related-but-distinct concern: the viewer's *own* prior
+engagement on a draft is conceptually
+[`already_triaged`](classify-and-act.md#decision-table)
+territory (rows 3–5), and the duplicate-proposal symptom there
+is "viewer left a free-form drafting comment that doesn't carry
+the triage marker, so rows 3–5 don't recognise it." Closing
+that gap belongs in the marker-detection logic of rows 3–5 (or
+in the stale-sweep duplicate-suppression), not in F6 — F6's
+contract is "do not talk over a *different* maintainer." The
+issue tracking the marker-detection extension is
+[#79's first follow-up 
bullet](https://github.com/apache/airflow-steward/issues/79).
+
+### Why drafts only, not ready-for-review
+
+The F6 signal is "a maintainer is co-drafting this PR" — a
+collaborative state that only exists while the PR is `isDraft ==
+true`. Once a PR is promoted to ready-for-review the surrounding
+semantics flip: F4 already exempts ready PRs without regression,
+and a regression on a ready PR is exactly the moment we *want*
+the classifier to surface a signal to the maintainer queue, not
+suppress it. Limiting F6 to drafts keeps that signal path
+intact.
+
+### Why "after the last author commit", not a wall-clock TTL
+
+A wall-clock expiry (e.g. "ignore engagements older than 30
+days") would re-introduce the same duplicate-proposal problem
+F6 is meant to avoid: an old draft that a maintainer engaged
+with months ago, which the author has not pushed to since, is
+still in the same conversational state — re-proposing `draft`
+on it adds nothing. The right invalidation signal is **author
+activity**, not time: when the author pushes a new commit, the
+maintainer's prior critique may or may not be addressed and the
+classifier should re-evaluate. Anchoring F6 at
+`commits(last:1).committedDate` matches F5a's anchor and lets
+the eventual-resurfacing job stay where it belongs — the stale-
+sweep flow in [`stale-sweeps.md`](stale-sweeps.md), which acts
+on a different action (`stale_draft` → close), not on a
+duplicate `draft` proposal.
+
+### Why ≥ 80 chars on comments
+
+The threshold filters out emoji reactions, `+1`, `lgtm`, and
+single-sentence acknowledgements while letting through the
+typical maintainer critique ("Two things — the migration here
+needs a downgrade path, and the new helper duplicates X in
+`utils.py`. Converting to draft until those are addressed.").
+80 is empirical, not load-bearing; tune with feedback if real
+PRs slip through. Reviews are not gated on length because a
+review submission is itself a stronger commitment than a free-
+form comment — a maintainer who clicks through the review UI
+to attach `CHANGES_REQUESTED` has already engaged substantively
+even if the body is terse.
+
+### What F6 does not yet cover
+
+Two further signals would strengthen F6 but require new fields
+in the batch query:
+
+- A maintainer pushing commits to the author's branch (not
+  currently exposed — `commits(last:1)` returns committed-by
+  data but not the GitHub login of the pusher in a form the
+  query consumes).
+- A `convert_to_draft` timeline event authored by a maintainer
+  (would require pulling `timelineItems(itemTypes:
+  [CONVERT_TO_DRAFT_EVENT])`).
+
+Both are tracked as follow-ups; neither blocks the MVP because
+the historical examples that motivated F6 all surface via the
+review-or-comment signals F6 already inspects.
+
+---
+
 ## Row 1 — `pending_workflow_approval`
 
 Single most sensitive category in the skill. Approving a workflow

(airflow-steward) branch main updated: feat(pr-management-triage): add F6 pre-filter for maintainer co-drafted draft PRs (#90)

Reply via email to