(airflow-steward) branch main updated: fix(pr-management-stats): refine is_untriaged predicate + surface draft/non-draft split (#219)

potiuk Mon, 18 May 2026 11:01:28 -0700

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-steward.git



The following commit(s) were added to refs/heads/main by this push:
     new 0d95e0a  fix(pr-management-stats): refine is_untriaged predicate + 
surface draft/non-draft split (#219)
0d95e0a is described below

commit 0d95e0a89359811ea3f3bf03dfb3780a144179a6
Author: Jarek Potiuk <[email protected]>
AuthorDate: Mon May 18 19:59:35 2026 +0200

    fix(pr-management-stats): refine is_untriaged predicate + surface 
draft/non-draft split (#219)
    
    * fix(pr-management-stats): refine `is_untriaged` predicate + surface 
draft/non-draft split
    
    When running the stats dashboard against `apache/airflow` (~475 open PRs) 
the
    "Untriaged non-drafts" hero card surfaced 266 PRs / 73 >4w, which read as a
    much larger gap than reality. Breaking the figure down by author association
    and label showed:
    
    - 26 (35%) were committer-authored — F1-skipped by `pr-management-triage`, 
not
      actionable, but counted toward the hero card.
    - 44 (60%) were already labelled `ready for maintainer review` — they had
      cleared the triage bar by an alternate path (direct reviewer promotion,
      pre-skill-era triage, etc.) but lacked the literal `Pull Request quality
      criteria` marker the classifier searches for, so they double-counted as
      "untriaged" while also appearing in the "Ready for review" hero card.
    
    The true actionable backlog was ~38 contributor PRs, not 73. The dashboard's
    recommendation rules and health rating fired on the inflated count and 
produced
    advice that didn't match the real state.
    
    This change refines the predicate the three downstream uses depend on:
    
    - `classify.md`: introduces the explicit `is_untriaged` predicate combining 
the
      existing triage-marker scan with the two missing exclusions (collaborator-
      authored and ready-labelled).
    - `aggregate.md`: adds `untriaged_nondraft` / `untriaged_old` / 
`untriaged_med`
      per-area counters that consume the refined predicate, plus `total_drafts` 
/
      `total_non_drafts` whole-repo counters for the dashboard's new 
draft/non-draft
      split. Updates the health-rating thresholds to read from the refined 
counters.
    - `render.md`: enhances the "Open PRs" hero card with a two-line sub-label —
      `<non-draft> non-draft · <draft> draft` on line 1, 
contributor/collaborator
      split on line 2 — so the maintainer can see at a glance how much of the 
queue
      is in-review vs. still being worked on. Updates the "Untriaged non-drafts"
      hero card spec to reference the refined predicate and explains why both
      exclusions matter.
    
    The pressure-score function in `classify.md#pressure-weight` already gave
    collaborator-authored PRs weight 0 and ready-labelled PRs a fixed weight of 
1,
    so the area-pressure ranking was already correct — only the global 
"untriaged"
    counters needed the predicate refinement.
    
    No GraphQL or fetch shape changes; this is a pure classification + 
documentation
    refinement.
    
    * feat(pr-management-stats): broader engagement model + per-triager weekly 
stats
    
    Builds on the previous commit. Adds the following to the spec:
    
    1. Bot exclusion in `is_untriaged` — `is_bot()` predicate filters
       dependabot/renovate/github-actions and any `*[bot]` login.
    2. New `is_engaged` predicate — broader "any maintainer touched this PR"
       that catches review-thread comments / direct discussion / label-adds the
       strict marker test misses. On `apache/airflow`'s ~457-PR human-authored
       open queue: strict marker-triaged covers ~21%; broad is_engaged covers
       ~59%. The 38-pp gap is "de-facto triaged" — engagement happening
       invisibly to the strict counter.
    3. New `is_defacto_triaged` and `is_ai_triaged` predicates — surfaced as
       separate hero-row cards (de-facto-triaged for queue-health insight,
       AI-triaged for skill-throughput accounting).
    4. Bot PRs as a separate accounting category. Bot-authored PRs follow their
       own lifecycle (Dependabot, Renovate, etc.) and shouldn't merge into the
       `contributors` count. New `bot_authored` counter surfaced on the
       expanded hero row.
    5. Two-row hero grid — the 4 original cards plus a new row of 4
       (Strict-triaged, De-facto triaged, AI-triaged, Bot PRs) so the
       maintainer can see the full breakdown at a glance.
    6. Per-triager weekly stats panel — top 15 maintainers ranked by distinct
       PRs engaged across the last 6 weeks. Each row shows a per-week
       AI/manual split so the team can see how much triage throughput goes
       through the skill vs. through direct review.
    
    The strict `is_triaged` definition remains the source of truth for the
    action-related flows (`pr-management-triage` row 3-4, sweep 1a) — only
    the stats categorisation gets the broader engagement model.
    
    No fetch shape changes — every new counter consumes data already populated
    by the existing open-PR query (comments(last:25) + review threads).
    
    Validation: applied against `apache/airflow` (457 human-authored open PRs).
    Strict-triaged=98, engaged=272, defacto_triaged=174, untriaged=85 (vs the
    266 the previous commit had reported because of the missing bot exclusion).
    
    * docs(pr-management-stats): explicit triage-tier definitions + tighten 
is_untriaged
    
    Adds a "Triage tiers — definitions" section at the top of `classify.md` that
    defines all four tiers (strict / de-facto / AI-triaged / untriaged) side-by-
    side with their predicates, implication chains, and dashboard surface area.
    
    Also calls out what the "literal marker" is — the substring `Pull Request
    quality criteria` — since the previous version of the spec referenced it
    without ever defining it explicitly. The marker is the visible link text in
    the canonical triage-comment template, scanned in comment `body` (not
    `bodyText`, to catch HTML-comment legacy form too).
    
    The key behavioural change in this commit:
    
    `is_untriaged` now uses `NOT is_engaged` instead of `NOT is_triaged`.
    The previous definition over-counted: PRs where a maintainer left detailed
    inline review feedback without including the canonical marker link were
    flagged as untriaged, even though they had been substantively engaged.
    The new predicate correctly counts only PRs with **zero** maintainer
    engagement.
    
    Concrete impact on `<upstream>`'s queue at fix time:
    - strict-untriaged (old): 72 non-drafts, 3 >4w
    - broad-untriaged (new): 47 non-drafts, 0 >4w
    - 25 PRs (35% of the strict count) drop out because they're de-facto-engaged
    
    The dashboard's "Untriaged non-drafts" hero card now surfaces the broad
    count, which is the number a maintainer can meaningfully act on. The
    de-facto-triaged hero card surfaces the 25 PRs separately as the strict-
    counter gap, so the underlying signal is still visible.
    
    Also includes a new invariant in `aggregate.md`:
    
        engaged + untriaged_nondraft + ready_for_review == non_drafts 
(contributor)
    
    The three sets now partition the contributor non-draft pool exactly.
    
    No fetch shape changes; no breaking changes for adopters.
    
    * docs(pr-management-stats): rename "Strict-triaged" hero card to "Quality 
Criteria triaged"
    
    The user-facing label was "Strict-triaged" which described the algorithm but
    not what the count actually means. "Quality Criteria triaged" matches the
    literal substring the detector scans for (`Pull Request quality criteria` —
    the link text in the canonical triage-comment template) and reads as a
    description of what happened to the PR ("a maintainer linked the quality
    criteria") rather than a description of how the classifier works
    ("strict" predicate).
    
    Renames the hero card label in `render.md` from **Strict-triaged** to
    **Quality Criteria triaged**. Updates the side-by-side tier table in
    `classify.md` and the invariants in `aggregate.md` to use the same name.
    
    The internal predicate name remains `is_triaged` — only the user-facing
    display label changes. The algorithm description in prose ("the strict
    marker-only test", "the strict-only count") stays the same; "strict" is
    still an accurate adjective for the implementation, just not the card name.
    
    No fetch shape changes, no behavioural changes.
---
 .claude/skills/pr-management-stats/aggregate.md | 115 +++++++++++++-
 .claude/skills/pr-management-stats/classify.md  | 193 ++++++++++++++++++++++++
 .claude/skills/pr-management-stats/render.md    |  98 +++++++++++-
 3 files changed, 396 insertions(+), 10 deletions(-)

diff --git a/.claude/skills/pr-management-stats/aggregate.md 
b/.claude/skills/pr-management-stats/aggregate.md
index 1f729d2..c5b3cb9 100644
--- a/.claude/skills/pr-management-stats/aggregate.md
+++ b/.claude/skills/pr-management-stats/aggregate.md
@@ -27,21 +27,42 @@ One `_AreaStats` block per area. Only two counters (`total` 
and `contributors`)
 | Field | Count rule | Scope |
 |---|---|---|
 | `total` | count of PRs in the area | **all** — reference only |
+| `total_drafts` | `isDraft == true` | **all** — denominator for the dashboard 
draft/non-draft split |
+| `total_non_drafts` | `isDraft == false` | **all** — paired with 
`total_drafts` |
 | `contributors` | `authorAssociation NOT IN (OWNER, MEMBER, COLLABORATOR)` | 
**all** — denominator for the contributor-scoped counters below |
 | `drafts` | `isDraft == true` | contributor-only |
 | `non_drafts` | `isDraft == false` | contributor-only |
 | `triaged_waiting` | classified `triaged_waiting` (see `classify.md`) | 
contributor-only |
 | `triaged_responded` | classified `triaged_responded` | contributor-only |
 | `ready_for_review` | label `ready for maintainer review` present | 
contributor-only |
+| `engaged` | satisfies 
[`is_engaged`](classify.md#is_engaged--de-facto-triaged) (any maintainer 
touched it) | contributor-only |
+| `defacto_triaged` | satisfies 
[`is_defacto_triaged`](classify.md#is_engaged--de-facto-triaged) (engaged but 
no marker) | contributor-only |
+| `ai_triaged` | satisfies 
[`is_ai_triaged`](classify.md#is_ai_triaged--ai-assisted-triage) (received an 
AI-assisted triage comment) | contributor-only |
+| `bot_authored` | 
[`is_bot`](classify.md#is_bot--author-is-a-recognised-bot)(pr.author.login) | 
**all** — its own category, NOT in `contributors` |
+| `untriaged_nondraft` | satisfies 
[`is_untriaged`](classify.md#is_untriaged--broad-untriaged) AND `isDraft == 
false` | contributor-only |
+| `untriaged_old` | `untriaged_nondraft` AND `age_bucket == ">4w"` | 
contributor-only |
+| `untriaged_med` | `untriaged_nondraft` AND `age_bucket == "1-4w"` | 
contributor-only |
 | `triager_drafted` | classified `drafted_by_triager` | contributor-only |
 | `age_buckets` | histogram, key = bucket label from `classify.md#age-bucket` 
| contributor-only |
 | `draft_age_buckets` | histogram over PRs where `drafted_at` is set, same 
bucket labels | contributor-only |
 
+The three `untriaged_*` counters share the same predicate (see
+[`classify.md#is_untriaged--broad-untriaged`](classify.md#is_untriaged--broad-untriaged))
+— a PR carrying the `ready for maintainer review` label is **not** counted as
+untriaged regardless of whether the literal triage marker is present, because
+the label itself is evidence that the PR cleared the triage bar.
+
 ### Invariants
 
+- `total == total_drafts + total_non_drafts` (every PR is exactly one)
+- `contributors + collaborator_authored + bot_authored == total` (three 
disjoint author classes)
 - `contributors == drafts + non_drafts` (each contributor PR is one or the 
other)
 - `triaged_waiting + triaged_responded <= contributors`
+- `triaged_waiting + triaged_responded <= engaged` (Quality-Criteria-triaged 
is a subset of engaged)
+- `defacto_triaged + (triaged_waiting + triaged_responded) == engaged` (every 
engaged PR is either Quality-Criteria-triaged or de-facto-only)
 - `ready_for_review <= non_drafts` (a ready PR shouldn't be draft — if the 
inequality fails, the label is stale; surface a one-line warning but don't 
correct the data)
+- `untriaged_old + untriaged_med <= untriaged_nondraft <= non_drafts`
+- `engaged + untriaged_nondraft + ready_for_review == non_drafts 
(contributor)` (the partition: every contributor non-draft is exactly one of 
`is_engaged` (strict OR de-facto), `is_untriaged` (no maintainer touched), or 
already-`ready` labelled — the three are mutually exclusive once the 
`is_untriaged` definition uses `NOT is_engaged` rather than `NOT is_triaged`)
 - `sum(age_buckets.values()) == contributors`
 - `contributors <= total`
 
@@ -258,17 +279,107 @@ This panel makes the *quality* of closures visible — the 
velocity panel says "
 
 ---
 
+## Triager activity (per-triager, per-week)
+
+The dashboard's "Triager activity" section ranks maintainers by how many
+distinct PRs each one **engaged with** in each of the last 6 calendar weeks.
+"Engaged" uses the same predicate as
+[`is_engaged`](classify.md#is_engaged--de-facto-triaged)
+(any maintainer comment / review).
+
+For each open PR currently in the fetch set, walk
+`pr.comments(last:25).nodes` and the parallel review-thread sub-comments:
+
+```text
+for each comment c by login L where authorAssociation IN
+    (OWNER, MEMBER, COLLABORATOR) AND NOT is_bot(L):
+    bucket = which_week(c.createdAt)   # see #weekly-velocity for bucket math
+    if bucket is one of the 6 windows:
+        kind = "ai" if "AI-assisted triage tool" in c.body else "manual"
+        triager_weekly[L][bucket][kind] += (first-engagement-in-bucket counter)
+```
+
+Per-week counting rule: count **at most one PR per (login, week, kind)** even
+when the same maintainer posts multiple comments in the same week of the same
+kind. The intent is "how many distinct PRs did this maintainer touch each week,
+split by AI-assisted vs manually-typed engagement" not "how many comments did
+they post"; multiple comments by the same person in the same week on the same
+PR of the same kind collapse to a single tally.
+
+### AI vs manual split
+
+Every triager's per-week count is further split into:
+
+- **AI-assisted**: comments whose body contains the AI-attribution footer
+  substring (`AI-assisted triage tool` — same detector as
+  [`is_ai_triaged`](classify.md#is_ai_triaged--ai-assisted-triage)).
+- **Manual**: comments without the footer.
+
+Same PR can contribute to *both* sub-counts for the same maintainer-week if
+they posted both an AI-drafted comment and a manually-typed comment in the
+same window. The dashboard shows the two side-by-side so the maintainer team
+can see how much of each person's throughput is coming through the skill
+versus through direct review.
+
+A maintainer's total for a week is `ai + manual`; both sub-counts use the
+per-PR de-duplication rule (one PR per kind per week).
+
+Counted as engagement:
+
+- Issue-level comments (`pr.comments(last:25)` in the standard fetch).
+- Review-thread comments — walk every
+  `pr.reviewThreads.nodes.comments(first:5)` for the same maintainer test.
+- `LabeledEvent` adding the `ready for maintainer review` label by a
+  maintainer (counts the label-add as the engagement timestamp). This catches
+  reviewers who applied the label without leaving a comment.
+
+The fetch shape in [`fetch.md`](fetch.md) already populates issue-level
+comments and review threads in the open-PR query. The label-add timestamp
+needs the `ready-label-timeline` query
+[(see `fetch.md#ready-label-timeline`)](fetch.md#ready-label-timeline) which
+the dashboard's "Ready-for-review trend by top areas" panel already fires —
+re-use that result.
+
+### Top-N rendering rule
+
+The "Triager activity" panel renders the top 15 maintainers by total PRs
+engaged across the 6-week window. The table layout (one row per triager,
+six week columns + a total column + a sparkline / mini-bars column) is
+defined in [`render.md#triager-activity-panel`](render.md). Maintainers with
+zero engagement in the window are excluded from the panel.
+
+### Caveats
+
+- Engagement is **at-most-one-PR-per-week-per-maintainer-per-PR**, not
+  comment-count. The maintainer who triages 20 PRs in one week and the
+  maintainer who comments 20 times on one PR show as 20 and 1 respectively.
+- The same maintainer can show up in multiple weeks for the same PR if they
+  engaged in each window — that is by design (re-engagement signals
+  continued attention).
+- Bot-account comments are excluded via the same
+  [`is_bot`](classify.md#is_bot--author-is-a-recognised-bot) test used in the
+  open-PR counters; copilot review-bots that post under a `CONTRIBUTOR`
+  association are not counted anyway (the maintainer check filters them).
+
+---
+
 ## Health rating
 
 Top-of-dashboard hero card. Computed as a count of fired threshold conditions:
 
 | Condition | Issue points |
 |---|---|
-| Any contributor non-draft PR untriaged AND > 4 weeks old | **2** |
-| > 30 contributor non-draft PRs untriaged AND in 1–4 weeks bucket | **1** |
+| `untriaged_old > 0` — any contributor non-draft `is_untriaged` PR > 4 weeks 
old | **2** |
+| `untriaged_med > 30` — > 30 contributor non-draft `is_untriaged` PRs in 1–4 
weeks bucket | **1** |
 | > 100 PRs labelled `ready for maintainer review` | **1** |
 | > 20 stale-triaged drafts (drafts where triage comment ≥ 7 days old AND no 
author response) | **1** |
 
+Each "untriaged" condition above uses the refined
+[`is_untriaged`](classify.md#is_untriaged--broad-untriaged) predicate —
+so a PR carrying the `ready for maintainer review` label does **not** trigger 
the
+health-rating points even if it lacks the literal `Pull Request quality 
criteria`
+marker (the label is itself evidence of triage).
+
 Sum the points and map:
 
 | Total points | Label | Colour |
diff --git a/.claude/skills/pr-management-stats/classify.md 
b/.claude/skills/pr-management-stats/classify.md
index f1164af..47be4ac 100644
--- a/.claude/skills/pr-management-stats/classify.md
+++ b/.claude/skills/pr-management-stats/classify.md
@@ -9,6 +9,159 @@ Classification is pure function of state from 
[`fetch.md`](fetch.md) — no netw
 
 ---
 
+## Triage tiers — definitions
+
+The dashboard surfaces four distinct triage-state categories side-by-side
+because the strict marker-only definition under-counts actual triage activity
+and conflates "no maintainer touched this" with "no maintainer left the
+canonical template." All four predicates apply to a single PR; they overlap
+deliberately (see the implication chains below).
+
+### `is_triaged` — *Quality-Criteria-triaged*
+
+```text
+is_triaged(pr) :=
+    EXISTS comment c IN pr.comments
+      WHERE c.authorAssociation IN (OWNER, MEMBER, COLLABORATOR)
+        AND c.body CONTAINS "Pull Request quality criteria"
+        AND (c.createdAt > head_commit.committedDate
+             OR head_commit.committedDate > c.createdAt)   # see [Triage 
marker](#triage-marker)
+```
+
+**What the literal marker is:** the **substring `Pull Request quality
+criteria`** — this is the visible link text in the canonical triage-comment
+template that every `pr-management-triage` action body carries (see
+[`pr-management-triage/comment-templates.md`](../pr-management-triage/comment-templates.md)).
+The classifier scans every comment's `body` (NOT `bodyText` — the latter
+strips HTML comments, see [Both marker forms count](#both-marker-forms-count)
+below) for the exact substring. The string is also accepted in an HTML-comment
+form left by the legacy `breeze pr auto-triage` command.
+
+The marker is a **single point of failure** — rename the link text in the
+comment template and this detector silently stops counting. Adopters can
+customise the URL the link points to via
+[`<project-config>/pr-management-triage-comment-templates.md`](../../../projects/_template/pr-management-triage-comment-templates.md)'s
+`quality_criteria_url`, but the link **text** must remain `Pull Request
+quality criteria` verbatim.
+
+### `is_engaged` — *de-facto triaged*
+
+```text
+is_engaged(pr) :=
+    EXISTS comment c IN pr.comments
+      WHERE c.authorAssociation IN (OWNER, MEMBER, COLLABORATOR)
+        AND NOT is_bot(c.author.login)
+```
+
+Plus the same predicate applied to `pr.latestReviews` (any maintainer review)
+and to `LabeledEvent` (any maintainer who added the `ready for maintainer
+review` label).
+
+The broader "a maintainer touched this PR at some point" definition. Catches
+review-thread comments, design discussions, label-adds, hand-typed feedback,
+and so on — all the engagement modes that don't include the literal marker
+substring. **`is_engaged` is a superset of `is_triaged`**: every
+Quality-Criteria-triaged PR is also engaged, but not vice-versa.
+
+The terminology in the dashboard:
+
+- **De-facto triaged** = engaged but not Quality-Criteria-triaged.
+  Formally: `is_engaged(pr) AND NOT is_triaged(pr)` — the
+  `defacto_triaged` counter aggregates this set.
+
+### `is_ai_triaged` — *AI-assisted triage*
+
+```text
+is_ai_triaged(pr) :=
+    EXISTS comment c IN pr.comments
+      WHERE c.authorAssociation IN (OWNER, MEMBER, COLLABORATOR)
+        AND c.body CONTAINS "AI-assisted triage tool"
+```
+
+A PR received at least one maintainer comment whose body contains the
+**AI-attribution footer substring** (`AI-assisted triage tool`). Every
+`pr-management-triage` comment template (draft / comment / ping /
+request-author-confirmation / close-comment etc.) ends with this footer
+verbatim — so the detector counts any PR whose triage included a skill-drafted
+comment.
+
+This predicate is **independent of** `is_triaged` and `is_engaged`:
+
+- An AI-drafted draft+comment includes the `Pull Request quality criteria`
+  link → both `is_triaged` and `is_ai_triaged` fire.
+- An AI-drafted **ping** or **request-author-confirmation** uses a different
+  template that does NOT include the criteria link → `is_engaged` and
+  `is_ai_triaged` fire but `is_triaged` does NOT.
+- A maintainer's hand-typed comment in the criteria-template form (rare —
+  this happens when a maintainer manually pastes the link text without the
+  skill) → `is_triaged` and `is_engaged` fire but `is_ai_triaged` does NOT.
+
+The implication chains:
+
+```text
+is_triaged(pr)    ⇒ is_engaged(pr)         # marker requires maintainer 
comment, which requires engagement
+is_ai_triaged(pr) ⇒ is_engaged(pr)         # AI footer requires maintainer 
comment, which requires engagement
+is_triaged(pr)    ⇎ is_ai_triaged(pr)      # independent — see cases above
+```
+
+### `is_untriaged` — *broad untriaged*
+
+```text
+is_untriaged(pr) :=
+    NOT is_engaged(pr)                                # broadest — no 
maintainer touched it
+    AND author_association NOT IN (OWNER, MEMBER, COLLABORATOR)
+    AND NOT is_bot(pr.author.login)
+    AND `ready for maintainer review` NOT IN labels(pr)
+```
+
+**Key change from earlier iterations:** uses `NOT is_engaged` (not `NOT
+is_triaged`). Combining strict-untriaged with de-facto-triaged double-counted
+PRs that maintainers had touched: a strict-only definition flags a PR as
+untriaged even when a maintainer left detailed inline review feedback, just
+because the feedback didn't include the canonical marker string. The broader
+predicate correctly counts only PRs with **zero maintainer engagement**.
+
+Concrete impact on a large `<upstream>` queue: the strict-only count gave
+72 untriaged non-drafts; tightening to `NOT is_engaged` brings it to 47
+(a 35% reduction). The 25 PRs that drop out had ≥1 maintainer comment but
+no template marker — they're de-facto triaged.
+
+The age-bucketed variants used by `aggregate.md`:
+
+```text
+is_untriaged_old(pr) := is_untriaged(pr) AND age_bucket(pr) == ">4w"
+is_untriaged_med(pr) := is_untriaged(pr) AND age_bucket(pr) IN {"1-4w"}
+```
+
+The age uses the `last_author_interaction` defined in the
+"Age bucket" section below.
+
+### Side-by-side summary
+
+| Tier | Predicate | Means | Dashboard card |
+|---|---|---|---|
+| Quality-Criteria-triaged | `is_triaged` | maintainer posted the literal 
`Pull Request quality criteria` link | **Quality Criteria triaged** (hero row 
2, blue) |
+| De-facto triaged | `is_engaged AND NOT is_triaged` | maintainer engaged but 
no marker | **De-facto triaged** (hero row 2, amber — the gap signal) |
+| AI-triaged | `is_ai_triaged` | comment with the AI-attribution footer | 
**AI-triaged** (hero row 2, purple — accounting) |
+| Engaged (overall) | `is_engaged` | union of the above two | not a card on 
its own; equals `triaged + defacto_triaged` |
+| Untriaged | `is_untriaged` | NOT engaged + contributor + non-bot + not 
ready-labelled | **Untriaged non-drafts** (hero row 1) |
+
+The number of PRs in each tier sums to a clean partition of the contributor
+non-draft pool minus the ready-labelled set:
+
+```text
+contributor_nondraft_not_ready =
+    triaged_nondraft +              # Quality-Criteria-triaged (`is_triaged`)
+    defacto_triaged_nondraft +      # engaged but no marker
+    untriaged_nondraft              # no maintainer engagement
+```
+
+The `transitional` cases (e.g. a freshly-engaged PR whose triage marker
+hasn't been posted yet) are not a separate category; they sit in
+de-facto-triaged until they pick up the marker or the ready label.
+
+---
+
 ## Triage marker
 
 A PR is *triaged* when it has at least one comment that:
@@ -134,6 +287,46 @@ The `Ready` column counts PRs carrying the `ready for 
maintainer review` label.
 
 ---
 
+## Stats-only vs action-only triage
+
+The strict `is_triaged` definition (the literal marker scan, [Triage 
marker](#triage-marker)
+above) remains the source of truth for the **action-related** flows
+(`pr-management-triage` row 3-4 detection, sweep 1a's "stale triaged drafts"
+threshold). The broader `is_engaged` definition is **stats-only** — it does
+not gate any mutation. This keeps the action-flow conservatism while letting
+the dashboard surface the fuller picture.
+
+For the configurable AI-attribution detection substring, adopters can override
+[`<project-config>/pr-management-config.md`](../../../projects/_template/pr-management-config.md)'s
+`ai_attribution_substring` field; the framework defaults to
+`AI-assisted triage tool`. The literal substring is a single point of failure
+— keep it identical between the comment templates and this detector.
+
+---
+
+## `is_bot` — author is a recognised bot
+
+```text
+is_bot(login) :=
+    login.lower() ends with "[bot]"
+    OR login.lower() IN {dependabot, renovate, github-actions}
+```
+
+Bot PRs are a **separate dashboard category** counted as `bot_authored` —
+they don't merge into `contributors` or `collaborators`, and they don't trip
+the untriaged or engaged predicates. Their lifecycle is independent: they
+follow automated update / review cycles and are reviewed-and-merged by
+maintainers without going through the triage funnel. Surfacing them in their
+own count keeps the contributor backlog signal clean.
+
+Adopters with project-specific bots not on this list — e.g. a release-bot or
+a CI-helper bot — should extend the `is_bot` match via
+[`<project-config>/pr-management-config.md`](../../../projects/_template/pr-management-config.md)'s
+`bot_logins` setting (a list of additional logins to recognise; the framework
+defaults always apply).
+
+---
+
 ## Responded before close (Table 1 only)
 
 Table 1's `Responded` column measures, per area, how many triaged PRs got an 
author reply *before* they were closed or merged. For a PR in the closed-since 
set:
diff --git a/.claude/skills/pr-management-stats/render.md 
b/.claude/skills/pr-management-stats/render.md
index da6f8e8..d58aeb4 100644
--- a/.claude/skills/pr-management-stats/render.md
+++ b/.claude/skills/pr-management-stats/render.md
@@ -33,18 +33,59 @@ Tuesday, May 6, 2026 · 14:33 UTC · viewer @potiuk · 6-week 
window since 2026-
 
 The title is plain text. Context line includes `<weekday>, <month> <day>, 
<year> · <HH:MM> UTC · viewer @<login> · 6-week window since <cutoff>`. Use the 
fetch-start `<now>`, not render-end, so a slow run remains a single-moment 
snapshot.
 
-### 2. Hero cards (4-column grid)
+### 2. Hero cards — two-row grid
 
-Four equally-sized cards, each one big number with a sub-label:
+Two rows of four cards each:
+
+#### Row 1 — backlog state
 
 | Card | Big number | Sub-label | Colour rule |
 |---|---|---|---|
 | **Repo Health** | the rating label (`✅ Healthy` / `⚠️ Needs attention` / `🔥 
Action needed`) | "based on triage backlog + queue size" | green / amber / red, 
per [`aggregate.md#health-rating`](aggregate.md#health-rating) |
-| **Open PRs (non-bot)** | total open count | `<contrib_count> from 
contributors · <collab_count> collaborator-authored` | blue (informational) |
+| **Open PRs (non-bot)** | `total_contributors + total_collaborators` | 
`<total_non_drafts> non-draft · <total_drafts> draft` (line 
1)<br>`<contrib_count> contributor · <collab_count> collaborator-authored` 
(line 2) | blue (informational) |
 | **Ready for review** | `len(ready_open)` | `<pct>% of contributor queue` | 
green |
-| **Untriaged non-drafts** | `len(untriaged_nondraft)` | `<X> are >4 weeks 
old` | red if >0 are >4w, amber if total > 30, green otherwise |
+| **Untriaged non-drafts** | `len(untriaged_nondraft)` (uses 
[`is_untriaged`](classify.md#is_untriaged--broad-untriaged)) | `<X> are >4 
weeks old` | red if >0 are >4w, amber if total > 30, green otherwise |
+
+#### Row 2 — triage coverage breakdown
 
-Card layout is responsive: 4-column on wide screens, 2-column on narrow, 
1-column on mobile-width. The big number uses 32px font; sub-labels are 12px 
dim grey.
+This row exposes the **gap between the literal Quality-Criteria marker and
+broad maintainer engagement** so a maintainer can see at a glance how much
+triage is happening that the marker-based count misses.
+
+| Card | Big number | Sub-label | Colour rule |
+|---|---|---|---|
+| **Quality Criteria triaged** | `triaged_waiting + triaged_responded` 
(literal `Pull Request quality criteria` link present in a maintainer comment) 
| `<pct>% of contributor non-drafts` | blue (informational) |
+| **De-facto triaged** | 
[`defacto_triaged`](classify.md#is_engaged--de-facto-triaged) (engaged by a 
maintainer, no Quality-Criteria marker) | `<pct>% of contributor non-drafts` | 
amber (gap signal — the bigger this is vs the Quality-Criteria card, the more 
triage is happening invisibly to the marker counter) |
+| **AI-triaged** | 
[`ai_triaged`](classify.md#is_ai_triaged--ai-assisted-triage) (received an 
AI-assisted triage comment) | `<pct>% of Quality-Criteria-triaged` | grey 
(informational) |
+| **Bot PRs** | 
[`bot_authored`](classify.md#is_bot--author-is-a-recognised-bot) | 
`<dependabot> dependabot · <other> other` | grey (separate lifecycle, surfaced 
for accounting parity) |
+
+The **De-facto triaged** card is the key new signal: on a large `<upstream>`
+queue (~457 human-authored open PRs) the Quality-Criteria count typically
+captures ~21% of PRs but the broad `is_engaged` count captures ~59% — the
+gap (~38 percentage points) is PRs that maintainers engaged with through
+review threads, direct comments, or label-adds without leaving the templated
+`Pull Request quality criteria` marker. Surfacing the gap lets the maintainer
+team see how much queue health the marker-based counter under-states.
+
+The **Bot PRs** card is a separate accounting category — bot-authored PRs
+follow their own automated lifecycle and don't merge into the
+`contributors` / `collaborators` split. Their count is informational; the
+"Untriaged" hero card never includes them (bots are excluded via
+[`is_untriaged`](classify.md#is_untriaged--broad-untriaged)).
+
+The **Open PRs** card's sub-label is a two-line breakdown: the first line 
splits
+the total by draft state, the second line splits by author class (contributor
+vs. collaborator). Both splits sum to the same total (modulo bot exclusion at
+fetch time).
+
+The **Untriaged non-drafts** card uses the refined `is_untriaged` predicate 
from
+[`classify.md`](classify.md#is_untriaged--broad-untriaged) — bots, 
collaborator-
+authored PRs, and PRs already carrying `ready for maintainer review` are NOT
+counted here.
+
+Card layout is responsive: 4-column on wide screens (8 cards total in two 
rows),
+2-column on narrow, 1-column on mobile-width. The big number uses 32px font;
+sub-labels are 12px dim grey.
 
 ### 3. What needs attention (action panel)
 
@@ -160,18 +201,59 @@ A second hero grid, same layout as the top one, showing 
the funnel-health summar
 
 This grid completes the dashboard: hero cards at the top (queue size + 
immediate red flags), recommendations next (what to do), velocity + 
opened-vs-closed (momentum), pressure by area (where), and triage funnel 
(process health).
 
+### 9b. Triager activity panel
+
+Sourced from 
[`aggregate.md#triager-activity-per-triager-per-week`](aggregate.md#triager-activity-per-triager-per-week).
+A ranked table of the top 15 maintainers by **distinct PRs engaged** across
+the last 6 calendar weeks, with **AI/manual split per week**.
+
+Layout — one row per maintainer, columns:
+
+| Column | Content |
+|---|---|
+| `Triager` | `@<login>` with link to GitHub profile |
+| `Total` | Total distinct PRs across the 6-week window (`ai + manual` 
collapsed) |
+| `AI` | Sub-total of PRs where the maintainer's engagement included the 
AI-attribution footer |
+| `Manual` | Sub-total of PRs without AI footer |
+| `W-5`…`This wk` | One column per rolling week, each cell rendered as a small 
`ai / manual` split (e.g. `12/3` = 12 AI + 3 manual) |
+| `Trend` | Inline 6-bar sparkline of the per-week totals, AI portion stacked 
over manual |
+
+Sort by `Total` descending. Bot accounts excluded via
+[`is_bot`](classify.md#is_bot--author-is-a-recognised-bot).
+
+The AI / Manual split makes one specific question visible: of each maintainer's
+triage throughput, how much is going through the `pr-management-triage` skill
+(detected via the AI-attribution footer substring) vs. through direct
+review-thread typing. Both modes are *real triage* — the maintainer approved
+every AI-drafted comment before it posted — but the split surfaces where the
+skill is providing leverage vs. where the maintainer is doing the writing
+themselves. A high AI ratio for a maintainer typically means they ran a
+batched triage sweep; a high manual ratio means they did deep-review
+conversation.
+
+Below the table, a one-line summary:
+
+```text
+6-week throughput: <N_ai> AI-assisted / <N_manual> manual / <N_total> total
+                   across <N_triagers> active maintainers
+```
+
+Empty state: when there are zero engagements in the window (a quiet 6 weeks),
+render a single low-priority card with the message "No triager activity in
+the last 6 weeks — quiet window or fetch shape missing comment data."
+
 ### 10. Detailed tables (collapsed `<details>` blocks)
 
 Two `<details>` elements, each opening into a compact area-grouped table:
 
-- **Triaged PRs — Final State since `<cutoff>`** — same structure as the 
[Table 1](#table-1--triaged-prs-final-state) section below.
-- **Triaged PRs — Still Open** — same structure as [Table 
2](#table-2--triaged-prs-still-open) below, possibly compact (drop the 8 
age-bucket columns) for screen width.
+- **Triaged PRs — Final State since `<cutoff>`** — same structure as the Table 
1 section below.
+- **Triaged PRs — Still Open** — same structure as the Table 2 section below, 
possibly compact (drop the 8 age-bucket columns) for screen width.
 
 Both tables are HTML-rendered with the same colour scheme as the dashboard. 
Maintainers who want the raw per-area numbers click to expand; the default view 
is the dashboard sections above.
 
 ### 11. Legend / methodology
 
-A bordered panel at the bottom explaining all the colours, columns, and 
computed values. Critical content because the dashboard packs a lot of distinct 
numbers into small footprints. See [`#legend`](#legend) below for the verbatim 
block.
+A bordered panel at the bottom explaining all the colours, columns, and 
computed values. Critical content because the dashboard packs a lot of distinct 
numbers into small footprints. See the Legend section below for the verbatim 
block.
 
 ---

(airflow-steward) branch main updated: fix(pr-management-stats): refine is_untriaged predicate + surface draft/non-draft split (#219)

Reply via email to