potiuk opened a new pull request, #344:
URL: https://github.com/apache/airflow-steward/pull/344

   ## Summary
   
   Four small, independent classifier fixes calibrated from a real triage 
session on `apache/airflow` (2026-05-27). Each was a case where the rule fired 
technically correctly but the maintainer manually overrode the proposed action 
because the rule didn't match real-world signal. Each commit is independently 
revertible.
   
   ## Changes
   
   | Commit | Change | Why |
   |---|---|---|
   | `9afd558` | Match Copilot bots without requiring `[bot]` suffix in login | GitHub's GraphQL `Actor.login` returns `copilot-pull-request-reviewer` without the `[bot]` suffix; strict suffix matching excluded real Copilot threads from row 2 (`stale_copilot_review`). Observed on a PR with 55-day-old unresolved Copilot threads that classified as `comment` instead of `draft`. |
   | `91b5ead` | Insert row 12b: mixed static-check + non-static failures → `comment`, not `rerun` | Row 12 fired only when ALL failed checks were static; row 13 then proposed `rerun` even when one of two failures was a static check. Rerunning a static-check failure is wasted effort; it needs a code fix. Observed on a PR with `Static checks: FAILURE` + `Compat 3.0.6: FAILURE` (one static + one possibly-flaky) that classified as `rerun`. |
   | `878ab1f` | Widen F5b to walk the last 5 collaborator comments, not just the latest | F5b only inspected the most recent collaborator comment for unanswered `@`-mentions. It missed the case where maintainer A pings B in comment N, the viewer posts a quality comment in N+1 (orthogonal), and B's reply is still pending. Observed on a PR where vincbeck pinged bbovenzi 2d ago, the viewer drafted yesterday for quality issues, and the classifier proposed `request-author-confirmation` today, talking over the older ping. |
   | `b284aff` | Add row 0 `first_time_stale_abandoned` → `skip` | Row 1 (`approve-workflow`) fired for first-time-contributor PRs with no push for 2+ months since prior triage. Re-approving CI on stalled code re-fails on the same unaddressed quality issues. New row 0 (FIRST_TIMER + viewer triage marker + no commits since marker + ≥30d) routes to `skip`; stale-sweep retires it. |
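
The first two fixes can be sketched roughly as below. This is a minimal illustration, not the classifier's actual code: the function names, the `STATIC_CHECK_NAMES` set, and the choice to return a row label are assumptions made for the example. Only the Copilot login and the row numbers come from the table above. Row 12's proposed action is not stated in this PR, so the sketch returns row labels rather than actions.

```python
from typing import Iterable, Optional

# Login taken from the PR description; the classifier's real list may be longer.
COPILOT_LOGINS = {"copilot-pull-request-reviewer"}

# Hypothetical: names of checks that count as static checks.
STATIC_CHECK_NAMES = {"Static checks"}


def is_copilot_login(login: str) -> bool:
    """Match a Copilot reviewer login with or without the `[bot]` suffix."""
    normalized = login.strip().lower().removesuffix("[bot]")
    return normalized in COPILOT_LOGINS


def failed_check_row(failed_checks: Iterable[str]) -> Optional[str]:
    """Pick the decision-table row for a set of failed check names.

    Returns "12" when every failure is a static check, "12b" when static
    and non-static failures are mixed, "13" when no failure is static,
    and None when nothing failed.
    """
    failed = list(failed_checks)
    if not failed:
        return None
    static = [name for name in failed if name in STATIC_CHECK_NAMES]
    if len(static) == len(failed):
        return "12"
    if static:
        return "12b"
    return "13"
```

With these assumptions, `failed_check_row(["Static checks", "Compat 3.0.6"])` lands on row 12b (`comment`) instead of falling through to row 13 (`rerun`), and both `copilot-pull-request-reviewer` and `copilot-pull-request-reviewer[bot]` count as Copilot logins.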
   
   ## Origin
   
   Came out of the same triage session on `apache/airflow` (2026-05-27) that 
produced PR #343 (session-history gist persistence). The session's gist 
captured each manual override and the four patterns above stood out as 
systematic classifier mis-calibrations rather than per-PR judgment calls. PR 
#343 builds the infrastructure to make this kind of cross-session signal 
repeatable.
   
   ## Test plan
   
   - [ ] Copilot match: PR with unresolved review thread by 
`copilot-pull-request-reviewer` (no `[bot]` suffix) ≥ 7d old → routes to 
`stale_copilot_review`
   - [ ] Row 12b: PR with mixed static + non-static failures → routes to 
`comment`
   - [ ] F5b deep scan: PR with maintainer A→B `@`-mention 5 comments ago, 
viewer comment in between → routes to F5b skip
   - [ ] Row 0: FIRST_TIMER with viewer triage marker, no commits since, ≥30d → 
routes to `skip`, NOT `approve-workflow`
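
The last two cases in the plan could be exercised against predicates along these lines. This is a sketch under stated assumptions: the `Comment` shape, the mention regex, both function names, and the reading that the 30 days are measured from the triage marker are all hypothetical and not taken from the classifier itself.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Simplified @-mention pattern (assumption); skips things like email addresses.
MENTION_RE = re.compile(r"(?<!\w)@([A-Za-z0-9-]+)")


@dataclass
class Comment:
    author: str
    body: str
    is_collaborator: bool


def pending_mentions(comments: List[Comment], depth: int = 5) -> List[str]:
    """Logins pinged in the last `depth` collaborator comments with no later reply.

    `comments` is expected in chronological order.
    """
    if depth < 1:
        return []
    collab_idx = [i for i, c in enumerate(comments) if c.is_collaborator][-depth:]
    pending: List[str] = []
    for i in collab_idx:
        later_authors = {c.author.lower() for c in comments[i + 1:]}
        for login in MENTION_RE.findall(comments[i].body):
            login = login.lower()
            if login == comments[i].author.lower():
                continue  # ignore self-mentions
            if login not in later_authors and login not in pending:
                pending.append(login)
    return pending


def is_first_time_stale_abandoned(
    is_first_timer: bool,
    marker_at: Optional[datetime],
    last_commit_at: datetime,
    now: datetime,
    min_age: timedelta = timedelta(days=30),
) -> bool:
    """Row 0 sketch: first-timer, triage marker present, no commits since, marker old enough."""
    if not is_first_timer or marker_at is None:
        return False
    no_commits_since_marker = last_commit_at <= marker_at
    return no_commits_since_marker and (now - marker_at) >= min_age
```

Calling `pending_mentions` with `depth=1` mimics the old F5b behaviour of looking only at the latest collaborator comment, which is how the vincbeck to bbovenzi ping was missed once the viewer had commented after it.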
   
   🤖 Generated with [Claude Code (Opus 4.7)](https://claude.com/claude-code)

