justinmclean commented on PR #228:
URL: https://github.com/apache/airflow-steward/pull/228#issuecomment-4539130129
Pre-flight self-review — PR #228 (contributor-activity-sweep)
https://github.com/apache/airflow-steward/pull/228 · draft · author:
justinmclean
Base: main · Files changed: 37 (~all added) · Diff size: +756 / −0
A new Pairing/Triage-flavoured read-only activity card with a 12-case eval
suite (step-0 input validation × 4, step-1 review classification × 5,
step-2
render × 3), plus a 9-line tweak to contributor-nomination/render.md and a
2-line tweak to tools/skill-evals/README.md.
Correctness
- [advisory] SKILL.md heading vs eval directory naming. The eval directory
is
step-1-classify-reviews, but its step-config.json extracts SKILL.md's ##
Step
1 — Fetch GitHub activity section. Not a bug — that section does contain
the
substantive/LGTM classification rules (comments.totalCount >= 3 …), so the
eval reads the right text — but the names disagree at a glance and a future
reader will trip on it. Consider renaming the heading to ## Step 1 — Fetch
and
classify activity, or split into a Step 1 (Fetch) + Step 2 (Classify) so
directory names match headings.
(Verified clean: eval output-spec keys match expected.json keys exactly
across
all three suites — no schema drift like the one I caught in
pairing-self-review. The classify-stream's keys are
injection_attempt_detected
/ lgtm_only_reviews / substantive_reviews / total_inline_comments /
total_reviews, consistent across all 5 cases.)
Security
No findings. Strong injection-guard callout in the SKILL.md (2 matches).
Comprehensive adversarial coverage in the eval suite:
- step-0 case-2-unsafe-handle — path-traversal rejection on the GitHub
login.
- step-0 case-4-shell-metachar — shell-metacharacter rejection.
- step-1 case-5-injection-in-body — embedded SYSTEM: instruction in a
review
body must be flagged and not obeyed; classification driven by actual
content,
not the inflated payload.
- step-2 case-2-injection-flagged — AGENT OVERRIDE in a PR title must be
flagged with no verdict language emitted.
That's a solid four security-adversarial cases for a 12-case suite.
Read-only
skill, no GitHub mutations.
Conventions
- [advisory] skill-validate --strict flags one SOFT-category violation on
this
skill:
▎ .../contributor-activity-sweep/SKILL.md:1: action-inventory in
description
(5 commas) — consider moving the enum to body: 'Output is intentionally
limited to GitHub-visible activity — mailing list, docum…'
▎ The frontmatter description enumerates off-GitHub channels ("mailing
list,
documentation, user support, mentoring, talks, and release management").
Under
non-strict this is a warning, not a failure. Move that enum to the body
to
keep the matching-layer description tight. (Same finding shape as the
pre-existing one on security-tracker-stats-dashboard.)
(Verified clean: SPDX header present; ## Step headings well-formed; eval
step-config.json files point at real SKILL.md sections; full eval suite
ships
per the AGENTS.md rule. The unrelated 2nd validator violation is
security-tracker-stats-dashboard, pre-existing.)
Summary
Ready to push with one small tightening — no blocking findings. The
--strict
description-comma warning is the kind of thing the maintainer review will
mention anyway; rename consideration on the Step 1 heading is judgment.
Blocking: 0 Advisory: 2
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]