[PR] feat(agent-guard): deterministic PreToolUse guard dispatcher + skill-contributable guards [airflow-steward]

via GitHub Thu, 11 Jun 2026 06:32:50 -0700


potiuk opened a new pull request, #494:
URL: https://github.com/apache/airflow-steward/pull/494


   ## Summary
   
   - Adds `tools/agent-guard`: a stdlib-only Claude Code **`PreToolUse`** hook 
that inspects every `Bash` command *before it runs* and **denies** the ones 
that would break a hard framework rule — protections that must not depend on 
the model remembering a `SKILL.md` instruction.
   - Five bundled guards: **mention** (the denoise rule from #491 — never 
`@`-ping a non-author in an author-directed comment; never `@`-mention anyone 
in a `gh pr edit --body` fold), **commit-trailer** (no `Co-Authored-By:`; use 
`Generated-by:`), **mark-ready** (Golden rule 1b — no `ready for maintainer 
review` label while CI awaits approval), **security-language** (no CVE/security 
wording in a public `gh pr create|edit` title/body), **empty-rebase** (no 
force-push of a 0-commit branch → would auto-close the PR).
   - **Extensible, wired once:** guards are discovered at runtime from 
`guards.d/` (+ `$STEWARD_GUARD_DIRS`) via a `GuardContext` API, so any skill 
contributes a guard by dropping one import-free `guard(ctx)` file — **no 
`settings.json` change**. Ships an example (`no_verify_commit.py`) + discovery 
tests.
   - Each guard is overridable per command (`STEWARD_ALLOW_*`); 
`STEWARD_GUARD_OFF=1` disables all guards; commands other than `gh`/`git` take 
a fast path (~22 ms/call).
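
For readers unfamiliar with the hook flow, the dispatcher described above can be sketched roughly as follows. This is an illustration, not the code in `tools/agent-guard`: the `decide`/`run` helpers, the string-in/reason-out guard signature, and the trigger regex are assumptions, and the deny JSON shape follows the Claude Code hooks documentation as I understand it.

```python
import json
import os
import re
from typing import Callable, Mapping, Optional

# Hypothetical guard signature: return a reason string to deny, None to allow.
Guard = Callable[[str], Optional[str]]

# Cheap trigger: only commands that mention gh or git are inspected at all.
_TRIGGER = re.compile(r"\b(?:gh|git)\b")


def commit_trailer_guard(command: str) -> Optional[str]:
    """Deny git commits that carry a Co-Authored-By trailer."""
    if "git commit" in command and "co-authored-by:" in command.lower():
        return "Use a Generated-by: trailer instead of Co-Authored-By:."
    return None


def decide(
    payload: dict, guards: list[Guard], env: Mapping[str, str] = os.environ
) -> Optional[dict]:
    """Return a deny decision for the hook payload, or None to let it run."""
    if env.get("STEWARD_GUARD_OFF") == "1":
        return None  # dispatcher disabled wholesale
    if payload.get("tool_name") != "Bash":
        return None
    command = payload.get("tool_input", {}).get("command", "")
    if not _TRIGGER.search(command):
        return None  # fast path for commands that are not gh/git
    for guard in guards:
        try:
            reason = guard(command)
        except Exception:
            continue  # a broken guard fails open
        if reason:
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": reason,
                }
            }
    return None


def run(stdin_text: str, guards: list[Guard], env: Mapping[str, str] = os.environ) -> str:
    """Map the hook's stdin JSON to its stdout text (empty string means allow)."""
    decision = decide(json.loads(stdin_text), guards, env)
    return "" if decision is None else json.dumps(decision)
```

A real hook entry point would presumably print `run(sys.stdin.read(), guards)` and exit 0, which matches the "plain command, no output" behaviour listed in the test plan.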
   
   ## Type of change
   
   - [x] Python package (`tools/*/` with `pyproject.toml`)
   - [x] Skill change — setup skills wired (no behavioural eval fixtures; these 
are wiring-prose changes, see Notes)
   - [x] Cross-cutting (AGENTS.md, sandbox/secure-setup docs)
   - [x] Documentation (`docs/setup/secure-agent-setup.md`, 
`docs/labels-and-capabilities.md`)
   
   ## Test plan
   
   - [x] `uv run --project tools/agent-guard pytest` — 54 tests (per-guard 
allow/deny, email-not-a-mention, code-span stripping, body-file, author 
fail-closed, every override, discovery from `guards.d` + env, broken-guard 
fails open, `main()` stdin contract)
   - [x] ruff / ruff-format / mypy clean; `prek` workspace suite green on commit
   - [x] `skill-and-tool-validate` + `check-workspace-members` green (new tool: 
README + capability row + workspace member)
   - [x] Live stdin smoke: deny JSON emitted for each guard; plain command → no 
output, exit 0
   - [ ] No eval fixtures for the setup-skill edits — they are wiring prose, 
not LLM-routed behaviour (the guard logic is covered by pytest, not skill-evals)
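
The discovery cases in the test plan (guards from `guards.d` plus the environment, and a broken guard failing open) could be exercised against a loader shaped roughly like the one below. It is a sketch under assumptions: `discover_guards` is a hypothetical name, and splitting `$STEWARD_GUARD_DIRS` on `os.pathsep` is a guess at the format, not something this PR states.

```python
import importlib.util
import os
from pathlib import Path
from typing import Callable, Mapping


def discover_guards(
    builtin_dir: Path, env: Mapping[str, str] = os.environ
) -> list[Callable]:
    """Collect guard(ctx) callables from builtin_dir plus any extra directories."""
    extra = env.get("STEWARD_GUARD_DIRS", "")
    # Assumption: extra directories are separated like PATH entries.
    dirs = [builtin_dir] + [Path(p) for p in extra.split(os.pathsep) if p]
    guards: list[Callable] = []
    for directory in dirs:
        if not directory.is_dir():
            continue
        for path in sorted(directory.glob("*.py")):
            spec = importlib.util.spec_from_file_location(
                f"steward_guard_{path.stem}", path
            )
            if spec is None or spec.loader is None:
                continue
            module = importlib.util.module_from_spec(spec)
            try:
                spec.loader.exec_module(module)
            except Exception:
                continue  # a guard file that fails to load is skipped (fails open)
            guard = getattr(module, "guard", None)
            if callable(guard):
                guards.append(guard)
    return guards
```

Loading each file by path rather than by package import is what would let a skill contribute a guard without touching any shared configuration.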
   
   ## RFC-AI-0004 compliance
   
   - [x] **HITL** — guards never mutate; they block-with-reason and surface a 
per-command override, leaving the decision to the human
   - [x] **Sandbox** — no new host access; the hook only inspects commands the 
agent could already run, and shells out to `gh`/`git` only after a cheap 
trigger match
   - [x] **Vendor neutrality** — guard code carries no project names; the 
ready-label string is `$STEWARD_READY_LABEL`-configurable
   - [x] **Write-access discipline** — this change *strengthens* the principle: 
it deterministically blocks autonomous maintainer pings and premature 
ready-labelling
   - [x] **Conversational + correctable** — every guard is overridable inline; 
the dispatcher is disableable wholesale
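
One way an inline, per-command override could be recognised is by reading the `VAR=value` assignments that prefix the command string. The sketch below assumes that mechanism; the PR text only says overrides are inline, so `inline_env`, `is_overridden`, and the flag name `STEWARD_ALLOW_MENTION` are illustrative, not the tool's API.

```python
import re
import shlex

# A shell-style NAME=value assignment token.
_ASSIGN = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$", re.DOTALL)


def inline_env(command: str) -> dict[str, str]:
    """Return the VAR=value assignments that prefix a shell command."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return {}  # unbalanced quotes: treat as carrying no overrides
    env: dict[str, str] = {}
    for token in tokens:
        match = _ASSIGN.match(token)
        if match is None:
            break  # first non-assignment token starts the actual command
        env[match.group(1)] = match.group(2)
    return env


def is_overridden(command: str, flag: str) -> bool:
    """True when the command itself sets flag=1 as a leading assignment."""
    return inline_env(command).get(flag) == "1"
```

Only leading assignments count here, so a flag that merely appears inside a quoted `--body` argument does not switch a guard off.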
   
   ## Linked issues
   
   <!-- none — follows the denoise discussion (dev@, #491) -->
   
   ## Notes for reviewers
   
   - **`security-language` is deliberately the narrowest guard** — scoped to 
`gh pr create|edit` title/body only, NOT comments, so it does not collide with 
the triage `security_language_signal` warning comment that intentionally quotes 
matched text. It carries the highest false-positive risk, hence the narrow 
scope and the `STEWARD_ALLOW_SECURITY_LANG=1` override.
   - **`mention` resolves the PR/issue author via one `gh api` call**, only 
when a comment actually contains an `@`-mention; fails *closed* (deny) if the 
author can't be verified, with the override as the escape hatch.
   - **`empty-rebase` and `mark-ready` fail *open*** (allow) when their git/API 
lookups can't resolve — they never block a legitimate command on a transient 
error.
   - **`.claude/settings.json` wiring is assistant-proposes / human-applies** — 
that file is agent-edit-denied, so the setup skills surface the one-time 
`hooks.PreToolUse` snippet rather than writing it. Dogfooding it in this repo 
is a manual one-line change, documented in this PR's docs.
   - **Setup-skill edits aren't eval-tested** — they're wiring instructions; 
the enforcement logic lives in `tools/agent-guard` with its own pytest suite.
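
The `mention` behaviour described in these notes (skip code spans, do not treat email addresses as mentions, fail closed when the author is unknown) can be illustrated with a small sketch. The regexes and the `find_mentions`/`check_mentions` helpers are assumptions for illustration; the actual guard resolves the author through `gh api`, which is represented here by a plain `author` argument.

```python
import re
from typing import Optional

# Strip fenced blocks first, then inline code spans, so quoted handles are ignored.
_FENCE = re.compile(r"```.*?```", re.DOTALL)
_INLINE = re.compile(r"`[^`\n]*`")
# An @handle not preceded by a character that would make it part of an email.
_MENTION = re.compile(r"(?<![\w.\-+])@([A-Za-z0-9][A-Za-z0-9-]{0,38})")


def find_mentions(body: str) -> set[str]:
    """Return the lower-cased handles that would actually ping someone."""
    text = _INLINE.sub("", _FENCE.sub("", body))
    return {handle.lower() for handle in _MENTION.findall(text)}


def check_mentions(body: str, author: Optional[str]) -> Optional[str]:
    """Return a deny reason, or None when the comment may be posted."""
    mentions = find_mentions(body)
    if not mentions:
        return None  # nothing to verify, so no author lookup is needed
    if author is None:
        # Fail closed: the author could not be verified.
        return "cannot verify the author; refusing to post @-mentions"
    others = mentions - {author.lower()}
    if others:
        return "comment pings non-author(s): " + ", ".join(sorted(others))
    return None
```

Passing `author=None` stands in for a failed lookup; the fail-open guards (`empty-rebase`, `mark-ready`) would return `None` in the equivalent branch instead of a reason.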
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
