(airflow-steward) branch main updated: docs(spec): add reviewer-routing and skill-reconciler specs; prune completed plan items (#586)

potiuk Sat, 27 Jun 2026 03:15:19 -0700

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-steward.git



The following commit(s) were added to refs/heads/main by this push:
     new 7c99447  docs(spec): add reviewer-routing and skill-reconciler specs; 
prune completed plan items (#586)
7c99447 is described below

commit 7c9944770f92248a44295f42bb97fa882a20712c
Author: Justin Mclean <[email protected]>
AuthorDate: Sat Jun 27 20:15:04 2026 +1000

    docs(spec): add reviewer-routing and skill-reconciler specs; prune 
completed plan items (#586)
    
    Two proposed-status functional specs (reviewer-routing, Triage; and
    skill-reconciler, infra) plus their index entries in overview.md and
    specs/README.md. In IMPLEMENTATION_PLAN, remove the merged in-flight batch
    and the six shipped planned items (recorded under What's been built), fix
    stale references, and list the current spec work items. No eval/local-smoke
    content (that lives on the local-smoke branches).
---
 tools/spec-loop/IMPLEMENTATION_PLAN.md    | 218 +++++++++++-------------------
 tools/spec-loop/specs/README.md           |   4 +-
 tools/spec-loop/specs/overview.md         |   2 +
 tools/spec-loop/specs/reviewer-routing.md | 123 +++++++++++++++++
 tools/spec-loop/specs/skill-reconciler.md | 136 +++++++++++++++++++
 5 files changed, 346 insertions(+), 137 deletions(-)

diff --git a/tools/spec-loop/IMPLEMENTATION_PLAN.md 
b/tools/spec-loop/IMPLEMENTATION_PLAN.md
index 8c955e8..6778dd0 100644
--- a/tools/spec-loop/IMPLEMENTATION_PLAN.md
+++ b/tools/spec-loop/IMPLEMENTATION_PLAN.md
@@ -59,6 +59,29 @@ one PR** (the branch-per-feature constraint).
   setup-isolated-setup-install, setup-isolated-setup-update,
   setup-isolated-setup-verify, setup-override-upstream,
   setup-shared-config-sync).
+- **Release-management — first four skills shipped** —
+  `release-vote-draft`, `release-announce-draft`, `release-vote-tally`,
+  and `release-verify-rc` landed with eval suites (formerly planned work
+  items 1–2 plus two follow-ups). Six `release-*` skills remain; see
+  
[`specs/release-management-lifecycle.md`](specs/release-management-lifecycle.md).
+- **Triage — general-issue family filled out** — `issue-stale-sweep`,
+  `issue-deduplicate`, and `issue-backlog-stats` shipped with eval suites
+  (formerly planned work item 3 plus its deferred siblings).
+  Spec: [`specs/triage-mode.md`](specs/triage-mode.md).
+- **Mentoring — first-contribution welcome shipped** — `mentoring-welcome`
+  landed with an eval suite (formerly planned work item 4).
+  Spec: [`specs/mentoring-mode.md`](specs/mentoring-mode.md).
+- **Project-agnosticism — ASF-coupling advisory lint shipped** — the SOFT
+  ASF-coupling category landed in `tools/skill-and-tool-validator`
+  (formerly planned work item 5), and `drafting-mode.md` Known Gaps is
+  synced to the shipped drafting skills (formerly planned work item 6).
+- **Repo-health — three-skill family shipped** — `ci-runner-audit`,
+  `workflow-security-audit`, and `dependency-audit` landed (read-only,
+  `experimental`). Spec: 
[`specs/repo-health-family.md`](specs/repo-health-family.md).
+- **New proposed specs awaiting their first build item** —
+  [`specs/reviewer-routing.md`](specs/reviewer-routing.md) (Triage) and
+  [`specs/skill-reconciler.md`](specs/skill-reconciler.md) (infra) are
+  documented spec-first; their build items are below.
 
 ---
 
@@ -67,25 +90,13 @@ one PR** (the branch-per-feature constraint).
 The following items are already built on local branches or open as PRs.
 Do not duplicate them.
 
-| Branch slug | PR | Description |
-|---|---|---|
-| `injection-guard` | merged (#473) | Prompt-injection hardening on 
forwarder-relay ingest |
-| `check-headers` | #474 | License-header enforcement check in spec-validator |
-| `spec-validator-known-gaps` | #490 | Enforce Known-gaps section in every 
functional spec |
-| `spec-validate-hook` | #489 | pre-commit hook for spec-validate |
-| `skill-quality-fix` | #488 | Stabilise setup-verify eval + extend check-1 
coverage |
-| `check-eval-coverage` | #481 | SOFT eval-coverage check (check #8) |
-| `eval-quick-merge` | #480 | pr-management-quick-merge skill + evals |
-| `spec-validator-path-check` | local | Validate paths referenced in 
Validation blocks |
-| `spec-validator-spdx` | local | Enforce SPDX header on spec files |
-| `tracker-dashboard-tests` | local | pyproject + pytest suite for 
security-tracker render.py |
-| `loop-imp` | #467 | Incremental update runs from .last-sync marker |
-| `loop-cli-ux` | #472 | Explicit loop.sh argument handling |
-| `node-bump-markdownlint` | local | Node 22.13→22.20 bump for markdownlint |
-| `token-reduction` | #479 | Slim AGENTS.md into a glossary |
-| `docs-modes-sync` | #483 | Sync modes.md skill inventory |
-| `docs-mentoring-sync` | #482 | Sync mentoring spec to experimental |
-| `eval-setup-status` | #484 | Fix setup-status eval prompts |
+None in flight. The previous in-flight batch (spec-validator SPDX /
+path-existence / Known-gaps checks, the `spec-validate` pre-commit hook,
+the SOFT eval-coverage check, `pr-management-quick-merge`, the
+security-tracker dashboard pytest suite, the loop incremental-sync and
+CLI-UX changes, the markdownlint Node bump, the AGENTS.md slim, and the
+modes / mentoring / setup-status doc syncs) has all merged and is
+reflected in the code and in **What's been built** above.
 
 ---
 
@@ -94,109 +105,48 @@ Do not duplicate them.
 Priority order. Each maps to one branch and one PR. Branch names are
 slugs, not numbers (numbering implies an order the specs don't carry).
 
-1. **First release-management skill: release-vote-draft.**
-   `specs/release-management-lifecycle.md` is the only `proposed` spec
-   with zero implemented skills. The adopter contract templates
-   (`projects/_template/release-management-config.md`,
-   `release-build.md`, `pmc-roster.md`, `release-trains.md`,
-   `site-repo.md`) already exist. `release-vote-draft` is the most
-   standalone and highest-frequency PMC task: it takes RC metadata
-   (project name, version, RC number, artifact URLs) and produces a
-   VOTE email draft following ASF conventions. Include an eval suite
-   in `tools/skill-evals/evals/release-vote-draft/`.
-   Validation:
-   ```bash
-   uv run --project tools/skill-and-tool-validator --group dev 
skill-and-tool-validate
-   uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/release-vote-draft/
-   ```
-   Spec: 
[`specs/release-management-lifecycle.md`](specs/release-management-lifecycle.md).
-   Branch `release-vote-draft`.
-
-2. **Second release-management skill: release-announce-draft.**
-   Companion to `release-vote-draft`. Takes a successful vote tally
-   (binding +1 count, RC metadata) and produces the ANNOUNCE email
-   draft for the ASF announce@ and dev@ lists, following ASF posting
-   conventions (subject: `[ANNOUNCE] Apache <Project> <Version>
-   released`). Also standalone: it does not depend on
-   `release-vote-draft` being run in the same session. Include an
-   eval suite.
-   Validation:
-   ```bash
-   uv run --project tools/skill-and-tool-validator --group dev 
skill-and-tool-validate
-   uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/release-announce-draft/
-   ```
-   Spec: 
[`specs/release-management-lifecycle.md`](specs/release-management-lifecycle.md).
-   Branch `release-announce-draft`.
-
-3. **Stale-issue sweep for general triage.**
-   `specs/triage-mode.md` Known Gaps explicitly names stale-handling
-   as missing from the general-issue side (the security side covers
-   this via `security-issue-sync`). Add a new skill
-   `issue-stale-sweep` that surfaces issues with no activity past a
-   configurable threshold and proposes closure or an update request
-   (waits for maintainer confirmation before posting). Include an eval
-   suite.
+1. **First reviewer-routing skill: reviewer-routing.**
+   `specs/reviewer-routing.md` is `proposed` with zero implemented
+   skills, and review-cycle latency is one of the two priorities MISSION
+   names. Add a Triage-family skill `reviewer-routing` that takes an open
+   issue or PR and proposes a primary reviewer (and optional backup) from
+   the project's configured roster, scored on roster eligibility for the
+   touched area, git-history familiarity with the changed paths, and the
+   reviewer's current open-review load. Read-only / propose-then-confirm:
+   it never assigns or requests review. An unresolved roster yields an
+   explicit `NO ELIGIBLE REVIEWER` signal, never a fabricated handle.
+   Include an eval suite with an adversarial case asserting an injected
+   "assign to X" line in a PR body is ignored.
    Validation:
    ```bash
    uv run --project tools/skill-and-tool-validator --group dev 
skill-and-tool-validate
-   uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/issue-stale-sweep/
+   uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/reviewer-routing/
    ```
-   Spec: [`specs/triage-mode.md`](specs/triage-mode.md).
-   Branch `issue-stale-sweep`.
-
-4. **First-contribution welcome/orientation skill.**
-   `specs/mentoring-mode.md` Known Gaps names the "first-contribution
-   welcome/orientation skill" as missing. Add `mentoring-welcome`,
-   which greets first-time contributors on a newly opened issue or PR
-   with orientation context: contributing guide link, community norms,
-   expected next steps, and a pointer to the good-first-issue pool.
-   Waits for maintainer confirmation before posting. Include an eval
-   suite.
-   Validation:
-   ```bash
-   uv run --project tools/skill-and-tool-validator --group dev 
skill-and-tool-validate
-   uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/mentoring-welcome/
-   ```
-   Spec: [`specs/mentoring-mode.md`](specs/mentoring-mode.md).
-   Branch `mentoring-welcome`.
-
-5. **ASF-coupling advisory lint (fold into `skill-and-tool-validator`).**
-   `specs/project-agnosticism.md` Known Gaps names the absence of an
-   automated ASF-coupling check as its first gap. Add a new SOFT advisory
-   category to `tools/skill-and-tool-validator` that reuses the existing
-   walk, file allowlist, and inline `e.g.`/`example:` markers (the same
-   machinery as the placeholder check). It flags a curated, tiered set of
-   ASF-coupled tokens in skill bodies (high-confidence:
-   `svn (mv|commit|co)`, `[email protected]`, `dist/(dev|release)/`,
-   Vulnogram URLs; low-confidence: bare `PMC` / `ICLA` / `incubator`) and
-   tags each hit with a remedy class (placeholder / adapter /
-   capability-flag). SOFT only: surfaces on stderr, never fails the build.
-   Extend the validator tests with a coupled fixture and an allowlisted
-   fixture.
+   Spec: [`specs/reviewer-routing.md`](specs/reviewer-routing.md).
+   Branch `reviewer-routing`.
+
+2. **Cross-project skill reconciler: skill-reconciler.**
+   `specs/skill-reconciler.md` is `proposed` with no implementation. Add
+   a meta/infra-family skill `skill-reconciler` that compares two
+   near-duplicate skills (two `source`-tagged copies, e.g. an ASF and a
+   non-ASF variant) and emits a structured diff plus a reconciliation
+   proposal, labelling every difference `ALLOWED`, `DRIFT`, or
+   `SAFETY-BASELINE`. Read-only: it proposes, it never rewrites either
+   skill (convergence is a separate confirmed `write-skill` /
+   `optimize-skill` edit). A safety-baseline divergence is always a
+   must-fix and never folded into allowed-divergence noise. First cut may
+   take two explicit paths rather than auto-pairing by `source` tag.
+   Include an eval suite with a case where the two copies diverge only on
+   the safety baseline and the reconciler must flag it.
    Validation:
    ```bash
-   uv run --project tools/skill-and-tool-validator --group dev pytest
    uv run --project tools/skill-and-tool-validator --group dev 
skill-and-tool-validate
+   uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/skill-reconciler/
    ```
-   Spec: [`specs/project-agnosticism.md`](specs/project-agnosticism.md).
-   Branch `asf-coupling-lint`.
-
-6. **Sync drafting-mode spec Known Gaps to reflect shipped skills.**
-   `specs/drafting-mode.md` Known Gaps still says "Generic
-   (non-security, non-issue) Drafting from audit-tool findings is
-   `proposed`", but `audit-finding-fix` shipped with a full eval suite.
-   Update the Known Gaps section to reflect the current state and
-   remove the stale `proposed` claim so new plan passes do not
-   re-raise this as a gap.
-   Validation:
-   ```bash
-   uv run --project tools/spec-validator --group dev spec-validate 
tools/spec-loop/specs/
-   uv run --project tools/spec-validator --group dev pytest
-   ```
-   Spec: [`specs/drafting-mode.md`](specs/drafting-mode.md).
-   Branch `drafting-spec-sync`.
+   Spec: [`specs/skill-reconciler.md`](specs/skill-reconciler.md).
+   Branch `skill-reconciler`.
 
-7. **Non-ASF adopter profile fixture + smoke eval.**
+3. **Non-ASF adopter profile fixture + smoke eval.**
    `specs/project-agnosticism.md` acceptance #3 requires that a non-ASF
    profile can be declared without editing any skill body, but there is
    no fixture to prove it. Add a worked non-ASF profile under
@@ -226,35 +176,31 @@ slugs, not numbers (numbering implies an order the specs 
don't carry).
   it would skip the proof MISSION requires.
 - When a build iteration creates a new skill, its eval suite is part of
   that same work item — not a separate one.
-- **Release-management family:** only the two most standalone skills
-  (`release-vote-draft`, `release-announce-draft`) are planned here.
-  The remaining eight (`release-prepare`, `release-keys-sync`,
-  `release-rc-cut`, `release-verify-rc`, `release-vote-tally`,
+- **Release-management family:** the first four skills (`release-vote-draft`,
+  `release-announce-draft`, `release-vote-tally`, `release-verify-rc`)
+  have shipped and are recorded in **What's been built**. The remaining
+  six (`release-prepare`, `release-keys-sync`, `release-rc-cut`,
   `release-promote`, `release-archive-sweep`, `release-audit-report`)
-  should be planned in subsequent passes once the first two establish
-  the skill-authoring patterns for this family.
+  should be planned in subsequent passes now that the first four have
+  established the skill-authoring patterns for this family.
 - **Triage contributor-growth gaps** (PMC-member nomination,
   emeritus-committer handling, contributor offboarding) noted in
   `triage-mode.md` Known Gaps are intentionally deferred: they are
   vague enough that a spec-RFC conversation is more appropriate than
   a direct build item.
-- **Project-agnosticism:** two of the three gaps in
-  `project-agnosticism.md` are buildable and planned now: the ASF-coupling
-  advisory lint (work item 5) and the non-ASF adopter profile fixture
-  (work item 7). The remaining gap, the capability-flag vocabulary for
-  contributor intake (ICLA vs DCO), security intake, and CVE allocation,
-  is deferred only until someone enumerates the option sets and defaults,
-  following the backend-flag precedent already set by
+- **Project-agnosticism:** the ASF-coupling advisory lint has shipped
+  (recorded in **What's been built**); the non-ASF adopter profile
+  fixture is work item 3. The remaining gap, the capability-flag
+  vocabulary for contributor intake (ICLA vs DCO), security intake, and
+  CVE allocation, is deferred only until someone enumerates the option
+  sets and defaults, following the backend-flag precedent already set by
   `release-management-lifecycle.md` (distribution / approval / announcement
   backends). That is a spec-authoring task, not yet a build item.
 - **General-issue dedupe and backlog dashboard** (`triage-mode.md` Known
-  Gaps) are deferred behind `issue-stale-sweep` (work item 3): dedupe
-  overlaps the existing `security-issue-deduplicate` matching approach and
-  a backlog dashboard overlaps `pr-management-stats`, so both should reuse
-  those patterns once stale-sweep establishes the general-issue skill
-  shape. Not dropped, sequenced after item 3.
-- **Repo-health family** (`triage-mode.md` Known Gaps: the standalone
-  `ci-runner-audit` plus candidate siblings, GitHub Actions security
-  audit, dependency-update triage, license/NOTICE compliance, flaky-test
-  detection) is deferred pending a family spec; it is a multi-skill area
-  that wants its own spec before any build item.
+  Gaps) have shipped (`issue-deduplicate`, `issue-backlog-stats`) alongside
+  `issue-stale-sweep`; see **What's been built**. No longer planned items.
+- **Repo-health family** has shipped its first three members
+  (`ci-runner-audit`, `workflow-security-audit`, `dependency-audit`) under
+  its own [`specs/repo-health-family.md`](specs/repo-health-family.md);
+  remaining candidates (license / NOTICE compliance, flaky-test detection)
+  are deferred to a subsequent pass.
diff --git a/tools/spec-loop/specs/README.md b/tools/spec-loop/specs/README.md
index b6b852e..18884fc 100644
--- a/tools/spec-loop/specs/README.md
+++ b/tools/spec-loop/specs/README.md
@@ -39,7 +39,9 @@ Start with [`overview.md`](overview.md), then:
   [`adapters.md`](adapters.md),
   [`project-agnosticism.md`](project-agnosticism.md),
   [`meta-and-quality-tooling.md`](meta-and-quality-tooling.md),
-  [`security-reporting.md`](security-reporting.md).
+  [`security-reporting.md`](security-reporting.md),
+  [`reviewer-routing.md`](reviewer-routing.md),
+  [`skill-reconciler.md`](skill-reconciler.md).
 
 (Auto-merge, the fifth MISSION mode, is deliberately off and has no
 spec — see the note in [`overview.md`](overview.md).)
diff --git a/tools/spec-loop/specs/overview.md 
b/tools/spec-loop/specs/overview.md
index 25e5fa4..3ba8ab0 100644
--- a/tools/spec-loop/specs/overview.md
+++ b/tools/spec-loop/specs/overview.md
@@ -54,6 +54,8 @@ Each mode is an independently toggleable set of skills. 
Maturity mirrors
 | Adapters (Gmail / PonyMail / Jira / GitHub / mail-source / forwarder-relay / 
mail-archive / github-body-field / github-rollup) | [adapters.md](adapters.md) |
 | Project-agnosticism (de-ASF coupling) | 
[project-agnosticism.md](project-agnosticism.md) |
 | Meta & quality tooling | 
[meta-and-quality-tooling.md](meta-and-quality-tooling.md) |
+| Reviewer routing (proposed, Triage) | 
[reviewer-routing.md](reviewer-routing.md) |
+| Cross-project skill reconciler (proposed, infra) | 
[skill-reconciler.md](skill-reconciler.md) |
 
 ## The non-negotiables every area inherits
 
diff --git a/tools/spec-loop/specs/reviewer-routing.md 
b/tools/spec-loop/specs/reviewer-routing.md
new file mode 100644
index 0000000..9f3989a
--- /dev/null
+++ b/tools/spec-loop/specs/reviewer-routing.md
@@ -0,0 +1,123 @@
+<!-- SPDX-License-Identifier: Apache-2.0
+     https://www.apache.org/licenses/LICENSE-2.0 -->
+
+---
+title: Reviewer routing
+status: proposed
+kind: feature
+mode: Triage
+source: >
+  MISSION.md § Rationale ("review-cycle latency" is one of the two named
+  priorities) and § Technical scope (Triage: "proposes initial routing",
+  "proposes routing"). The substrate config already declares "who
+  reviews" (overview.md § Substrate; projects/_template adopter config),
+  but no skill turns that roster plus repository signal into an assignee
+  suggestion. triage-mode.md § What it does ("propose routing to the
+  right human") names the behaviour; no skill implements it yet.
+acceptance:
+  - The skill is read-only on tracker state and proposes-then-confirms;
+    it never assigns, requests review, or labels without confirmation.
+  - The suggested reviewer is drawn from the project's configured roster
+    only; the skill never invents a handle or routes to a non-member.
+  - Every suggestion carries its reasoning (touched paths, prior-art
+    PRs, current open-review load) so the maintainer can audit the call.
+---
+
+# Reviewer routing
+
+## What it does
+
+Proposes which maintainer an inbound issue or PR should go to. The two
+complaints MISSION names loudest are onboarding latency and review-cycle
+latency; routing attacks the second directly by removing the "who should
+look at this?" pause that stalls a fresh PR before any review begins.
+
+Given an open issue or PR, the skill scores the configured reviewer
+roster and proposes a primary reviewer (and optionally a backup),
+grounding each suggestion in three signals: roster eligibility for the
+touched area, git-history familiarity with the changed paths, and the
+reviewer's current open-review load so routing spreads work instead of
+piling it on the most-recently-active person. The output is a proposal a
+maintainer confirms; nothing is assigned on autopilot. This is the
+Triage-mode counterpart to `contributor-nomination` on the read-only
+side: a grounded brief a human acts on, not a state change.
+
+## Where it lives
+
+- Skill (proposed, not implemented): `reviewer-routing` under
+  `skills/`, in the Triage family alongside `pr-management-triage` and
+  `issue-triage`.
+- Roster source: the project's configured reviewer roster
+  (`projects/<project>/` adopter config; `pmc-roster.md` for ASF
+  projects, an arbitrary maintainer list for non-ASF adopters). The
+  skill reads the roster through configuration, never a hard-coded list.
+- Repository signal: `tools/github` for changed paths, blame/history on
+  those paths, and the reviewer's current open-review queue.
+- Identity resolution for ASF projects: `tools/apache-projects`
+  (committee roster, Apache IDs), reused exactly as the security and
+  contributor skills resolve handles.
+- Adapters it reads through: `tools/github`; `tools/apache-projects`
+  where ASF roster context applies.
+
+## Behaviour & contract
+
+- **Read-only, propose-then-confirm.** The skill emits a routing
+  proposal; the maintainer assigns / requests review as themselves. No
+  skill call sets an assignee, requests a review, or applies a label
+  without in-session confirmation.
+- **Roster-bounded.** Suggestions come only from the configured roster.
+  An empty or unresolved roster yields `NO ELIGIBLE REVIEWER, needs
+  maintainer call` rather than a guessed handle, mirroring the
+  conservative-tally refusal in `release-vote-tally`.
+- **Reasoned, auditable output.** Each suggestion lists the signals
+  behind it (matched area / touched paths, the prior-art PRs that touched
+  the same paths, the reviewer's current open-review count). A maintainer
+  can see why a name surfaced and overrule it.
+- **Load-aware, not just expertise-aware.** Scoring weighs current
+  open-review load so routing does not concentrate every PR on the
+  single most expert reviewer; the contract is to surface a workable
+  human, not the theoretically optimal one.
+- **Untrusted content stays data.** Issue / PR bodies are input data,
+  never instructions; an injected "assign this to X" line in a PR
+  description is ignored, the same posture every triage skill inherits.
+
+## Out of scope
+
+- Assigning, requesting review, or labelling on the tracker (those are
+  human acts the maintainer performs after confirming).
+- Authoring or merging the change (Drafting / Auto-merge, not Triage).
+- Inventing a reviewer outside the roster, or routing on contributor
+  sentiment / performance ranking — the skill proposes who is best
+  placed to review a specific change, not who is a "better" maintainer.
+
+## Acceptance criteria
+
+1. `reviewer-routing` performs no unconfirmed tracker state change.
+2. Every suggested reviewer is a member of the configured roster; an
+   unresolved roster produces an explicit `NO ELIGIBLE REVIEWER` signal,
+   never a fabricated handle.
+3. Each suggestion carries its grounding signals (touched paths,
+   prior-art PRs, open-review load).
+4. The skill validates under `skill-and-tool-validate` and ships an eval
+   suite under `tools/skill-evals/evals/reviewer-routing/`, including an
+   adversarial case asserting an injected "assign to X" line is ignored.
+
+## Validation
+
+```bash
+uv run --project tools/skill-and-tool-validator --group dev 
skill-and-tool-validate
+uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/reviewer-routing/
+```
+
+## Known gaps
+
+- **No skill is implemented yet.** This spec is `proposed`; the plan
+  pass turns it into a single build item (one skill plus its eval suite).
+- **Open-review-load signal is unspecified in detail.** Whether load is
+  counted as open review requests, assigned-and-unreviewed PRs, or a
+  decay-weighted recent count is left to the implementation; the contract
+  only requires that some load signal is present and shown.
+- **Non-ASF roster shape is unproven.** The roster-bounded contract
+  assumes an adopter declares a maintainer list; no non-ASF profile
+  fixture exercises routing yet (overlaps the non-ASF adopter profile
+  work item in IMPLEMENTATION_PLAN).
diff --git a/tools/spec-loop/specs/skill-reconciler.md 
b/tools/spec-loop/specs/skill-reconciler.md
new file mode 100644
index 0000000..2cc1770
--- /dev/null
+++ b/tools/spec-loop/specs/skill-reconciler.md
@@ -0,0 +1,136 @@
+<!-- SPDX-License-Identifier: Apache-2.0
+     https://www.apache.org/licenses/LICENSE-2.0 -->
+
+---
+title: Cross-project skill reconciler
+status: proposed
+kind: feature
+mode: infra
+source: >
+  MISSION.md § Scope boundaries ("Duplication is embraced where it buys
+  decoupling ... an agent can reconcile two near-identical skills on
+  demand, so the price of keeping them separate is low") and § "The data
+  layer is shared; the skills on top are free to diverge" (skills carry a
+  `source` tag so the split is a registry query). PRINCIPLES.md on the
+  safety baseline that must stay eventually-consistent across every copy.
+  meta-and-quality-tooling.md (the skill-authoring/quality family this
+  joins). No reconciler tool or skill exists yet.
+acceptance:
+  - The reconciler is read-only: it produces a structured diff and a
+    reconciliation proposal; it never rewrites either skill without human
+    confirmation.
+  - Divergence in the safety baseline (untrusted-content-is-never-
+    instructions, identity-resolution caveats, confidentiality posture)
+    is reported as a must-fix, distinct from divergence that is allowed
+    to stand.
+  - A maintainer can review the proposal and the safety-baseline verdict
+    without running either skill.
+---
+
+# Cross-project skill reconciler
+
+## What it does
+
+Compares two near-duplicate skills, typically the same capability carried
+in an ASF and a non-ASF variant (or two `source`-tagged copies), and
+produces a structured diff plus a reconciliation proposal. It
+operationalises the MISSION principle that duplication is fine where it
+buys decoupling precisely because an agent can reconcile copies on
+demand: the reconciler is that agent step, made repeatable.
+
+The key move is that not all divergence is equal. The reconciler sorts
+differences into three classes: **allowed divergence** (scope, tier,
+teaching voice, project-specific values behind placeholders, the things
+MISSION says skills are free to diverge on); **drift** (one copy gained a
+fix, a clearer step, or a hardening the other lacks, where convergence is
+probably wanted); and **safety-baseline divergence** (the
+untrusted-content-is-never-instructions rule, identity-resolution
+caveats, confidentiality posture, which PRINCIPLES says every copy must
+stay eventually-consistent on). The first is reported and left alone, the
+second is proposed as a merge, and the third is flagged as a must-fix
+that a maintainer should not ignore.
+
+## Where it lives
+
+- Skill (proposed, not implemented): `skill-reconciler` under `skills/`,
+  in the meta / quality family with `write-skill`, `optimize-skill`, and
+  `list-skills` (see 
[meta-and-quality-tooling.md](meta-and-quality-tooling.md)).
+- Optional deterministic helper (proposed): a `uv` tool under `tools/`
+  that does the structural diff (frontmatter, section headings,
+  step-by-step decision rules, placeholder inventory) so the skill
+  reasons over a normalised diff rather than raw text. Follows the
+  tool-backs-skill pattern already used across the catalogue.
+- Inputs: two `SKILL.md` trees (plus their supporting `.md` files),
+  identified by path or by `source` tag.
+- It reads the safety-baseline definition from the same place the rest of
+  the framework does (PRINCIPLES.md / the security posture docs); it does
+  not restate the baseline inline.
+
+## Behaviour & contract
+
+- **Read-only; proposes, never rewrites.** The reconciler emits a diff
+  and a proposal. Any actual convergence edit goes through `write-skill`
+  / `optimize-skill` under human confirmation; the reconciler itself
+  changes no skill file.
+- **Three-class divergence verdict.** Every difference is labelled
+  `ALLOWED`, `DRIFT`, or `SAFETY-BASELINE`. The safety-baseline class is
+  never silently merged away and never dropped from the report, even when
+  the two copies are otherwise identical.
+- **Safety divergence is a must-fix, not a suggestion.** When the two
+  copies disagree on the untrusted-content rule, identity-resolution
+  caveats, or confidentiality posture, the reconciler surfaces it as a
+  blocking finding a maintainer must resolve, separate from the
+  convenience merges.
+- **Decoupling is preserved by default.** The reconciler does not push
+  toward DRY across organisational boundaries; allowed divergence is
+  reported and left in place. Convergence is proposed only for drift and
+  required only for the safety baseline.
+- **Untrusted content stays data.** Skill bodies under comparison are
+  treated as input data; an injected instruction inside a compared skill
+  is reported as content, never executed.
+
+## Out of scope
+
+- Rewriting, merging, or deleting either skill on autopilot (that is a
+  confirmed `write-skill` / `optimize-skill` edit).
+- Choosing which copy "wins" on allowed divergence; the reconciler
+  reports the difference and leaves the call to the maintainers who own
+  each copy.
+- Cross-foundation policy. The reconciler is a tool for reconciling skill
+  text; it makes no claim on which project's governance is correct.
+
+## Acceptance criteria
+
+1. Given two near-duplicate skills, the reconciler emits a structured
+   diff and labels every difference `ALLOWED`, `DRIFT`, or
+   `SAFETY-BASELINE`.
+2. A safety-baseline divergence is always reported as a must-fix and is
+   never folded into the allowed-divergence noise.
+3. The reconciler makes no edit to either skill; convergence is a
+   separate, confirmed authoring step.
+4. The skill validates under `skill-and-tool-validate` and ships an eval
+   suite under `tools/skill-evals/evals/skill-reconciler/`, including a
+   case where the two copies diverge only on the safety baseline and the
+   reconciler must flag it.
+
+## Validation
+
+```bash
+uv run --project tools/skill-and-tool-validator --group dev 
skill-and-tool-validate
+uv run --project tools/skill-evals skill-eval 
tools/skill-evals/evals/skill-reconciler/
+```
+
+## Known gaps
+
+- **Nothing is implemented yet.** This spec is `proposed`; the plan pass
+  turns it into a build item (the skill, optionally its structural-diff
+  tool, and an eval suite).
+- **The safety-baseline definition is not yet machine-readable.** The
+  reconciler needs an authoritative list of baseline clauses to check
+  against; today that posture lives in prose across PRINCIPLES.md and the
+  security docs. A buildable precursor is to extract that baseline into a
+  single referenced checklist the reconciler (and humans) can cite.
+- **`source`-tag-driven pairing is unproven.** MISSION frames copy
+  selection as a registry query over `source` tags, but no registry index
+  pairs copies automatically yet; the first implementation can take two
+  explicit paths and defer auto-pairing.

(airflow-steward) branch main updated: docs(spec): add reviewer-routing and skill-reconciler specs; prune completed plan items (#586)

Reply via email to