(groovy) branch master updated: AI readiness: consolidate human and AI docs

paulk Tue, 12 May 2026 12:47:56 -0700

This is an automated email from the ASF dual-hosted git repository.

paulk-asert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/groovy.git



The following commit(s) were added to refs/heads/master by this push:
     new ca916452b7 AI readiness: consolidate human and AI docs
ca916452b7 is described below

commit ca916452b7c32f7090c4add0df90bd838ca70e82
Author: Paul King <[email protected]>
AuthorDate: Wed May 13 05:46:55 2026 +1000

    AI readiness: consolidate human and AI docs
---
 .agents/skills/groovy-reassess/SKILL.md   | 154 ++++++++++++++++++---
 .agents/skills/groovy-reproducer/SKILL.md | 219 ++++++++++++++++++++++++++----
 ARCHITECTURE.md                           |  52 +++++++
 CONTRIBUTING.md                           | 142 +++++++++++++++++--
 4 files changed, 510 insertions(+), 57 deletions(-)

diff --git a/.agents/skills/groovy-reassess/SKILL.md b/.agents/skills/groovy-reassess/SKILL.md
index cfbd0b8b58..dd86ca22d8 100644
--- a/.agents/skills/groovy-reassess/SKILL.md
+++ b/.agents/skills/groovy-reassess/SKILL.md
@@ -18,7 +18,7 @@
 -->
 ---
 name: groovy-reassess
-description: Running a bulk reassessment campaign over old GROOVY JIRA issues — narrow JQL selection, per-issue reproducer extraction and execution via `groovy-reproducer`, classification (`fixed-on-master` / `still-fails-same` / `still-fails-different` / `cannot-run-*` / `intended-behaviour` / `duplicate-of-resolved` / `timeout`), structured report and per-issue evidence package, and a strict hand-back contract — no JIRA comments, no transitions, no closures posted on behalf of the proj [...]
+description: Running a bulk reassessment campaign over old GROOVY JIRA issues — narrow JQL selection (Reopened pool surfaces wishlists; Open+EOL surfaces silent fixes and real bugs), per-issue reproducer extraction and execution via `groovy-reproducer`, classification (`fixed-on-master` / `still-fails-same` / `still-fails-different` / `cannot-run-*` / `intended-behaviour` / `duplicate-of-resolved` / `timeout`), orthogonal `nature` analysis (`bug-as-advertised` vs `feature-request-disguis [...]
 license: Apache-2.0
 compatibility: claude, codex, copilot, cursor, gemini, aider
 metadata:
@@ -164,10 +164,24 @@ These are the recurring mistakes at the campaign level:
 Selection drives the campaign's quality. Pick a *narrow*, *bounded*
 slice; do not boil the ocean.
 
+**Pool choice matters.** The pool-selection heuristics — which
+status / affected-version slices over-represent which natures —
+live in [`CONTRIBUTING.md`'s "Pool-selection heuristics for
+re-triage sweeps"](../../../CONTRIBUTING.md#pool-selection-heuristics-for-re-triage-sweeps).
+Choose the pool deliberately based on what the campaign is hunting
+for.
+
 JQL building blocks come from
 [`groovy-jira`](../groovy-jira/SKILL.md). Useful slices for
 reassessment:
 
+- **Reopened (wishlist hunting):** `project = GROOVY AND status =
+  Reopened ORDER BY updated ASC`. Small, well-bounded pool.
+- **Open + EOL bugs (silent-fix hunting):** `project = GROOVY AND
+  status = Open AND issuetype = Bug AND affectedVersion in
+  ("1.5.6", "1.6.0", "1.6.5", "1.7.0", "1.8.0") ORDER BY created
+  ASC`. The version list should target releases before a known
+  major refactor of the relevant subsystem.
 - **Age bucket × open:** `project = GROOVY AND statusCategory != Done
   AND created < "2020/01/01" AND created >= "2018/01/01"
   ORDER BY created ASC`. Buckets of two years are scannable; ten
@@ -176,9 +190,6 @@ reassessment:
   != Done AND component = "<X>" AND updated < -730d`. Pairs well
   with an area you know — your fix-side strength shapes the
   candidate set.
-- **Affected version end-of-life:** `project = GROOVY AND
-  statusCategory != Done AND affectedVersion = "2.4.x"` — versions
-  long out of support are high-yield for `fixed-on-master`.
 - **No component (triage-then-reassess):** `project = GROOVY AND
   statusCategory != Done AND component is EMPTY`. The reassessment
   can also produce a `Component/s` suggestion (see
@@ -186,9 +197,9 @@ reassessment:
 
 Cap the per-session set. A practical first pilot is 5–10 issues
 spanning *different reproducer shapes* (one runnable script, one
-attachment, one prose-only, one `@Grab`, one comment-with-snippet)
-so the pipeline meets each shape early. Pilots beat the first
-hundred issues you'd naturally pick.
+attachment, one prose-only-but-precise, one `@Grab`,
+one comment-with-snippet) so the pipeline meets each shape early.
+Pilots beat the first hundred issues you'd naturally pick.
 
 ## Procedure
 
@@ -205,21 +216,90 @@ For each campaign session:
    `original.<ext>`, `run.log`, `verdict.json`. This is *not* in
    the Groovy checkout.
 3. **For each issue, in order:**
-   - Skip if its `verdict.json` already exists and is well-formed
-     (resumability).
-   - Read the JIRA issue and skim comments for an obvious
-     "already fixed" / "won't fix" / "see GROOVY-XXXX" — early
-     classifications save time.
-   - Hand off to [`groovy-reproducer`](../groovy-reproducer/SKILL.md)
-     for extraction, adaptation, running, and evidence capture.
-   - Read the classification from `verdict.json`.
-   - Reset the working tree before the next issue.
+
+   **a. Resumability check.** Skip if its `verdict.json` already
+   exists and is well-formed. The per-issue evidence files on
+   disk are the resumption point — in-memory campaign state is
+   not.
+
+   **b. Triage the issue per
+   [`CONTRIBUTING.md`'s "Triaging a JIRA issue"](../../../CONTRIBUTING.md#triaging-a-jira-issue).**
+   That procedure is the canonical methodology — read the thread
+   (with historical baselines), search for duplicates and
+   same-family related work, attempt reproduction (including
+   per-reproducer execution when the thread contains multiple
+   distinct reproducers), locate the code, check JIRA fields,
+   search for documented workarounds, and assess split
+   candidacy. The bulk of the per-issue work is in that section.
+
+   **AI-specific recording** during this step — these are the
+   campaign's value-adds on top of the manual methodology:
+   - Historical baselines from comments → `verdict.json.cases[].history`
+   - Related-JIRA citations from the JQL scan → `verdict.json.notes`
+   - Per-reproducer outcomes (when multiple reproducers exist) →
+     `verdict.json.cases` (multi-case array)
+   - Workaround-search outcome and the close-path it implies →
+     `verdict.json.notes`
+   - Split-candidacy recommendation → `verdict.json.notes`
+
+   **c. Hand off to
+   [`groovy-reproducer`](../groovy-reproducer/SKILL.md)** for the
+   reproducer extraction, adaptation, running, and evidence
+   capture. This includes the optional cross-family probe when
+   the operation under test spans multiple types or operator
+   variants — see
+   [ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families)
+   for the family taxonomies.
+
+   **d. Apply nature analysis.** Per
+   [`CONTRIBUTING.md`'s "Nature analysis"](../../../CONTRIBUTING.md#nature-analysis-bug-as-advertised-vs-wouldnt-it-be-nice),
+   determine whether the issue is `bug-as-advertised`,
+   `feature-request-disguised-as-bug`, `intended-and-documented`,
+   etc. Record in `verdict.json.nature`. This is orthogonal to
+   `classification` (a still-reproducing wishlist and a
+   still-reproducing bug recommend different actions).
+
+   **e. Compose the final `verdict.json`.** Classification +
+   nature + cases (if multi-case) + cross-type-probe /
+   operator-variants-probe (if run) + notes with the close-path
+   recommendation. See
+   [`groovy-reproducer`](../groovy-reproducer/SKILL.md)'s
+   Evidence package section for the schema.
+
+   **f. Reset the working tree** before the next issue.
 4. **After the loop, build the report** (see below). Do *not*
    post anything.
 5. **Hand back** to the human — branch (if any local Groovy
    commits, e.g. adapted `@Test` files kept for follow-up),
    scratch corpus path, report path.
 
+## Nature analysis (AI-recording layer)
+
+The nature taxonomy and the "bug-as-advertised vs wouldn't-it-be-
+nice" question live in
+[`CONTRIBUTING.md`'s "Nature analysis" section](../../../CONTRIBUTING.md#nature-analysis-bug-as-advertised-vs-wouldnt-it-be-nice).
+That section is the canonical source; apply it per-issue during
+procedure step 3d above.
+
+This skill's contribution is to **record** the result in
+`verdict.json.nature` so the campaign-level aggregator can roll
+up nature classifications across the issue set (e.g. "of N
+issues reassessed, M classified as
+`feature-request-disguised-as-bug` — those rows recommend
+re-typing rather than fixing"). The allowed values:
+
+- `bug-as-advertised`
+- `bug-as-advertised-partial-fix` (paired with a split
+  recommendation in `notes`)
+- `feature-request` (correctly typed as Improvement; no re-type
+  needed)
+- `feature-request-disguised-as-bug` (Bug → Improvement re-type
+  recommendation)
+- `intended-and-documented`
+
+See the verdict-template `_field_help` for the schema. Apply
+exactly one value per verdict — `nature` is required.
+
 ## Classification taxonomy
 
 The campaign uses the per-issue classifications produced by
@@ -282,6 +362,19 @@ The campaign produces:
   for a real regression test).
 - A short verbal summary: "Swept N issues. M `fixed-on-master`, K
   `still-fails-same`. Headlines: GROOVY-A, GROOVY-B, …."
+- **Split recommendations** for any issue where the reassessment
+  surfaced multiple cases with distinct fates (some silently fixed,
+  some still failing) or distinct user-visible symptoms.
+  Format: "close GROOVY-X with per-case summary; open new JIRA
+  for GROOVY-X-residual-A (the remaining case)."
+- **New-JIRA candidates** surfaced by cross-family probes — when
+  probing a sibling type or operator variant uncovered a bug the
+  original report didn't contain, flag it as a candidate for its
+  own JIRA. The verdict notes should sketch the 4-line reproducer.
+- **Documentation candidates** — issues where the close path is
+  "Not A Bug + add docs" (workaround exists but undocumented; spec
+  gap; behaviour-by-design but surprising). The report should
+  list each docs deliverable with a one-line scope.
 
 The campaign does **not**:
 
@@ -311,25 +404,48 @@ the project communicates.
 
 Before declaring a campaign session complete:
 
+- [ ] **Pool was chosen deliberately** (Reopened vs Open+EOL) and
+      the choice matches the campaign's goal (spec-debate review vs
+      silent-fix hunting).
 - [ ] Candidate set was bounded *before* the loop started; the
       bound is recorded in the report.
 - [ ] Per-issue evidence package on disk for every candidate, with
       `verdict.json` and a non-empty `description.md`.
-- [ ] Every classification used is one of the taxonomy entries; no
-      free-form labels.
+- [ ] Every `classification` used is one of the taxonomy entries;
+      every `nature` used is one of the nature values; no free-form
+      labels.
+- [ ] **`nature` populated** on every verdict — orthogonal to
+      classification.
+- [ ] **Historical baselines from the comment thread** are recorded
+      in `cases[].history` where applicable, so the headline
+      finding can be "unchanged since baseline X" rather than just
+      "current state is Y."
+- [ ] **Workaround search done** for every still-fails-* verdict
+      (`src/spec/doc/` + Groovydoc + JIRA comments). The
+      recommendation distinguishes "close as Not A Bug" from
+      "close + add docs."
+- [ ] **Related-JIRA scan** done for every verdict; same-family
+      issues cited in notes.
 - [ ] `fixed-on-master` rows include the rev and JDK in the
       evidence; the verdict isn't over-claimed.
 - [ ] `still-fails-same` rows are surfaced at the top of the
       report, not buried.
 - [ ] `cannot-run-*` rows have a concrete reason in the evidence
       (which dep failed, what was missing) — not "could not run."
+- [ ] **Split candidates flagged** for any issue with multi-case
+      mixed fates or distinct user-visible symptoms.
+- [ ] **Cross-family probe results recorded** in
+      `cross_type_probe` or `operator_variants_probe` when the
+      probe was run; any sibling-type bugs are flagged as new-JIRA
+      candidates.
 - [ ] No JIRA mutation occurred. No PR was opened. No dev@ post
       was sent.
 - [ ] The report opens with a summary stanza a committer can scan
       in 30 seconds.
 - [ ] Working tree was clean at the end of the session.
 - [ ] Hand-back artefact lists the scratch corpus path, the report
-      path, any local commits worth keeping, and the recommended
+      path, any local commits worth keeping, any split or
+      documentation candidates surfaced, and the recommended
       publication path (typically a single dev@ thread).
 
 ## References
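
The checklist items on `classification` and `nature` values lend themselves to a mechanical check. What follows is a minimal sketch in Groovy, not something this commit ships: the closure name `verdictProblems` and the inline sample verdict are hypothetical, and the allowed values are copied from the `verdict.json` schema and nature list in this diff.

```groovy
import groovy.json.JsonSlurper

// Allowed values, copied from the verdict.json schema in this commit.
def classifications = ['fixed-on-master', 'still-fails-same', 'still-fails-different',
                       'cannot-run-extraction', 'cannot-run-environment',
                       'cannot-run-dependency', 'timeout', 'intended-behaviour',
                       'duplicate-of-resolved', 'needs-separate-workspace'] as Set
def natures = ['bug-as-advertised', 'bug-as-advertised-partial-fix', 'feature-request',
               'feature-request-disguised-as-bug', 'intended-and-documented'] as Set

// Hypothetical helper: returns a list of problems, empty when the verdict is acceptable.
def verdictProblems = { Map verdict ->
    def problems = []
    if (!(verdict.classification in classifications)) {
        problems << "unknown classification: ${verdict.classification}".toString()
    }
    if (!(verdict.nature in natures)) {
        problems << "missing or unknown nature: ${verdict.nature}".toString()
    }
    problems
}

// Inline sample standing in for a verdict.json file on disk (illustrative only).
def sample = new JsonSlurper().parseText('''
{ "key": "GROOVY-NNNNN", "classification": "still-fails-same", "nature": "wishlist" }
''')
def problems = verdictProblems(sample)
problems.each { println it }
```

In a real sweep the sample would be replaced by each per-issue `verdict.json` under the scratch corpus; the free-form `wishlist` label here is rejected because it is not one of the five nature values.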
diff --git a/.agents/skills/groovy-reproducer/SKILL.md b/.agents/skills/groovy-reproducer/SKILL.md
index d45b89de89..388232454d 100644
--- a/.agents/skills/groovy-reproducer/SKILL.md
+++ b/.agents/skills/groovy-reproducer/SKILL.md
@@ -107,11 +107,16 @@ reproducers:
    stop. The reporter's specific code is what makes a reproduction
    trustworthy; an agent-written stand-in is a different exercise
    (and a different verdict).
-2. **Skipping the comment thread.** Reporters frequently post a
-   simplified reproducer in a comment after the initial description.
-   Reading only the description misses it. Inventory every code
-   block in the description *and* every comment *and* every
-   attachment before picking a candidate.
+2. **Skipping the comment thread, or only running the headline
+   reproducer.** Reporters frequently post a simplified reproducer
+   in a comment after the initial description, and may follow up
+   with additional cases that exercise different symptoms of the
+   same root cause. Inventory every code block in the description
+   *and* every comment *and* every attachment, and when distinct
+   reproducers exist, **run each and record per-reproducer
+   outcomes** — not just the headline one. The `cases` array in
+   `verdict.json` (see Evidence package below) carries
+   per-reproducer state for multi-case issues.
 3. **Treating `@Test` adaptation as equivalent to a script run.**
    Groovy scripts and class methods have different scoping (script
    bindings vs. fields, implicit `main`, `def` vs. typed locals).
@@ -172,6 +177,26 @@ reproducers:
     `~/.groovy/grapes/` (see GROOVY-12005). For a campaign,
     consider a per-sweep Grape root via `-Dgrape.root=<scratch>` so
     the user's everyday cache stays clean.
+14. **Treacherous substring matching in verification logic.** Same
+    trap covered in
+    [`CONTRIBUTING.md`'s "Test-writing pitfalls"](../../../CONTRIBUTING.md#test-writing-pitfalls)
+    — applies equally to reproduction verification scripts.
+    Substring matching near common prefixes (`xs` / `xsi`,
+    `groovy` / `groovy-`) silently produces false positives.
+    Prefer anchored regex or parsed-tree inspection. The
+    "verify identifiers" discipline applies to the verification
+    logic itself, not just the code under test — almost-shipped
+    false `fixed-on-master` results have hit this trap.
+15. **Reproducer-stale-due-to-API-evolution treated as a bug.**
+    Old reproducers may use classes that have moved or been
+    removed. A `ClassNotFoundException` on import isn't the
+    reporter's bug — it's mechanical adaptation territory. The
+    canonical mapping of class moves lives in the release notes
+    (Groovy 3.0 split-packages section is the largest); see
+    [ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families)
+    for the project-side context. Add the new import per that
+    mapping; don't classify as `still-fails-different` or
+    `cannot-run-environment`.
 
 ## Reproducer shape taxonomy
 
@@ -203,8 +228,23 @@ Recipe: classify `cannot-run-extraction`. The stack trace is a *hint
 about the area*, not a reproducer. Don't construct code to "make
 that stack trace appear"; that's fabrication.
 
-**E. Prose-only** — natural-language description, no code. Recipe:
-`cannot-run-extraction`. Don't write code from prose.
+**E-vague. Prose-only, no precise testable claim** —
+natural-language description without a specifiable behaviour
+("ConfigObject sometimes behaves weirdly"). Recipe:
+`cannot-run-extraction`. Don't write code from vague prose.
+
+**E-precise. Prose-only, but the prose IS a specifiable claim** —
+the description contains an algebraic / specifiable claim with no
+verbatim code but enough precision to construct a faithful test
+(e.g. *"`x?.y?.z` returns null on Maps but throws on POGOs"*).
+Recipe: construct a reproducer that tests **exactly that claim**
+— instantiate the explicit assertion, do not interpolate beyond
+it. Classify normally per the outcome. The distinction from
+fabrication: E-precise is *instantiation of an explicit claim*
+(the prose IS the spec); fabrication is *guessing at inputs,
+structure, or APIs the reporter didn't specify*. If the
+construction would require either, classify `cannot-run-extraction`
+and stop.
 
 **F. Attachment** — `.groovy`, `.java`, `.zip` (project), `.txt`
 (log), `.gz` (heap dump, etc.). For `.groovy` / `.java` files,
@@ -246,11 +286,36 @@ For each reproducer:
      (guessing a type, inventing a missing variable), stop and
      classify `cannot-run-extraction` with a note about what was
      missing.
-   - Shapes D, E: classify `cannot-run-extraction`; don't adapt.
+   - Shape D: classify `cannot-run-extraction`; don't adapt.
+   - Shape E-vague: classify `cannot-run-extraction`; don't write
+     code from prose without a precise claim.
+   - Shape E-precise: construct a reproducer that tests **only**
+     the explicit claim the prose makes. Cite the prose verbatim
+     in the comment header so the construction is auditable.
    - Shape F: per inner shape; for project zips, classify
      `needs-separate-workspace`.
    - Shape G: copy verbatim; flag for Grape-aware running.
    - Shape H: classify `needs-separate-workspace`.
+
+   **API-evolution adaptation.** Old reproducers may not compile
+   on modern Groovy because classes moved or were removed. This is
+   mechanical adaptation — *not* fabrication — when the move is
+   documented in the release notes. The Groovy 3.0 split-packages
+   refactor is the largest such reshuffle; see
+   [ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families)
+   for the project-side context (and the release-notes link there
+   for the canonical mapping).
+
+   When you make an adaptation under this rule:
+   - The body of the reproducer stays unchanged — only imports /
+     package references shift.
+   - Cite the release-notes section in `verdict.json.notes` so the
+     adaptation is auditable.
+   - If the adaptation requires *behavioural* changes (not just
+     imports) — e.g. a method signature changed — that's a
+     different classification: the reporter's claim might be
+     `still-fails-different` (if the new API behaves differently)
+     or you may need to escalate to `needs-info`.
 5. **Build the current Groovy distribution** if the reproducer is a
    script that needs the produced `groovy` binary. For `@Test`-shape
    reproducers, the Gradle test invocation handles the build.
@@ -262,10 +327,29 @@ For each reproducer:
    substring" is `same-failure`. "Fails with something different" is
    `different-failure`. "Doesn't fail" is `passes`. "Hangs past
    timeout" is `timeout`. "Errors before exercising the path" is
-   `cannot-run-*`.
-8. **Record the evidence package** before doing anything else.
-9. **Reset the working tree** if you adapted as a `@Test` (the
-   added file must not leak to the next issue).
+   `cannot-run-*`. For multi-case reproducers (a list of
+   assertions, a Shape-E-precise probe across backends), record
+   per-case state in `verdict.json.cases` so partial-fix patterns
+   are queryable — see Evidence package below.
+8. **Scan the JIRA's comment thread for historical baselines.** A
+   committer's prior "I just ran this on version X, here's what I
+   got" comment is a baseline worth comparing against, not just
+   the original report's claim. If found, record each baseline in
+   `verdict.json.cases[].history` (year, status, source). The
+   headline finding may be "the state hasn't changed since this
+   committer's baseline" rather than "the state is X today."
+9. **(Optional) Cross-family probe.** When the reproducer
+   exercises a behaviour defined for multiple backing types or via
+   multiple operator variants, run a quick probe across the
+   family — see *Cross-family probes* below. The probe is cheap
+   and consistently surfaces signal beyond the reporter's
+   framing (a project-wide spec gap, an additional bug in a
+   sibling type, or confirmation that the asymmetry spans the
+   whole operator family). Record results in
+   `verdict.json.cross_type_probe` or `.operator_variants_probe`.
+10. **Record the evidence package** before doing anything else.
+11. **Reset the working tree** if you adapted as a `@Test` (the
+    added file must not leak to the next issue).
 
 ## Run posture
 
@@ -275,8 +359,9 @@ For each reproducer:
   failures get the `cannot-run-dependency` classification; they are
   not "fixed."
 - **Filesystem:** scratch directory per issue, under
-  `~/work/groovy-reassessment/<KEY>/` (or wherever the campaign
-  layout puts it — see [`groovy-reassess`](../groovy-reassess/SKILL.md)).
+  `~/work/groovy-reassess/<campaign-id>/<JIRA-KEY>/` (or wherever
+  the campaign layout puts it — see
+  [`groovy-reassess`](../groovy-reassess/SKILL.md)).
   Don't write under the Groovy checkout.
 - **Working tree:** clean between reproducers. The
   added-and-then-removed `@Test` is the most common leak source.
@@ -287,6 +372,52 @@ For each reproducer:
   matters (`passes`, `fixed-on-master`), retry on the
   originally-affected JDK via Gradle toolchains where reasonable.
 
+## Cross-family probes (AI-tooling pattern)
+
+When the reproducer exercises a behaviour defined for **multiple
+backing types** or **multiple operator variants** in the language,
+probe the others. The pattern is cheap (~50 line script per
+family) and consistently surfaces signal beyond the reporter's
+framing.
+
+The **family taxonomies** (type families like `List`/`Object[]`/
+primitive arrays/`String`; operator-variant families like the
+three safe-navigation variants) live in
+[ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families).
+That section is the canonical reference for what to probe across
+and why the family members behave as they do (dispatch paths,
+known asymmetries). Apply it during procedure step 9.
+
+This skill's contribution is the **AI-tooling pattern for
+*running* the probe**: a small Groovy script that exercises each
+family member, emits a comparison table, and gets saved alongside
+`reproducer.<ext>` as `cross-type-probe.groovy` or
+`operator-variants-probe.groovy`.
+
+Probe template structure:
+
+```groovy
+def probes = [
+    'Member A' : { -> /* construct backend A, exercise the expression */ },
+    'Member B' : { -> /* same expression on backend B */ },
+    // ...
+]
+probes.each { name, body ->
+    def outcome
+    try { outcome = body() } catch (Throwable t) { outcome = "THREW: ${t.class.simpleName}" }
+    println String.format("%-20s | %s", name, outcome)
+}
+```
+
+Record results in `verdict.json.cross_type_probe` /
+`.operator_variants_probe` (see Evidence package below).
+
+**Sanity check:** if the probe surfaces a *new* bug in a sibling
+type that the original report didn't mention, that often
+warrants its own JIRA (the verdict note should flag the new-JIRA
+candidate). The original issue's verdict still reflects the
+original report; the sibling-type bug is a separate finding.
+
 ## Evidence package
 
 For each reproducer run, persist:
@@ -298,19 +429,53 @@ For each reproducer run, persist:
 - `original.<ext>` — the literal source from JIRA, untouched, when
   extracted.
 - `run.log` — stdout + stderr from the run, with the exact command
-  on the first line.
-- `verdict.json` — `{ "key": "GROOVY-NNNNN", "shape": "<A|B|…>",
-  "classification": "<one of: same-failure | different-failure |
-  passes | cannot-run-extraction | cannot-run-environment |
-  cannot-run-dependency | timeout | needs-separate-workspace>",
-  "rev": "<short-sha>", "jdk": "<vendor+version>",
-  "command": "<verbatim>", "runtime-ms": <int>,
-  "exit-code": <int>, "matched-original-failure": <bool>,
-  "notes": "<short>" }`.
-
-This package is what a committer needs to trust the verdict, and it
-is what [`groovy-reassess`](../groovy-reassess/SKILL.md) feeds into
-its report.
+  on the first line plus `rev`, `jdk`, started/ended timestamps.
+- `cross-type-probe.<groovy|log>` and/or
+  `operator-variants-probe.<groovy|log>` — optional, when a
+  cross-family probe was run (see *Cross-family probes* above).
+- `cross-type-probe-findings.md` — optional, when the probe
+  surfaced project-wide signal worth surfacing separately.
+- `verdict.json` — the structured classification. Schema:
+
+```json
+{
+  "key": "GROOVY-NNNNN",
+  "shape": "A | B | C | D | E-vague | E-precise | F | G | H",
+  "classification": "fixed-on-master | still-fails-same | still-fails-different | cannot-run-extraction | cannot-run-environment | cannot-run-dependency | timeout | intended-behaviour | duplicate-of-resolved | needs-separate-workspace",
+  "nature": "bug-as-advertised | bug-as-advertised-partial-fix | feature-request | feature-request-disguised-as-bug | intended-and-documented",
+  "rev": "<short-sha>",
+  "jdk": "<vendor + version>",
+  "command": "<verbatim>",
+  "runtime_ms": <int or null>,
+  "exit_code": <int>,
+  "matched_original_failure": <bool>,
+  "cases": [                                      // optional; multi-case reproducers only
+    {
+      "expr": "<expression / sub-case>",
+      "expected": "<expected outcome>",
+      "actual_master": "<observed on master>",
+      "match_on_master": <bool>,
+      "history": [{"year": <int>, "status": "...", "source": "..."}],
+      "note": "<short>"
+    }
+  ],
+  "cases_summary": "<one-line roll-up>",          // optional
+  "cross_type_probe": { "file": "...", "log": "...", "findings": "...", "summary": "..." },     // optional
+  "operator_variants_probe": { "file": "...", "log": "...", "summary": "..." },                 // optional
+  "notes": "<long-form analysis and recommendation>"
+}
+```
+
+Keys use **snake_case** (`runtime_ms`, not `runtime-ms`) so
+`jq` queries don't need quoting. The `nature` field is
+orthogonal to `classification` and answers the question "is
+this not operating as advertised, or is this wouldn't-it-be-nice?"
+— see [`groovy-reassess`](../groovy-reassess/SKILL.md) for how
+the campaign uses it.
+
+This package is what a committer needs to trust the verdict, and
+it is what [`groovy-reassess`](../groovy-reassess/SKILL.md) feeds
+into its report.
 
 ## Validation checklist
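
The substring trap named in pitfall 14 above is easy to reproduce. A minimal sketch in Groovy, assuming a made-up XML fragment; the `output` string and the variable names are illustrative and do not come from any real reassessment run.

```groovy
// Made-up output that declares only the xsi prefix, not xs.
def output = '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>'

// Plain substring check: matches because 'xmlns:xs' is a prefix of 'xmlns:xsi'.
def naive = output.contains('xmlns:xs')

// Delimited regex: requires '=' and the opening quote right after the prefix name.
def delimited = (output =~ /xmlns:xs="/).find()

println "naive=${naive} delimited=${delimited}"
```

The naive check would have reported the `xs` declaration as present, which is the kind of false positive that can turn into an over-claimed `fixed-on-master` verdict.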
 
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index b2c78e83a6..216e303c4c 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -228,6 +228,58 @@ above. Each bites contributors quickly if missed:
   `INSTRUCTION_SELECTION` or later. See the
   [Compilation pipeline](#compilation-pipeline) phase table.
 
+## Operator families
+
+Several Groovy operators and expression forms are defined for
+**multiple backing types**, and behaviour across the members of a
+family is sometimes inconsistent. When investigating a bug
+reported for one type, probing the same expression across siblings
+often surfaces nuance the reporter missed — a hidden bug in a
+sibling type, confirmation that an asymmetry spans the whole
+family, or a project-wide spec gap that wasn't visible from a
+single-type report.
+
+### Type families
+
+| Family | Members | Notes |
+|---|---|---|
+| **Range / index operators** (`agg[idx]`, `agg[range]`) | `List` / `Object[]` / primitive arrays (`int[]`, `long[]`, …) / `String` / `CharSequence` | Different exception classes (`IndexOutOfBoundsException` vs `ArrayIndexOutOfBoundsException` vs `StringIndexOutOfBoundsException`). Negative-endpoint and out-of-range-negative semantics have historically diverged across types — see GROOVY-3974 for a concrete example surfaced by cross-type probing. |
+| **GPath expressions** (`x.y.z`, `x?.y`, `x*.y`) | In-memory (Map, List, nested combinations) / JSON (`JsonSlurper`) / XML (`XmlSlurper` / `XmlParser`) / POGO / Java POJO / SQL result sets (`groovy.sql.Sql`) | XML has special handling for attributes (`@attr` syntax) and returns empty `NodeChild` collections on missing children rather than null. Map/JSON return null on missing keys. POGOs and POJOs throw `MissingPropertyException` for missing properties — the asymmetry is by-design (each [...]
+| **Numeric coercion** (`+`, `-`, `*`, `/`, comparison) | `int` / `long` / `BigInteger` / `BigDecimal` / `double` / `Float` / `Long` (boxed) | Coercion rules vary; the result type of `int + BigDecimal` may surprise. |
+
+### Operator-variant families
+
+Some operators have **multiple syntactic variants** that share a
+family but dispatch differently:
+
+| Family | Variants | Dispatch notes |
+|---|---|---|
+| **Safe navigation** | `?.` (SAFE_DOT) / `??.` (SAFE_CHAIN_DOT — shorthand for chained `?.`) / `?[..]` (SAFE_INDEX) | `?.` and `??.` call `getProperty(String)`. `?[..]` calls `getAt(Object)`, but on POGOs that routes through `getProperty` for missing keys, so the variants behave identically for POGO missing-property access. |
+| **Spread** | `*.` / `*[..]` / `*:` | Different unpacking semantics across iteration / indexing / map-merge. |
+| **Equality / identity** | `==` / `.equals()` / `is` | `==` is `equals`-based in Groovy (not reference-equality as in Java); `is` is Java's `==` (reference). |
+| **Coercion** | `as` / `asType()` / constructor + `from` | Different conversion paths; `as` is statically-resolvable, `asType` is dynamic. |
+| **Range** | `..` / `..<` / `..>` | Endpoint inclusion / direction differences. |
+| **Elvis / null-coalesce** | `?:` and elaborations | Truthy-vs-null differences in the left-hand side. |
+### Why this matters for investigation
+
+For an investigation of a bug in one family member, probing across
+siblings is a recurring technique. A ~50-line probe script
+(constructing each backend, running the same expression, recording
+outcomes in a table) is usually enough to:
+
+- confirm whether an asymmetry the reporter found spans the family
+  or is type-specific;
+- surface a hidden bug in a sibling type the reporter didn't test
+  (and which may warrant its own JIRA);
+- reveal that what looks like a bug is actually consistent
+  documented behaviour with a documented or implicit workaround in
+  a sibling form.
+
+See `.agents/skills/groovy-reproducer/SKILL.md`'s "Cross-family
+probes" section for the AI-tooling pattern. The probe approach is
+equally useful when investigating by hand.
+
 ## Generated code
 
 The following are produced by the build and regenerated on every
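
The probe technique described under "Operator families" can be made concrete for the range / index family. A minimal sketch in Groovy that fills in the probe template from the `groovy-reproducer` skill; the three-element aggregates, the out-of-range index 5, and the `results` map are illustrative assumptions rather than anything taken from a JIRA report.

```groovy
// Each closure applies the same expression, agg[5], to a three-element aggregate.
def probes = [
    'List'     : { -> [1, 2, 3][5] },
    'Object[]' : { -> ([1, 2, 3] as Object[])[5] },
    'int[]'    : { -> ([1, 2, 3] as int[])[5] },
    'String'   : { -> 'abc'[5] },
]

// Collect one outcome per family member: the returned value or the exception class.
def results = [:]
probes.each { name, body ->
    def outcome
    try {
        outcome = String.valueOf(body())
    } catch (Throwable t) {
        outcome = "THREW: ${t.class.simpleName}".toString()
    }
    results[name] = outcome
    println String.format('%-20s | %s', name, outcome)
}
```

The printed table is the comparison artefact; any row that differs from its siblings is the asymmetry worth recording in the verdict notes.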
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 90256ca040..4495071e91 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -318,6 +318,18 @@ platform issues but are actually project-specific:
   Without forwarding, the gated test always skips even with
   `-Djunit.network=true` on the Gradle CLI.
 
+- **Treacherous substring matching in verification logic.** When
+  scripting verification (e.g. checking whether some token
+  survived a transformation), plain `.contains()` can silently
+  produce false positives when tokens share a prefix:
+  `output.contains('xmlns:xs')` also matches inside `xmlns:xsi`.
+  Prefer a delimited regex (`output =~ /xmlns:xs="/`) or parsed-tree
+  inspection (`new XmlSlurper().parseText(output)`) over substring
+  matching. The trap also bites verification logic written for
+  reassessments and triage probes, not just tests; the principle
+  applies anywhere you're matching tokens with shared prefixes
+  (`xs` / `xsi`, `groovy` / `groovy-`).
+
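The trap is easy to demonstrate. In the sketch below, `declaresPrefix` is a hypothetical helper written for this illustration (it is not part of Groovy or of the project), and the sample `output` string is made up.

```groovy
import java.util.regex.Pattern

// Hypothetical helper: true only when `prefix` is declared as a namespace
// prefix, i.e. the prefix is followed immediately by `="`.
def declaresPrefix = { String xml, String prefix ->
    (xml =~ ('xmlns:' + Pattern.quote(prefix) + '="')).find()
}

def output = '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>'

assert output.contains('xmlns:xs')      // false positive: hits inside xmlns:xsi
assert !declaresPrefix(output, 'xs')    // delimited match: xs is not declared
assert declaresPrefix(output, 'xsi')    // the prefix that is actually present
```

Quoting the prefix with `Pattern.quote` keeps the helper safe if a token ever contains regex metacharacters.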
 ### For agents working on tests
 
 The [`.agents/skills/groovy-tests/SKILL.md`](.agents/skills/groovy-tests/SKILL.md)
@@ -506,6 +518,28 @@ project = GROOVY AND statusCategory = Done AND resolution 
is EMPTY
 project = GROOVY AND parent = GROOVY-<NNNN>
 ```
 
+#### Pool-selection heuristics for re-triage sweeps
+
+When picking a pool of issues to work through (whether by hand or
+in a tool-assisted sweep), the pool's *status mix* shapes what
+you'll find:
+
+- **`status = Reopened`** — small pool, over-represents
+  *feature-requests-disguised-as-bugs*: issues someone re-examined
+  and left open while pondering a spec change. Useful when the
+  goal is *"what spec debates is the project sitting on?"*.
+- **`status = Open AND affectedVersion in ("<EOL-versions>")`** —
+  larger pool, over-represents *silent fixes*, *real open bugs*,
+  and *partial fixes*. Useful when the goal is *"find what's been
+  silently resolved by later refactoring."* Target affected
+  versions that pre-date a known major refactor of the relevant
+  subsystem.
+
+The two pools answer different questions; choose deliberately
+rather than by default-sort. See *Nature analysis* under
+[Triaging issues and pull requests](#triaging-issues-and-pull-requests)
+for what the populations look like in practice.
+
 ### Linking JIRA to commits and pull requests
 
 Every commit or pull request that fixes a JIRA-tracked issue
@@ -559,9 +593,17 @@ For each issue:
    reporter included a stack trace, the top non-JDK frame usually
    points at the area of the code involved.
 
-2. **Search for duplicates and related work** before writing a long
-   analysis. A one-line "duplicate of GROOVY-XXXX, fixed in 4.0.Y"
-   beats a 500-word root-cause summary of a known bug.
+   Also scan for **historical baselines**: prior committer comments
+   that say "I just ran this on version X, here's what I got." These
+   are checkpoints the current triage can compare against — the
+   headline finding for an old issue may be "state unchanged since
+   the 2013 committer baseline" rather than the current state alone.
+
+2. **Search for duplicates and same-family related work** before
+   writing a long analysis. A one-line "duplicate of GROOVY-XXXX,
+   fixed in 4.0.Y" beats a 500-word root-cause summary of a known
+   bug. A "GROOVY-YYYY is in the same neighbourhood" pointer helps
+   the committer decide on batch-or-sequential treatment.
 
    ```
    git log --grep='GROOVY-<NNNN>'                # commits referencing the JIRA
@@ -570,7 +612,8 @@ For each issue:
 
    Plus a JQL text search — see the
    [Duplicate hunting by error string](#searching-with-jql) recipe
-   above.
+   above. The same JQL with a topic-keyword surfaces same-family
+   open issues: `project = GROOVY AND text ~ "<topic>"`.
 
 3. **Attempt reproduction on `master`.** Drop the reporter's script
    into a temp file and run it against a local build, or paste their
@@ -581,6 +624,21 @@ For each issue:
    that's now passing on `master` is meaningful information; a
    missing reproduction is speculation.
 
+   When the issue thread contains **multiple distinct reproducers**
+   (the description plus a follow-up comment that demonstrates a
+   different symptom of the same root cause), run each. They may
+   have different fates: one silently fixed, another still broken
+   — that's a *split-candidate* signal (see step 7 below).
+
+   When the operator or expression under test spans **multiple
+   backing types** (range/index on `List` / `Object[]` / `int[]` /
+   `String`; GPath on Map / JSON / XML / POGO / POJO) or has
+   **multiple operator variants** (safe-navigation `?.` / `??.` /
+   `?[..]`), probing the same expression across the family is
+   often cheap and surfaces signal the reporter missed. See the
+   family taxonomies in [`ARCHITECTURE.md`'s "Operator families"
+   section](ARCHITECTURE.md#operator-families).
+
 4. **Locate the code, lightly.** If the failure reaches the
    runtime/compiler, identify the package or class the stack flows
    through — that's the "where" to point a fix at. Don't go deeper
@@ -592,15 +650,47 @@ For each issue:
    [Fields and who sets them](#fields-and-who-sets-them) and
    [Components](#components).
 
-6. **Draft a comment** with: the state of the reproduction
+6. **Search for documented workarounds.** Before recommending a
+   closure path, check three places:
+
+   - `src/spec/doc/` (user-facing docs) — `grep -rn '<topic>' src/spec/doc/`
+   - Groovydoc on the relevant source classes — `grep -B2 -A8` on
+     the source file
+   - JIRA comments — keywords like `workaround`, `prefix`,
+     `coerce`, `use ... instead`
+
+   The outcome shapes the recommended close path:
+
+   - **Workaround documented in user-facing docs** → close as "Not A
+     Bug" or "Won't Fix"; cite the doc.
+   - **Workaround exists but is missing from user-facing docs** →
+     close as "Not A Bug **+ add docs**". The documentation deliverable
+     is the actionable artefact; closing without it leaves the
+     surprise intact for the next user.
+   - **No workaround** → keep open OR re-type as Improvement (per
+     nature analysis, below).
+
+7. **Consider split candidacy.** When the issue bundles **multiple
+   reproducers with mixed fates** (some silently fixed, some still
+   broken) or **multiple user-visible symptoms with independently-
+   fixable causes**, recommend a split: close the original with a
+   per-case summary, suggest a focused new JIRA for the remaining
+   unfixed case(s) with the targeted reproducer. Old multi-case
+   JIRAs often resolve partially over time; the constructive close
+   path is a per-case status update plus carry-over.
+
+8. **Draft a comment** with: the state of the reproduction
    (passed / failed / could not run, with revision + JDK), the
    duplicate-search result, the likely area of the code, suggested
-   missing fields, and a recommended next action ("needs a minimal
-   reproducer," "looks fixed on master — propose closing as Cannot
-   Reproduce after a second pair of eyes," "appears to need a fix
-   in `<area>`"). Factual, helpful, specific.
-
-7. **Don't transition the issue.** Even when the recommendation is
+   missing fields, the workaround-search outcome, and a recommended
+   next action ("needs a minimal reproducer," "looks fixed on
+   master — propose closing as Cannot Reproduce after a second pair
+   of eyes," "appears intended-behaviour-but-undocumented; propose
+   closing + docs PR," "appears to need a fix in `<area>`,"
+   "consider splitting — A is fixed, B remains"). Factual, helpful,
+   specific.
+
+9. **Don't transition the issue.** Even when the recommendation is
    clear, leave the workflow state to a committer.
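
The "run each reproducer" advice in step 3 and the split-candidate check in step 7 can be sketched together. This is an illustration only: the two scripts are made-up stand-ins for reproducers taken from an issue thread, and evaluating them in-process tests the Groovy runtime executing the sketch, which is not necessarily a build of `master`.

```groovy
// Hypothetical reproducers lifted from one issue thread: label -> script text.
def reproducers = [
    'description': 'assert [1, 2, 3][0..1] == [1, 2]',
    'comment-3'  : 'assert "abc".reverse() == "abc"',
]

// Run each script in a fresh shell and record whether it passed.
def fates = reproducers.collectEntries { label, script ->
    def passed
    try {
        new GroovyShell().evaluate(script)
        passed = true
    } catch (Throwable t) {
        passed = false
    }
    [(label): passed]
}

// Mixed fates (some pass, some fail) are the split-candidate signal.
def splitCandidate = fates.values().toSet().size() > 1

fates.each { label, passed -> println "${label}: ${passed ? 'passes' : 'fails'}" }
println "split candidate: ${splitCandidate}"
```

The output is evidence for the drafted comment, not a decision: whether to split remains a committer's call.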
 
 ### Triaging a pull request
@@ -649,6 +739,36 @@ For each PR:
    reformat, then nits. Use file-path:line references so committers
    can jump to each finding.
 
+### Nature analysis: bug-as-advertised vs wouldn't-it-be-nice
+
+Before recommending a close path, ask: **is this not operating as
+advertised, or is this 'wouldn't it be nice if'?** The answer can
+call for a different action than the reproducer's outcome alone
+suggests: two issues can both reproduce verbatim and still need
+entirely different close paths.
+
+| Nature | Meaning | Recommended action |
+|---|---|---|
+| **bug-as-advertised** | A documented or implicit promise isn't being kept. The code does not deliver what its signature, Groovydoc, or spec says. | Fix it. The reproducer is the regression-test target. |
+| **bug-as-advertised, partial fix** | Originally multi-case; some cases silently fixed, others still broken. | Split — close original with per-case summary, open focused new JIRA for the remaining case(s). See "Consider split candidacy" in the procedure above. |
+| **feature-request** | The reporter wants a different spec; the behaviour matches what's promised. JIRA is correctly typed as Improvement. | Re-typing not needed. Decide on `dev@` whether to accept the Improvement. |
+| **feature-request-disguised-as-bug** | Same as above but mis-typed as Bug in JIRA — the reporter framed an unmet wish as a defect. | Recommend re-typing Bug → Improvement, then design discussion on `dev@`. |
+| **intended-and-documented** | Behaviour is correct *and* the docs clearly cover it. | Close as Not A Bug. The issue is the reporter not having found the docs; consider whether a docs cross-link or better discoverability would help. |
+
+Two pool-level observations are useful here:
+
+- **The Reopened pool over-represents the feature-request shapes.**
+  An issue that's been reopened is one someone re-examined and
+  consciously left open while pondering a spec change.
+- **The Open + EOL-affected-version pool over-represents the
+  bug-as-advertised shapes.** Issues nobody looked at while the
+  runtime/compiler under them was rewritten are more likely to be
+  silent fixes or genuine open bugs than wishlists.
+
+The distinction matters when choosing which JIRAs to work through
+in a re-triage pass: pick the pool that matches what you're
+hunting for.
+
 ### Drafting a useful comment or review
 
 Whether triaging an issue or a PR, the output is:
