This is an automated email from the ASF dual-hosted git repository.
paulk-asert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/groovy.git
The following commit(s) were added to refs/heads/master by this push:
new ca916452b7 AI readiness: consolidate human and AI docs
ca916452b7 is described below
commit ca916452b7c32f7090c4add0df90bd838ca70e82
Author: Paul King <[email protected]>
AuthorDate: Wed May 13 05:46:55 2026 +1000
AI readiness: consolidate human and AI docs
---
.agents/skills/groovy-reassess/SKILL.md | 154 ++++++++++++++++++---
.agents/skills/groovy-reproducer/SKILL.md | 219 ++++++++++++++++++++++++++----
ARCHITECTURE.md | 52 +++++++
CONTRIBUTING.md | 142 +++++++++++++++++--
4 files changed, 510 insertions(+), 57 deletions(-)
diff --git a/.agents/skills/groovy-reassess/SKILL.md b/.agents/skills/groovy-reassess/SKILL.md
index cfbd0b8b58..dd86ca22d8 100644
--- a/.agents/skills/groovy-reassess/SKILL.md
+++ b/.agents/skills/groovy-reassess/SKILL.md
@@ -18,7 +18,7 @@
-->
---
name: groovy-reassess
-description: Running a bulk reassessment campaign over old GROOVY JIRA issues
— narrow JQL selection, per-issue reproducer extraction and execution via
`groovy-reproducer`, classification (`fixed-on-master` / `still-fails-same` /
`still-fails-different` / `cannot-run-*` / `intended-behaviour` /
`duplicate-of-resolved` / `timeout`), structured report and per-issue evidence
package, and a strict hand-back contract — no JIRA comments, no transitions, no
closures posted on behalf of the proj [...]
+description: Running a bulk reassessment campaign over old GROOVY JIRA issues
— narrow JQL selection (Reopened pool surfaces wishlists; Open+EOL surfaces
silent fixes and real bugs), per-issue reproducer extraction and execution via
`groovy-reproducer`, classification (`fixed-on-master` / `still-fails-same` /
`still-fails-different` / `cannot-run-*` / `intended-behaviour` /
`duplicate-of-resolved` / `timeout`), orthogonal `nature` analysis
(`bug-as-advertised` vs `feature-request-disguis [...]
license: Apache-2.0
compatibility: claude, codex, copilot, cursor, gemini, aider
metadata:
@@ -164,10 +164,24 @@ These are the recurring mistakes at the campaign level:
Selection drives the campaign's quality. Pick a *narrow*, *bounded*
slice; do not boil the ocean.
+**Pool choice matters.** The pool-selection heuristics — which
+status / affected-version slices over-represent which natures —
+live in [`CONTRIBUTING.md`'s "Pool-selection heuristics for
+re-triage sweeps"](../../../CONTRIBUTING.md#pool-selection-heuristics-for-re-triage-sweeps).
+Choose the pool deliberately based on what the campaign is hunting
+for.
+
JQL building blocks come from
[`groovy-jira`](../groovy-jira/SKILL.md). Useful slices for
reassessment:
+- **Reopened (wishlist hunting):** `project = GROOVY AND status =
+ Reopened ORDER BY updated ASC`. Small, well-bounded pool.
+- **Open + EOL bugs (silent-fix hunting):** `project = GROOVY AND
+ status = Open AND issuetype = Bug AND affectedVersion in
+ ("1.5.6", "1.6.0", "1.6.5", "1.7.0", "1.8.0") ORDER BY created
+ ASC`. The version list should target releases before a known
+ major refactor of the relevant subsystem.
- **Age bucket × open:** `project = GROOVY AND statusCategory != Done
AND created < "2020/01/01" AND created >= "2018/01/01"
ORDER BY created ASC`. Buckets of two years are scannable; ten
@@ -176,9 +190,6 @@ reassessment:
!= Done AND component = "<X>" AND updated < -730d`. Pairs well
with an area you know — your fix-side strength shapes the
candidate set.
-- **Affected version end-of-life:** `project = GROOVY AND
- statusCategory != Done AND affectedVersion = "2.4.x"` — versions
- long out of support are high-yield for `fixed-on-master`.
- **No component (triage-then-reassess):** `project = GROOVY AND
statusCategory != Done AND component is EMPTY`. The reassessment
can also produce a `Component/s` suggestion (see
@@ -186,9 +197,9 @@ reassessment:
Cap the per-session set. A practical first pilot is 5–10 issues
spanning *different reproducer shapes* (one runnable script, one
-attachment, one prose-only, one `@Grab`, one comment-with-snippet)
-so the pipeline meets each shape early. Pilots beat the first
-hundred issues you'd naturally pick.
+attachment, one prose-only-but-precise, one `@Grab`,
+one comment-with-snippet) so the pipeline meets each shape early.
+Pilots beat the first hundred issues you'd naturally pick.
## Procedure
@@ -205,21 +216,90 @@ For each campaign session:
`original.<ext>`, `run.log`, `verdict.json`. This is *not* in
the Groovy checkout.
3. **For each issue, in order:**
- - Skip if its `verdict.json` already exists and is well-formed
- (resumability).
- - Read the JIRA issue and skim comments for an obvious
- "already fixed" / "won't fix" / "see GROOVY-XXXX" — early
- classifications save time.
- - Hand off to [`groovy-reproducer`](../groovy-reproducer/SKILL.md)
- for extraction, adaptation, running, and evidence capture.
- - Read the classification from `verdict.json`.
- - Reset the working tree before the next issue.
+
+ **a. Resumability check.** Skip if its `verdict.json` already
+ exists and is well-formed. The per-issue evidence files on
+ disk are the resumption point — in-memory campaign state is
+ not.
+
+ **b. Triage the issue per
+ [`CONTRIBUTING.md`'s "Triaging a JIRA issue"](../../../CONTRIBUTING.md#triaging-a-jira-issue).**
+ That procedure is the canonical methodology — read the thread
+ (with historical baselines), search for duplicates and
+ same-family related work, attempt reproduction (including
+ per-reproducer execution when the thread contains multiple
+ distinct reproducers), locate the code, check JIRA fields,
+ search for documented workarounds, and assess split
+ candidacy. The bulk of the per-issue work is in that section.
+
+ **AI-specific recording** during this step — these are the
+ campaign's value-adds on top of the manual methodology:
+ - Historical baselines from comments → `verdict.json.cases[].history`
+ - Related-JIRA citations from the JQL scan → `verdict.json.notes`
+ - Per-reproducer outcomes (when multiple reproducers exist) →
+ `verdict.json.cases` (multi-case array)
+ - Workaround-search outcome and the close-path it implies →
+ `verdict.json.notes`
+ - Split-candidacy recommendation → `verdict.json.notes`
+
+ **c. Hand off to
+ [`groovy-reproducer`](../groovy-reproducer/SKILL.md)** for the
+ reproducer extraction, adaptation, running, and evidence
+ capture. This includes the optional cross-family probe when
+ the operation under test spans multiple types or operator
+ variants — see
+ [ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families)
+ for the family taxonomies.
+
+ **d. Apply nature analysis.** Per
+ [`CONTRIBUTING.md`'s "Nature analysis"](../../../CONTRIBUTING.md#nature-analysis-bug-as-advertised-vs-wouldnt-it-be-nice),
+ determine whether the issue is `bug-as-advertised`,
+ `feature-request-disguised-as-bug`, `intended-and-documented`,
+ etc. Record in `verdict.json.nature`. This is orthogonal to
+ `classification` (a still-reproducing wishlist and a
+ still-reproducing bug recommend different actions).
+
+ **e. Compose the final `verdict.json`.** Classification +
+ nature + cases (if multi-case) + cross-type-probe /
+ operator-variants-probe (if run) + notes with the close-path
+ recommendation. See
+ [`groovy-reproducer`](../groovy-reproducer/SKILL.md)'s
+ Evidence package section for the schema.
+
+ **f. Reset the working tree** before the next issue.
4. **After the loop, build the report** (see below). Do *not*
post anything.
5. **Hand back** to the human — branch (if any local Groovy
commits, e.g. adapted `@Test` files kept for follow-up),
scratch corpus path, report path.
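
The resumability check in step 3a can be sketched in Groovy as below. This is a minimal sketch under stated assumptions: the closure name `hasUsableVerdict` is hypothetical, and reading "well-formed" as "parses as a JSON object with non-empty `classification` and `nature`" is an assumption, not the skill's definition.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: true when the issue directory already holds a usable verdict.
def hasUsableVerdict = { File issueDir ->
    def verdictFile = new File(issueDir, 'verdict.json')
    if (!verdictFile.isFile()) return false
    try {
        def verdict = new JsonSlurper().parse(verdictFile)
        // Assumption: "well-formed" means a JSON object carrying both required fields.
        return verdict instanceof Map && verdict.classification && verdict.nature
    } catch (Exception ignored) {
        return false   // truncated or corrupt file: treat the issue as not yet done
    }
}
```

Because the check reads only the file on disk, a session interrupted mid-campaign can restart the loop from the top and skip finished issues without any in-memory state.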
+## Nature analysis (AI-recording layer)
+
+The nature taxonomy and the "bug-as-advertised vs wouldn't-it-be-
+nice" question live in
+[`CONTRIBUTING.md`'s "Nature analysis" section](../../../CONTRIBUTING.md#nature-analysis-bug-as-advertised-vs-wouldnt-it-be-nice).
+That section is the canonical source; apply it per-issue during
+procedure step 3d above.
+
+This skill's contribution is to **record** the result in
+`verdict.json.nature` so the campaign-level aggregator can roll
+up nature classifications across the issue set (e.g. "of N
+issues reassessed, M classified as
+`feature-request-disguised-as-bug` — those rows recommend
+re-typing rather than fixing"). The allowed values:
+
+- `bug-as-advertised`
+- `bug-as-advertised-partial-fix` (paired with a split
+ recommendation in `notes`)
+- `feature-request` (correctly typed as Improvement; no re-type
+ needed)
+- `feature-request-disguised-as-bug` (Bug → Improvement re-type
+ recommendation)
+- `intended-and-documented`
+
+See the verdict-template `_field_help` for the schema. Apply
+exactly one value per verdict — `nature` is required.
+
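
The roll-up described above could look roughly like this in Groovy. It is a sketch only: the in-memory `verdicts` list and its placeholder keys are hypothetical, and a real aggregator would load each `verdict.json` from the scratch corpus instead.

```groovy
// Allowed values, copied from the list above.
def allowedNatures = [
    'bug-as-advertised',
    'bug-as-advertised-partial-fix',
    'feature-request',
    'feature-request-disguised-as-bug',
    'intended-and-documented',
] as Set

// Hypothetical sample data standing in for parsed verdict.json files.
def verdicts = [
    [key: 'GROOVY-A', nature: 'bug-as-advertised'],
    [key: 'GROOVY-B', nature: 'feature-request-disguised-as-bug'],
    [key: 'GROOVY-C', nature: 'feature-request-disguised-as-bug'],
    [key: 'GROOVY-D', nature: 'intended-and-documented'],
]

// Reject free-form labels before aggregating.
def unknown = verdicts.findAll { !(it.nature in allowedNatures) }
assert unknown.isEmpty() : "free-form nature values: ${unknown*.key}"

// Count verdicts per nature value and print one row per value.
def rollup = verdicts.countBy { it.nature }
rollup.each { nature, count -> println "${nature.padRight(36)} ${count}" }
```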
## Classification taxonomy
The campaign uses the per-issue classifications produced by
@@ -282,6 +362,19 @@ The campaign produces:
for a real regression test).
- A short verbal summary: "Swept N issues. M `fixed-on-master`, K
`still-fails-same`. Headlines: GROOVY-A, GROOVY-B, …."
+- **Split recommendations** for any issue where the reassessment
+ surfaced multiple cases with distinct fates (some silently fixed,
+ some still failing) or distinct user-visible symptoms.
+ Format: "close GROOVY-X with per-case summary; open new JIRA
+ for GROOVY-X-residual-A (the remaining case)."
+- **New-JIRA candidates** surfaced by cross-family probes — when
+ probing a sibling type or operator variant uncovered a bug the
+ original report didn't contain, flag it as a candidate for its
+ own JIRA. The verdict notes should sketch the 4-line reproducer.
+- **Documentation candidates** — issues where the close path is
+ "Not A Bug + add docs" (workaround exists but undocumented; spec
+ gap; behaviour-by-design but surprising). The report should
+ list each docs deliverable with a one-line scope.
The campaign does **not**:
@@ -311,25 +404,48 @@ the project communicates.
Before declaring a campaign session complete:
+- [ ] **Pool was chosen deliberately** (Reopened vs Open+EOL) and
+ the choice matches the campaign's goal (spec-debate review vs
+ silent-fix hunting).
- [ ] Candidate set was bounded *before* the loop started; the
bound is recorded in the report.
- [ ] Per-issue evidence package on disk for every candidate, with
`verdict.json` and a non-empty `description.md`.
-- [ ] Every classification used is one of the taxonomy entries; no
- free-form labels.
+- [ ] Every `classification` used is one of the taxonomy entries;
+ every `nature` used is one of the nature values; no free-form
+ labels.
+- [ ] **`nature` populated** on every verdict — orthogonal to
+ classification.
+- [ ] **Historical baselines from the comment thread** are recorded
+ in `cases[].history` where applicable, so the headline
+ finding can be "unchanged since baseline X" rather than just
+ "current state is Y."
+- [ ] **Workaround search done** for every still-fails-* verdict
+ (`src/spec/doc/` + Groovydoc + JIRA comments). The
+ recommendation distinguishes "close as Not A Bug" from
+ "close + add docs."
+- [ ] **Related-JIRA scan** done for every verdict; same-family
+ issues cited in notes.
- [ ] `fixed-on-master` rows include the rev and JDK in the
evidence; the verdict isn't over-claimed.
- [ ] `still-fails-same` rows are surfaced at the top of the
report, not buried.
- [ ] `cannot-run-*` rows have a concrete reason in the evidence
(which dep failed, what was missing) — not "could not run."
+- [ ] **Split candidates flagged** for any issue with multi-case
+ mixed fates or distinct user-visible symptoms.
+- [ ] **Cross-family probe results recorded** in
+ `cross_type_probe` or `operator_variants_probe` when the
+ probe was run; any sibling-type bugs are flagged as new-JIRA
+ candidates.
- [ ] No JIRA mutation occurred. No PR was opened. No dev@ post
was sent.
- [ ] The report opens with a summary stanza a committer can scan
in 30 seconds.
- [ ] Working tree was clean at the end of the session.
- [ ] Hand-back artefact lists the scratch corpus path, the report
- path, any local commits worth keeping, and the recommended
+ path, any local commits worth keeping, any split or
+ documentation candidates surfaced, and the recommended
publication path (typically a single dev@ thread).
## References
diff --git a/.agents/skills/groovy-reproducer/SKILL.md b/.agents/skills/groovy-reproducer/SKILL.md
index d45b89de89..388232454d 100644
--- a/.agents/skills/groovy-reproducer/SKILL.md
+++ b/.agents/skills/groovy-reproducer/SKILL.md
@@ -107,11 +107,16 @@ reproducers:
stop. The reporter's specific code is what makes a reproduction
trustworthy; an agent-written stand-in is a different exercise
(and a different verdict).
-2. **Skipping the comment thread.** Reporters frequently post a
- simplified reproducer in a comment after the initial description.
- Reading only the description misses it. Inventory every code
- block in the description *and* every comment *and* every
- attachment before picking a candidate.
+2. **Skipping the comment thread, or only running the headline
+ reproducer.** Reporters frequently post a simplified reproducer
+ in a comment after the initial description, and may follow up
+ with additional cases that exercise different symptoms of the
+ same root cause. Inventory every code block in the description
+ *and* every comment *and* every attachment, and when distinct
+ reproducers exist, **run each and record per-reproducer
+ outcomes** — not just the headline one. The `cases` array in
+ `verdict.json` (see Evidence package below) carries
+ per-reproducer state for multi-case issues.
3. **Treating `@Test` adaptation as equivalent to a script run.**
Groovy scripts and class methods have different scoping (script
bindings vs. fields, implicit `main`, `def` vs. typed locals).
@@ -172,6 +177,26 @@ reproducers:
`~/.groovy/grapes/` (see GROOVY-12005). For a campaign,
consider a per-sweep Grape root via `-Dgrape.root=<scratch>` so
the user's everyday cache stays clean.
+14. **Treacherous substring matching in verification logic.** Same
+ trap covered in
+ [`CONTRIBUTING.md`'s "Test-writing pitfalls"](../../../CONTRIBUTING.md#test-writing-pitfalls)
+ — applies equally to reproduction verification scripts.
+ Substring matching near common prefixes (`xs` / `xsi`,
+ `groovy` / `groovy-`) silently produces false positives.
+ Prefer anchored regex or parsed-tree inspection. The
+ "verify identifiers" discipline applies to the verification
+ logic itself, not just the code under test — almost-shipped
+ false `fixed-on-master` results have hit this trap.
+15. **Reproducer-stale-due-to-API-evolution treated as a bug.**
+ Old reproducers may use classes that have moved or been
+ removed. A `ClassNotFoundException` on import isn't the
+ reporter's bug — it's mechanical adaptation territory. The
+ canonical mapping of class moves lives in the release notes
+ (Groovy 3.0 split-packages section is the largest); see
+ [ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families)
+ for the project-side context. Add the new import per that
+ mapping; don't classify as `still-fails-different` or
+ `cannot-run-environment`.
## Reproducer shape taxonomy
@@ -203,8 +228,23 @@ Recipe: classify `cannot-run-extraction`. The stack trace is a *hint
about the area*, not a reproducer. Don't construct code to "make
that stack trace appear"; that's fabrication.
-**E. Prose-only** — natural-language description, no code. Recipe:
-`cannot-run-extraction`. Don't write code from prose.
+**E-vague. Prose-only, no precise testable claim** —
+natural-language description without a specifiable behaviour
+("ConfigObject sometimes behaves weirdly"). Recipe:
+`cannot-run-extraction`. Don't write code from vague prose.
+
+**E-precise. Prose-only, but the prose IS a specifiable claim** —
+the description contains an algebraic / specifiable claim with no
+verbatim code but enough precision to construct a faithful test
+(e.g. *"`x?.y?.z` returns null on Maps but throws on POGOs"*).
+Recipe: construct a reproducer that tests **exactly that claim**
+— instantiate the explicit assertion, do not interpolate beyond
+it. Classify normally per the outcome. The distinction from
+fabrication: E-precise is *instantiation of an explicit claim*
+(the prose IS the spec); fabrication is *guessing at inputs,
+structure, or APIs the reporter didn't specify*. If the
+construction would require either, classify `cannot-run-extraction`
+and stop.
**F. Attachment** — `.groovy`, `.java`, `.zip` (project), `.txt`
(log), `.gz` (heap dump, etc.). For `.groovy` / `.java` files,
@@ -246,11 +286,36 @@ For each reproducer:
(guessing a type, inventing a missing variable), stop and
classify `cannot-run-extraction` with a note about what was
missing.
- - Shapes D, E: classify `cannot-run-extraction`; don't adapt.
+ - Shape D: classify `cannot-run-extraction`; don't adapt.
+ - Shape E-vague: classify `cannot-run-extraction`; don't write
+ code from prose without a precise claim.
+ - Shape E-precise: construct a reproducer that tests **only**
+ the explicit claim the prose makes. Cite the prose verbatim
+ in the comment header so the construction is auditable.
- Shape F: per inner shape; for project zips, classify
`needs-separate-workspace`.
- Shape G: copy verbatim; flag for Grape-aware running.
- Shape H: classify `needs-separate-workspace`.
+
+ **API-evolution adaptation.** Old reproducers may not compile
+ on modern Groovy because classes moved or were removed. This is
+ mechanical adaptation — *not* fabrication — when the move is
+ documented in the release notes. The Groovy 3.0 split-packages
+ refactor is the largest such reshuffle; see
+ [ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families)
+ for the project-side context (and the release-notes link there
+ for the canonical mapping).
+
+ When you make an adaptation under this rule:
+ - The body of the reproducer stays unchanged — only imports /
+ package references shift.
+ - Cite the release-notes section in `verdict.json.notes` so the
+ adaptation is auditable.
+ - If the adaptation requires *behavioural* changes (not just
+ imports) — e.g. a method signature changed — that's a
+ different classification: the reporter's claim might be
+ `still-fails-different` (if the new API behaves differently)
+ or you may need to escalate to `needs-info`.
5. **Build the current Groovy distribution** if the reproducer is a
script that needs the produced `groovy` binary. For `@Test`-shape
reproducers, the Gradle test invocation handles the build.
@@ -262,10 +327,29 @@ For each reproducer:
substring" is `same-failure`. "Fails with something different" is
`different-failure`. "Doesn't fail" is `passes`. "Hangs past
timeout" is `timeout`. "Errors before exercising the path" is
- `cannot-run-*`.
-8. **Record the evidence package** before doing anything else.
-9. **Reset the working tree** if you adapted as a `@Test` (the
- added file must not leak to the next issue).
+ `cannot-run-*`. For multi-case reproducers (a list of
+ assertions, a Shape-E-precise probe across backends), record
+ per-case state in `verdict.json.cases` so partial-fix patterns
+ are queryable — see Evidence package below.
+8. **Scan the JIRA's comment thread for historical baselines.** A
+ committer's prior "I just ran this on version X, here's what I
+ got" comment is a baseline worth comparing against, not just
+ the original report's claim. If found, record each baseline in
+ `verdict.json.cases[].history` (year, status, source). The
+ headline finding may be "the state hasn't changed since this
+ committer's baseline" rather than "the state is X today."
+9. **(Optional) Cross-family probe.** When the reproducer
+ exercises a behaviour defined for multiple backing types or via
+ multiple operator variants, run a quick probe across the
+ family — see *Cross-family probes* below. The probe is cheap
+ and consistently surfaces signal beyond the reporter's
+ framing (a project-wide spec gap, an additional bug in a
+ sibling type, or confirmation that the asymmetry spans the
+ whole operator family). Record results in
+ `verdict.json.cross_type_probe` or `.operator_variants_probe`.
+10. **Record the evidence package** before doing anything else.
+11. **Reset the working tree** if you adapted as a `@Test` (the
+ added file must not leak to the next issue).
## Run posture
@@ -275,8 +359,9 @@ For each reproducer:
failures get the `cannot-run-dependency` classification; they are
not "fixed."
- **Filesystem:** scratch directory per issue, under
- `~/work/groovy-reassessment/<KEY>/` (or wherever the campaign
- layout puts it — see [`groovy-reassess`](../groovy-reassess/SKILL.md)).
+ `~/work/groovy-reassess/<campaign-id>/<JIRA-KEY>/` (or wherever
+ the campaign layout puts it — see
+ [`groovy-reassess`](../groovy-reassess/SKILL.md)).
Don't write under the Groovy checkout.
- **Working tree:** clean between reproducers. The
added-and-then-removed `@Test` is the most common leak source.
@@ -287,6 +372,52 @@ For each reproducer:
matters (`passes`, `fixed-on-master`), retry on the
originally-affected JDK via Gradle toolchains where reasonable.
+## Cross-family probes (AI-tooling pattern)
+
+When the reproducer exercises a behaviour defined for **multiple
+backing types** or **multiple operator variants** in the language,
+probe the others. The pattern is cheap (~50 line script per
+family) and consistently surfaces signal beyond the reporter's
+framing.
+
+The **family taxonomies** (type families like `List`/`Object[]`/
+primitive arrays/`String`; operator-variant families like the
+three safe-navigation variants) live in
+[ARCHITECTURE.md "Operator families"](../../../ARCHITECTURE.md#operator-families).
+That section is the canonical reference for what to probe across
+and why the family members behave as they do (dispatch paths,
+known asymmetries). Apply it during procedure step 9.
+
+This skill's contribution is the **AI-tooling pattern for
+*running* the probe**: a small Groovy script that exercises each
+family member, emits a comparison table, and gets saved alongside
+`reproducer.<ext>` as `cross-type-probe.groovy` or
+`operator-variants-probe.groovy`.
+
+Probe template structure:
+
+```groovy
+def probes = [
+ 'Member A' : { -> /* construct backend A, exercise the expression */ },
+ 'Member B' : { -> /* same expression on backend B */ },
+ // ...
+]
+probes.each { name, body ->
+ def outcome
+ try { outcome = body() } catch (Throwable t) { outcome = "THREW: ${t.class.simpleName}" }
+ println String.format("%-20s | %s", name, outcome)
+}
+```
+
+Record results in `verdict.json.cross_type_probe` /
+`.operator_variants_probe` (see Evidence package below).
+
+**Sanity check:** if the probe surfaces a *new* bug in a sibling
+type that the original report didn't mention, that often
+warrants its own JIRA (the verdict note should flag the new-JIRA
+candidate). The original issue's verdict still reflects the
+original report; the sibling-type bug is a separate finding.
+
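
As a concrete instance of the probe template, the sketch below probes the index operator with an out-of-range positive index across three members of the range / index type family. The choice of members and of the index `5` is illustrative, not taken from any particular JIRA issue; run it against the build under test and record whatever it prints.

```groovy
// Each probe applies the same expression shape, agg[5], to a three-element backend.
def probes = [
    'List'    : { -> [1, 2, 3][5] },
    'Object[]': { -> ([1, 2, 3] as Object[])[5] },
    'String'  : { -> 'abc'[5] },
]

// Collect outcomes first so they can be both printed and written to the verdict.
def results = probes.collectEntries { name, body ->
    def outcome
    try {
        outcome = body()
    } catch (Throwable t) {
        outcome = "THREW: ${t.class.simpleName}".toString()
    }
    [(name): outcome]
}

results.each { name, outcome -> println "${name.padRight(10)} | ${outcome}" }
```

Keeping the outcomes in a map rather than printing inline makes it straightforward to copy the same data into `verdict.json.cross_type_probe`.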
## Evidence package
For each reproducer run, persist:
@@ -298,19 +429,53 @@ For each reproducer run, persist:
- `original.<ext>` — the literal source from JIRA, untouched, when
extracted.
- `run.log` — stdout + stderr from the run, with the exact command
- on the first line.
-- `verdict.json` — `{ "key": "GROOVY-NNNNN", "shape": "<A|B|…>",
- "classification": "<one of: same-failure | different-failure |
- passes | cannot-run-extraction | cannot-run-environment |
- cannot-run-dependency | timeout | needs-separate-workspace>",
- "rev": "<short-sha>", "jdk": "<vendor+version>",
- "command": "<verbatim>", "runtime-ms": <int>,
- "exit-code": <int>, "matched-original-failure": <bool>,
- "notes": "<short>" }`.
-
-This package is what a committer needs to trust the verdict, and it
-is what [`groovy-reassess`](../groovy-reassess/SKILL.md) feeds into
-its report.
+ on the first line plus `rev`, `jdk`, started/ended timestamps.
+- `cross-type-probe.<groovy|log>` and/or
+ `operator-variants-probe.<groovy|log>` — optional, when a
+ cross-family probe was run (see *Cross-family probes* above).
+- `cross-type-probe-findings.md` — optional, when the probe
+ surfaced project-wide signal worth surfacing separately.
+- `verdict.json` — the structured classification. Schema:
+
+```json
+{
+ "key": "GROOVY-NNNNN",
+ "shape": "A | B | C | D | E-vague | E-precise | F | G | H",
+ "classification": "fixed-on-master | still-fails-same | still-fails-different | cannot-run-extraction | cannot-run-environment | cannot-run-dependency | timeout | intended-behaviour | duplicate-of-resolved | needs-separate-workspace",
+ "nature": "bug-as-advertised | bug-as-advertised-partial-fix | feature-request | feature-request-disguised-as-bug | intended-and-documented",
+ "rev": "<short-sha>",
+ "jdk": "<vendor + version>",
+ "command": "<verbatim>",
+ "runtime_ms": <int or null>,
+ "exit_code": <int>,
+ "matched_original_failure": <bool>,
+ "cases": [ // optional; multi-case reproducers only
+ {
+ "expr": "<expression / sub-case>",
+ "expected": "<expected outcome>",
+ "actual_master": "<observed on master>",
+ "match_on_master": <bool>,
+ "history": [{"year": <int>, "status": "...", "source": "..."}],
+ "note": "<short>"
+ }
+ ],
+ "cases_summary": "<one-line roll-up>", // optional
+ "cross_type_probe": { "file": "...", "log": "...", "findings": "...", "summary": "..." }, // optional
+ "operator_variants_probe": { "file": "...", "log": "...", "summary": "..." }, // optional
+ "notes": "<long-form analysis and recommendation>"
+}
+```
+
+Keys use **snake_case** (`runtime_ms`, not `runtime-ms`) so
+`jq` queries don't need quoting. The `nature` field is
+orthogonal to `classification` and answers the question "is
+this not operating as advertised, or is this wouldn't-it-be-nice?"
+— see [`groovy-reassess`](../groovy-reassess/SKILL.md) for how
+the campaign uses it.
+
+This package is what a committer needs to trust the verdict, and
+it is what [`groovy-reassess`](../groovy-reassess/SKILL.md) feeds
+into its report.
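
A light sanity check over a finished `verdict.json` might look like the following. This is a sketch, not part of the skill: the closure name `problemsIn` and the sample document are hypothetical, and only the `classification` and `nature` value lists and the snake_case key convention come from the schema above.

```groovy
import groovy.json.JsonSlurper

def classifications = [
    'fixed-on-master', 'still-fails-same', 'still-fails-different',
    'cannot-run-extraction', 'cannot-run-environment', 'cannot-run-dependency',
    'timeout', 'intended-behaviour', 'duplicate-of-resolved', 'needs-separate-workspace',
] as Set
def natures = [
    'bug-as-advertised', 'bug-as-advertised-partial-fix', 'feature-request',
    'feature-request-disguised-as-bug', 'intended-and-documented',
] as Set

// Hypothetical checker: returns a list of human-readable problems (empty when clean).
def problemsIn = { String json ->
    def verdict = new JsonSlurper().parseText(json)
    def problems = []
    if (!(verdict.classification in classifications)) {
        problems << "unknown classification: ${verdict.classification}".toString()
    }
    if (!(verdict.nature in natures)) {
        problems << "unknown nature: ${verdict.nature}".toString()
    }
    // Top-level keys are expected to be snake_case, so a hyphen signals the old style.
    verdict.keySet().findAll { it.contains('-') }.each {
        problems << "key not snake_case: ${it}".toString()
    }
    problems
}

// Hypothetical sample using the placeholder key from the schema.
def sample = '{"key": "GROOVY-NNNNN", "classification": "fixed-on-master", "nature": "bug-as-advertised", "runtime_ms": null}'
println problemsIn(sample)
```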
## Validation checklist
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index b2c78e83a6..216e303c4c 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -228,6 +228,58 @@ above. Each bites contributors quickly if missed:
`INSTRUCTION_SELECTION` or later. See the
[Compilation pipeline](#compilation-pipeline) phase table.
+## Operator families
+
+Several Groovy operators and expression forms are defined for
+**multiple backing types**, and behaviour across the members of a
+family is sometimes inconsistent. When investigating a bug
+reported for one type, probing the same expression across siblings
+often surfaces nuance the reporter missed — a hidden bug in a
+sibling type, confirmation that an asymmetry spans the whole
+family, or a project-wide spec gap that wasn't visible from a
+single-type report.
+
+### Type families
+
+| Family | Members | Notes |
+|---|---|---|
+| **Range / index operators** (`agg[idx]`, `agg[range]`) | `List` / `Object[]` / primitive arrays (`int[]`, `long[]`, …) / `String` / `CharSequence` | Different exception classes (`IndexOutOfBoundsException` vs `ArrayIndexOutOfBoundsException` vs `StringIndexOutOfBoundsException`). Negative-endpoint and out-of-range-negative semantics have historically diverged across types — see GROOVY-3974 for a concrete example surfaced by cross-type probing. |
+| **GPath expressions** (`x.y.z`, `x?.y`, `x*.y`) | In-memory (Map, List, nested combinations) / JSON (`JsonSlurper`) / XML (`XmlSlurper` / `XmlParser`) / POGO / Java POJO / SQL result sets (`groovy.sql.Sql`) | XML has special handling for attributes (`@attr` syntax) and returns empty `NodeChild` collections on missing children rather than null. Map/JSON return null on missing keys. POGOs and POJOs throw `MissingPropertyException` for missing properties — the asymmetry is by-design (each [...]
+| **Numeric coercion** (`+`, `-`, `*`, `/`, comparison) | `int` / `long` / `BigInteger` / `BigDecimal` / `double` / `Float` / `Long` (boxed) | Coercion rules vary; the result type of `int + BigDecimal` may surprise. |
+
+### Operator-variant families
+
+Some operators have **multiple syntactic variants** that share a
+family but dispatch differently:
+
+| Family | Variants | Dispatch notes |
+|---|---|---|
+| **Safe navigation** | `?.` (SAFE_DOT) / `??.` (SAFE_CHAIN_DOT — shorthand for chained `?.`) / `?[..]` (SAFE_INDEX) | `?.` and `??.` call `getProperty(String)`. `?[..]` calls `getAt(Object)`, but on POGOs that routes through `getProperty` for missing keys, so the variants behave identically for POGO missing-property access. |
+| **Spread** | `*.` / `*[..]` / `*:` | Different unpacking semantics across iteration / indexing / map-merge. |
+| **Equality / identity** | `==` / `.equals()` / `is` | `==` is `equals`-based in Groovy (not reference-equality as in Java); `is` is Java's `==` (reference). |
+| **Coercion** | `as` / `asType()` / constructor + `from` | Different conversion paths; `as` is statically-resolvable, `asType` is dynamic. |
+| **Range** | `..` / `..<` / `..>` | Endpoint inclusion / direction differences. |
+| **Elvis / null-coalesce** | `?:` and elaborations | Truthy-vs-null differences in the left-hand side. |
+
+### Why this matters for investigation
+
+For an investigation of a bug in one family member, probing across
+siblings is a recurring technique. A ~50-line probe script
+(constructing each backend, running the same expression, recording
+outcomes in a table) is usually enough to:
+
+- confirm whether an asymmetry the reporter found spans the family
+ or is type-specific;
+- surface a hidden bug in a sibling type the reporter didn't test
+ (and which may warrant its own JIRA);
+- reveal that what looks like a bug is actually consistent
+ documented behaviour with a documented or implicit workaround in
+ a sibling form.
+
+See `.agents/skills/groovy-reproducer/SKILL.md`'s "Cross-family
+probes" section for the AI-tooling pattern. The probe approach is
+equally useful when investigating by hand.
+
## Generated code
The following are produced by the build and regenerated on every
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 90256ca040..4495071e91 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -318,6 +318,18 @@ platform issues but are actually project-specific:
Without forwarding, the gated test always skips even with
`-Djunit.network=true` on the Gradle CLI.
+- **Treacherous substring matching in verification logic.** When
+ scripting verification (e.g. checking whether some token
+ survived a transformation), plain `.contains()` can silently
+ produce false positives near common prefixes —
+ `output.contains('xmlns:xs')` matches `xmlns:xsi` as a prefix.
+ Prefer anchored regex (`output =~ /xmlns:xs="/`) or parsed-tree
+ inspection (`new XmlSlurper().parseText(output)`) over substring
+ matching. The trap also bites verification logic written for
+ reassessments and triage probes, not just tests; the principle
+ applies anywhere you're matching tokens with shared prefixes
+ (`xs` / `xsi`, `groovy` / `groovy-`).
+
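
The false positive is easy to demonstrate. In the sketch below the sample `output` string is hypothetical: it declares only the `xsi` prefix, yet the plain substring check still reports a match, while the regex anchored on the following `="` does not.

```groovy
// Hypothetical transformation output that declares xmlns:xsi but NOT xmlns:xs.
def output = '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>'

// Plain substring check: matches because 'xmlns:xs' is a prefix of 'xmlns:xsi'.
def naive = output.contains('xmlns:xs')

// Anchored check: requires the attribute name to end exactly at 'xs'.
def anchored = (output =~ /xmlns:xs="/).find()

println "contains: ${naive}, anchored regex: ${anchored}"
```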
### For agents working on tests
The [`.agents/skills/groovy-tests/SKILL.md`](.agents/skills/groovy-tests/SKILL.md)
@@ -506,6 +518,28 @@ project = GROOVY AND statusCategory = Done AND resolution is EMPTY
project = GROOVY AND parent = GROOVY-<NNNN>
```
+#### Pool-selection heuristics for re-triage sweeps
+
+When picking a pool of issues to work through (whether by hand or
+in a tool-assisted sweep), the pool's *status mix* shapes what
+you'll find:
+
+- **`status = Reopened`** — small pool, over-represents
+ *feature-requests-disguised-as-bugs*: issues someone re-examined
+ and left open while pondering a spec change. Useful when the
+ goal is *"what spec debates is the project sitting on?"*.
+- **`status = Open AND affectedVersion in ("<EOL-versions>")`** —
+ larger pool, over-represents *silent fixes*, *real open bugs*,
+ and *partial fixes*. Useful when the goal is *"find what's been
+ silently resolved by later refactoring."* Target affected
+ versions that pre-date a known major refactor of the relevant
+ subsystem.
+
+The two pools answer different questions; choose deliberately
+rather than by default-sort. See *Nature analysis* under
+[Triaging issues and pull requests](#triaging-issues-and-pull-requests)
+for what the populations look like in practice.
+
### Linking JIRA to commits and pull requests
Every commit or pull request that fixes a JIRA-tracked issue
@@ -559,9 +593,17 @@ For each issue:
reporter included a stack trace, the top non-JDK frame usually
points at the area of the code involved.
-2. **Search for duplicates and related work** before writing a long
- analysis. A one-line "duplicate of GROOVY-XXXX, fixed in 4.0.Y"
- beats a 500-word root-cause summary of a known bug.
+ Also scan for **historical baselines**: prior committer comments
+ that say "I just ran this on version X, here's what I got." These
+ are checkpoints the current triage can compare against — the
+ headline finding for an old issue may be "state unchanged since
+ the 2013 committer baseline" rather than the current state alone.
+
+2. **Search for duplicates and same-family related work** before
+ writing a long analysis. A one-line "duplicate of GROOVY-XXXX,
+ fixed in 4.0.Y" beats a 500-word root-cause summary of a known
+ bug. A "GROOVY-YYYY is in the same neighbourhood" pointer helps
+ the committer decide on batch-or-sequential treatment.
```
git log --grep='GROOVY-<NNNN>' # commits referencing the JIRA
@@ -570,7 +612,8 @@ For each issue:
Plus a JQL text search — see the
[Duplicate hunting by error string](#searching-with-jql) recipe
- above.
+ above. The same JQL with a topic-keyword surfaces same-family
+ open issues: `project = GROOVY AND text ~ "<topic>"`.
3. **Attempt reproduction on `master`.** Drop the reporter's script
into a temp file and run it against a local build, or paste their
@@ -581,6 +624,21 @@ For each issue:
that's now passing on `master` is meaningful information; a
missing reproduction is speculation.
+ When the issue thread contains **multiple distinct reproducers**
+ (the description plus a follow-up comment that demonstrates a
+ different symptom of the same root cause), run each. They may
+ have different fates: one silently fixed, another still broken
+ — that's a *split-candidate* signal (see step 7 below).
+
+ When the operator or expression under test spans **multiple
+ backing types** (range/index on `List` / `Object[]` / `int[]` /
+ `String`; GPath on Map / JSON / XML / POGO / POJO) or has
+ **multiple operator variants** (safe-navigation `?.` / `??.` /
+ `?[..]`), probing the same expression across the family is
+ often cheap and surfaces signal the reporter missed. See the
+ family taxonomies in [`ARCHITECTURE.md`'s "Operator families"
+ section](ARCHITECTURE.md#operator-families).
+
4. **Locate the code, lightly.** If the failure reaches the
runtime/compiler, identify the package or class the stack flows
through — that's the "where" to point a fix at. Don't go deeper
@@ -592,15 +650,47 @@ For each issue:
[Fields and who sets them](#fields-and-who-sets-them) and
[Components](#components).
-6. **Draft a comment** with: the state of the reproduction
+6. **Search for documented workarounds.** Before recommending a
+ closure path, check three places:
+
+ - `src/spec/doc/` (user-facing docs) — `grep -rn '<topic>' src/spec/doc/`
+ - Groovydoc on the relevant source classes — `grep -B2 -A8` on
+ the source file
+ - JIRA comments — keywords like `workaround`, `prefix`,
+ `coerce`, `use ... instead`
+
+ The outcome shapes the recommended close path:
+
+ - **Workaround documented user-facing** → close as "Not A Bug"
+ or "Won't Fix"; cite the doc.
+ - **Workaround exists but is undocumented user-facing** → close
+ as "Not A Bug **+ add docs**". The documentation deliverable
+ is the actionable artefact; closing without it leaves the
+ surprise intact for the next user.
+ - **No workaround** → keep open OR re-type as Improvement (per
+ nature analysis, below).
+
+7. **Consider split candidacy.** When the issue bundles **multiple
+ reproducers with mixed fates** (some silently fixed, some still
+ broken) or **multiple user-visible symptoms with independently-
+ fixable causes**, recommend a split: close the original with a
+ per-case summary, suggest a focused new JIRA for the remaining
+ unfixed case(s) with the targeted reproducer. Old multi-case
+ JIRAs often resolve partially over time; the constructive close
+ path is a per-case status update plus carry-over.
+
+8. **Draft a comment** with: the state of the reproduction
(passed / failed / could not run, with revision + JDK), the
duplicate-search result, the likely area of the code, suggested
- missing fields, and a recommended next action ("needs a minimal
- reproducer," "looks fixed on master — propose closing as Cannot
- Reproduce after a second pair of eyes," "appears to need a fix
- in `<area>`"). Factual, helpful, specific.
-
-7. **Don't transition the issue.** Even when the recommendation is
+ missing fields, the workaround-search outcome, and a recommended
+ next action ("needs a minimal reproducer," "looks fixed on
+ master — propose closing as Cannot Reproduce after a second pair
+ of eyes," "appears intended-behaviour-but-undocumented; propose
+ closing + docs PR," "appears to need a fix in `<area>`,"
+ "consider splitting — A is fixed, B remains"). Factual, helpful,
+ specific.
+
+9. **Don't transition the issue.** Even when the recommendation is
clear, leave the workflow state to a committer.
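
The cross-family probing mentioned in step 3 can be sketched in a few lines of Groovy. This is an illustrative probe, not a project tool: the `targets` map and the range expression `[1..2]` are assumptions chosen for the example, and a real probe would use the expression from the issue under triage.

```groovy
// Hypothetical probe: apply one range-index expression to each backing type.
def targets = [
    'List'    : [1, 2, 3, 4],
    'Object[]': [1, 2, 3, 4] as Object[],
    'int[]'   : [1, 2, 3, 4] as int[],
    'String'  : '1234',
]

// Record either the value or the failure for each family member.
def results = targets.collectEntries { name, target ->
    def outcome
    try {
        outcome = target[1..2]
    } catch (Throwable t) {
        outcome = "${t.class.simpleName}: ${t.message}".toString()
    }
    [(name): outcome]
}

results.each { name, outcome -> println "${name.padRight(8)} -> ${outcome}" }
```

Catching `Throwable` keeps one failing family member from hiding the others, which is the point of the probe: a divergence between rows is the signal worth reporting.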
### Triaging a pull request
@@ -649,6 +739,36 @@ For each PR:
reformat, then nits. Use file-path:line references so committers
can jump to each finding.
+### Nature analysis: bug-as-advertised vs wouldn't-it-be-nice
+
+Before recommending a close path, ask: **is this not operating as
+advertised, or is this 'wouldn't it be nice if'?** The answer
+shapes the action differently from what the reproducer's outcome
+alone suggests. Two issues can both reproduce verbatim and need
+totally different closures.
+
+| Nature | Meaning | Recommended action |
+|---|---|---|
+| **bug-as-advertised** | A documented or implicit promise isn't being kept. The code does not deliver what its signature, Groovydoc, or spec says. | Fix it. The reproducer is the regression-test target. |
+| **bug-as-advertised, partial fix** | Originally multi-case; some cases silently fixed, others still broken. | Split: close original with per-case summary, open focused new JIRA for the remaining case(s). See "Consider split candidacy" in the procedure above. |
+| **feature-request** | The reporter wants a different spec; the behaviour matches what's promised. JIRA is correctly typed as Improvement. | Re-typing not needed. Decide on `dev@` whether to accept the Improvement. |
+| **feature-request-disguised-as-bug** | Same as above but mis-typed as Bug in JIRA: the reporter framed an unmet wish as a defect. | Recommend re-typing Bug → Improvement, then design discussion on `dev@`. |
+| **intended-and-documented** | Behaviour is correct *and* the docs clearly cover it. | Close as Not A Bug. The issue is the reporter not having found the docs; consider whether a docs cross-link or better discoverability would help. |
+
+Two pool-level observations are useful here:
+
+- **The Reopened pool over-represents the feature-request shapes.**
+ An issue that's been reopened is one someone re-examined and
+ consciously left open while pondering a spec change.
+- **The Open + EOL-affected-version pool over-represents the
+ bug-as-advertised shapes.** Issues nobody looked at while the
+ runtime/compiler under them was rewritten are more likely to be
+ silent fixes or genuine open bugs than wishlists.
+
+The distinction matters when choosing which JIRAs to work through
+in a re-triage pass: pick the pool that matches what you're
+hunting for.
+
### Drafting a useful comment or review
Whether triaging an issue or a PR, the output is: