andrewmusselman opened a new issue, #25:
URL: https://github.com/apache/tooling-agents/issues/25

   # Audit produces zero Critical findings when `severityThreshold` is set; move filtering to consolidate phase
   
   ## Summary
   
   When the orchestrator runs with `severityThreshold=HIGH`, the audit phase 
(Opus) emits **zero Critical findings** across every section, even for findings 
whose own descriptions explicitly classify themselves as Type B/C/D control 
gaps — which the audit rubric defines as Critical. Findings that should be 
Critical instead land at HIGH.
   
   Two recent Apache Airflow audits illustrate:
   
   | Run | Raw findings | HIGH | CRITICAL |
   |---|---|---|---|
   | `airflow-core` @ 7ca4c75 | 25 | 25 | 0 |
   | `task-sdk` @ 7ca4c75 | 46 | 46 | 0 |
   
   Across **45 sections** that produced findings, the audit emitted Finding IDs 
of the form `ASVS-XXX-HIGH-NNN` exclusively — never `CRIT`, never `MED`, never 
`LOW`. The model is locked onto HIGH.
   
   ## Evidence
   
   CouchDB inspection of the per-section reports (raw audit output, before 
consolidate touches anything) confirms this is an **audit-phase** issue, not a 
consolidate-phase downgrade. Examples where the model's own description names a 
Type B/C gap but the finding is marked HIGH:
   
   - `airflow-core` finding on multi-team authorization: *"the control EXISTS 
(`is_authorized_pool`) but is NOT CALLED when the teams set is empty"* — 
textbook Type B, marked HIGH.
   - `task-sdk` finding on deserialization allow-list: *"Type B gap where a 
security control EXISTS (allow-list with regex matching) but is NOT correctly 
applied"* — Type B, marked HIGH.
   - `task-sdk` finding on auth backend fallback: *"Type C gap where the 
control is called (server-side authorization checks token scope) but the result 
is ignored"* — Type C, marked HIGH.
   
   ## Root Cause
   
   The `if severity_threshold:` block in `asvs_audit.py` (~line 785) injects 
this text into the Opus system prompt:
   
   ```
   ## Severity Threshold
   Only report findings at these severity levels: CRITICAL, HIGH.
   Do not include findings below HIGH severity.
   ```
   
   The word "HIGH" appears prominently (twice in two consecutive lines, plus as 
the threshold value itself). The model treats this as a strong anchor — HIGH 
becomes the operating zone — and the gap-type rules elsewhere in the prompt 
that say "Type B/C/D = CRITICAL" lose to that anchoring.
   
   The alternative explanations were ruled out:
   - Not a consolidate-phase downgrade: raw per-section reports already have 0 
Critical before consolidate runs.
   - Not a parser bug: 22/22 and 23/23 reports with findings parsed cleanly via 
the `ASVS-XXX-{SEV}-NNN` format strategy. The model is producing structured 
Finding IDs — just only ever with the HIGH token.
   - Not a redaction artifact: counts are 0 in both private 
(`tooling-runbooks`) and public (`tooling-agents`) consolidated.md.
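
The parser check in the second bullet can be reproduced with a small tally of severity tokens across Finding IDs. This is a minimal sketch, assuming IDs follow the `ASVS-XXX-{SEV}-NNN` shape with the token spellings named earlier in this issue (`CRIT`, `HIGH`, `MED`, `LOW`). The regex, the function name `tally_severity_tokens`, and the sample IDs are illustrative and are not taken from `asvs_audit.py`.

```python
import re
from collections import Counter

# Hypothetical pattern for the ASVS-XXX-{SEV}-NNN shape described above.
# The exact form of XXX and NNN is an assumption.
FINDING_ID = re.compile(r"\bASVS-[0-9A-Za-z]+-(CRIT|HIGH|MED|LOW)-\d+\b")


def tally_severity_tokens(report_text: str) -> Counter:
    """Count how often each severity token appears in Finding IDs."""
    return Counter(FINDING_ID.findall(report_text))


# Made-up sample text standing in for one per-section report.
sample = """
ASVS-141-HIGH-001: allow-list exists but is not applied
ASVS-141-HIGH-002: authorization result ignored
"""
tally = tally_severity_tokens(sample)
print(dict(tally))
```

A tally that only ever contains `HIGH` across all sections is the signature described in this issue.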
   
   ## Proposed Fix: Move severity threshold from audit to consolidate
   
   **Rationale.** Audit (Opus) calls are the expensive part of the pipeline. 
Their output should be maximally reusable across different rendering choices. 
Pre-filtering by severity at audit time:
   
   - causes the prompt-anchoring bug above,
   - locks the audit cache to one threshold value (re-running with a different 
threshold means re-auditing all sections), and
   - couples audit-time computation to consumer-side display choices that don't 
need to be coupled.
   
   Filtering at consolidate time:
   
   - removes the anchoring problem entirely (audit no longer sees the threshold 
word),
   - makes audit results threshold-independent (re-render at any threshold 
without re-auditing — large speedup for iteration), and
   - centralizes display-policy decisions where they belong.
   
   The CouchDB refactor was the change that made this design viable — 
per-section reports are now cheap to store fully, so we no longer need 
audit-time pre-filtering as a noise-reduction trick.
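
The threshold independence argued above can be sketched as one stored set of findings rendered at several thresholds without touching the audit step. This is an illustration only: `filter_by_threshold`, `SEVERITY_LEVELS`, and the `stored` sample are hypothetical names and data, and the finding dictionaries are assumed to carry a `severity` key.

```python
SEVERITY_LEVELS = {"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1, "INFORMATIONAL": 0}


def filter_by_threshold(findings, threshold=None):
    """Return findings at or above `threshold`; no threshold keeps everything."""
    if not threshold:
        return list(findings)
    floor = SEVERITY_LEVELS[threshold.upper()]  # raises KeyError on an unknown level
    return [
        f for f in findings
        if SEVERITY_LEVELS.get(str(f.get("severity") or "INFORMATIONAL").upper(), 0) >= floor
    ]


# Stand-in for findings loaded once from the stored per-section reports.
stored = [
    {"id": "a", "severity": "Critical"},
    {"id": "b", "severity": "High"},
    {"id": "c", "severity": "Medium"},
    {"id": "d", "severity": "Low"},
]

# The same stored data rendered at several thresholds, with no re-audit.
counts = {t: len(filter_by_threshold(stored, t)) for t in (None, "LOW", "HIGH", "CRITICAL")}
print(counts)
```

Unlike the snippet proposed for `asvs_consolidate.py`, this sketch raises on an unknown threshold instead of silently keeping everything; which behavior is preferable is a design choice for the maintainers.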
   
   ## Implementation
   
   ### 1. `asvs_audit.py` — remove threshold prompt block
   
   Delete lines ~785–790:
   
   ```python
   if severity_threshold:
       severity_levels = {"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1}
       threshold_val = severity_levels.get(severity_threshold.upper(), 0)
       if threshold_val > 0:
           included = [k for k, v in severity_levels.items() if v >= threshold_val]
           analysis_system_prompt += f"\n## Severity Threshold\n..."
   ```
   
   The `severityThreshold` input parameter can stay declared on the agent for 
orchestrator interface compatibility but becomes unused inside audit.
   
   ### 2. `asvs_consolidate.py` — apply filter after Phase 4 merge
   
   After the existing `all_findings.sort(...)` (around line 957), before global 
ID assignment:
   
   ```python
   if severity_threshold:
       # Rank severities so the threshold becomes a numeric floor.
       severity_levels = {"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1, "INFORMATIONAL": 0}
       # An unrecognized threshold maps to 0, so nothing is filtered out.
       threshold_val = severity_levels.get(severity_threshold.upper(), 0)
       pre_filter = len(all_findings)
       # A missing, None, or unrecognized severity ranks as 0 (Informational).
       all_findings = [
           f for f in all_findings
           if severity_levels.get((f.get("severity") or "Informational").upper(), 0) >= threshold_val
       ]
       print(f"Severity filter ({severity_threshold} and above): {pre_filter} → {len(all_findings)}")
   ```
   
   The metadata table in consolidated.md already displays `Severity Threshold` 
correctly — no template change needed.
   
   ### 3. Gap-type rubric improvements
   
   Independent of this issue, the audit prompt was also strengthened in this 
branch with concrete Type B/C/D examples and a final self-check pass. Keep 
those — they're defense-in-depth and improve audit quality regardless of where 
the threshold lives. The "leave it as HIGH" guard in the self-check pass should 
be reworded to "keep whatever severity you assigned during initial drafting" to 
allow legitimately-Critical Type A findings (e.g., no auth on an admin 
endpoint) to remain Critical.
   
   ## Validation Plan
   
   1. Re-run one Airflow module with `clearCache=true` and 
`severityThreshold=HIGH`.
   2. Run `inspect_audit_findings.py` against the resulting CouchDB namespace — 
Critical findings should now appear in sections involving Type B/C/D gaps 
(deserialization bypass, encryption-disabled fallback, authorization bypass via 
parse failure, etc.).
   3. Verify `consolidated.md` filters down to HIGH+ when the threshold is set.
   4. Verify `consolidated.md` includes all severities when no threshold is set.
   5. Compare consolidated counts before and after — expect Critical count to 
be non-zero on at least one of the two modules.
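
Step 2 could be complemented by a mechanical check that flags findings whose description names a Type B/C/D gap while the severity is not Critical. This is a rough sketch under stated assumptions: findings are dictionaries with `id`, `severity`, and `description` keys, the gap type is spelled out literally as "Type B", "Type C", or "Type D", and `find_rubric_mismatches` is a hypothetical helper that is not part of `inspect_audit_findings.py`. Descriptions that imply a gap type without naming it would not be caught.

```python
import re

# Matches an explicit "Type B", "Type C", or "Type D" label in a description.
GAP_TYPE = re.compile(r"\bType\s+([BCD])\b")


def find_rubric_mismatches(findings):
    """Return (id, gap_type, severity) for Type B/C/D findings not marked Critical."""
    mismatches = []
    for f in findings:
        match = GAP_TYPE.search(f.get("description", ""))
        severity = str(f.get("severity", "")).upper()
        if match and severity != "CRITICAL":
            mismatches.append((f.get("id"), match.group(1), severity))
    return mismatches


# Made-up findings; the first two descriptions paraphrase examples from this issue.
sample = [
    {"id": "f1", "severity": "High",
     "description": "Type B gap where a security control EXISTS but is NOT correctly applied"},
    {"id": "f2", "severity": "Critical",
     "description": "Type C gap where the control is called but the result is ignored"},
    {"id": "f3", "severity": "High", "description": "Missing rate limit on login"},
]
mismatches = find_rubric_mismatches(sample)
print(mismatches)
```

An empty result after the fix would suggest the rubric and the assigned severities agree, at least for findings that label their gap type explicitly.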
   
   ## Side Benefits
   
   - Per-section CouchDB reports become full audit data, reusable across any 
rendering.
   - Re-running consolidate at a different threshold is now cheap — no Opus 
calls, just Sonnet for synthesis (cached on input hash).
   - Removes a whole class of prompt-anchoring bug (any future "X Threshold" 
instructions for level/scope/severity that anchor on prominent words).
   
   ## Related Follow-ups (separate issues)
   
   - **Audit cache invalidation:** the cache key `f"batch-{i}"` doesn't 
incorporate prompt content, so prompt edits don't bust the cache. Adding 
`hashlib.sha256(analysis_system_prompt.encode()).hexdigest()[:8]` to the key 
would fix it.
   - **Empty-bundle stubs in CouchDB:** bundles that find zero matching files 
currently write `Error: No files found in namespaces ...` stubs to the reports 
namespace. One was observed at `xml_parsing/1.5.1.md` in the task-sdk run. 
The bundle agent should log and return without writing a CouchDB key for empty 
results.
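
The cache-invalidation follow-up can be sketched as a key builder that folds a short prompt digest into the existing `batch-{i}` key. The function name `make_cache_key` and the way the digest is joined to the key are assumptions for illustration; only the `hashlib.sha256(...).hexdigest()[:8]` expression comes from the note above.

```python
import hashlib


def make_cache_key(batch_index: int, system_prompt: str) -> str:
    """Build a cache key that changes whenever the prompt text changes."""
    prompt_digest = hashlib.sha256(system_prompt.encode()).hexdigest()[:8]
    return f"batch-{batch_index}-{prompt_digest}"


# Placeholder prompt strings; any edit to the prompt yields a different key.
key_before = make_cache_key(0, "rubric v1")
key_after = make_cache_key(0, "rubric v2")
print(key_before, key_after)
```

Eight hex characters give 32 bits of digest, which should be ample for telling apart a handful of prompt revisions but is not collision-proof.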


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

