andrewmusselman opened a new issue, #28:
URL: https://github.com/apache/tooling-agents/issues/28

   ## Summary
   
   When the bundle agent's relevance/file-filter step produces zero matching 
files for a section, it writes an error stub to the reports namespace instead 
of returning silently. Consolidate Phase 1 reads these stubs as if they were 
findings, prepending the error string to the consolidated.md output.
   
   ## Symptom
   
   Inspection of the task-sdk run namespace:
   
   ```
   empty/stub:                 1
     - xml_parsing/1.5.1.md
   ```
   
   The file's contents start with:
   
   ```
   Error: No files found in namespaces ['files:apache/airflow/task-sdk']
   ```
   
   This stub then ends up at the head of consolidate's input, producing this in 
the early consolidated output (before our other fixes scrubbed it):
   
   ```
   Error: No files found in namespaces ['files:apache/airflow/task-sdk']
   
   ---
   
   # Security Audit Report: ASVS 3.2.1
   ...
   ```
   
   The impact is cosmetic and does not break the audit, but the stray error text is messy and confusing when debugging.
   
   ## Cause
   
   In `asvs_bundle.py`, when the file-filter step returns an empty list, the 
agent currently constructs a JSON envelope with the error message and emits it 
as a per-section report. The orchestrator's `store_one` then dutifully writes 
it to CouchDB under the section's key.
   
   ## Proposed Fix
   
   When `all_files` is empty after filtering:
   
   1. Log the condition with the section ID and the namespaces queried.
   2. Return a result object indicating "skipped: no files in scope" — distinct 
from "audited and found nothing".
   3. Do **not** include the section in the per-section JSON envelope passed 
back to the orchestrator.
   
   The orchestrator's `_parse_audit_output` already handles missing sections 
gracefully — it generates a `_Bundled audit produced no output for this 
section..._` stub for any section the bundle didn't return. That existing path 
is the right place for this case to land. The CouchDB key simply isn't created.
   
   Approximate change in `asvs_bundle.py`:
   
   ```python
   if not all_files:
       print(f"  [bundle] no files in scope for sections {asvs_sections}; skipping", flush=True)
       return {"outputText": json.dumps({
           "mode": "bundled",
           "per_section": {},  # empty: orchestrator generates appropriate stubs
           "skipped_no_files": True,
       })}
   ```
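
To illustrate how a consumer could tell "skipped: no files in scope" apart from "audited and found nothing", here is a minimal sketch that decodes an envelope of the shape shown above. The function `classify_bundle_result` and its three labels are hypothetical and are not taken from the orchestrator's `_parse_audit_output`. The sample report text is made up for the example.

```python
import json


def classify_bundle_result(result: dict) -> str:
    """Classify a bundle result as 'skipped', 'empty', or 'reported'.

    Hypothetical helper: assumes the envelope keys from the snippet above
    ("per_section", "skipped_no_files") and nothing else.
    """
    envelope = json.loads(result["outputText"])
    if envelope.get("skipped_no_files"):
        return "skipped"    # the file filter matched nothing
    if not envelope.get("per_section"):
        return "empty"      # the bundle ran but returned no sections
    return "reported"       # at least one per-section report is present


# Example envelopes (illustrative data only).
skipped = {"outputText": json.dumps(
    {"mode": "bundled", "per_section": {}, "skipped_no_files": True})}
reported = {"outputText": json.dumps(
    {"mode": "bundled", "per_section": {"1.5.1": "# Security Audit Report"}})}

print(classify_bundle_result(skipped))
print(classify_bundle_result(reported))
```

Keeping the flag separate from an empty `per_section` lets the two cases be logged differently, even if the orchestrator ends up treating them the same way.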
   
   ## Alternative: Suppress at orchestrator side
   
   If editing `asvs_bundle.py` is undesirable, the `store_one` function in the 
orchestrator could detect stub content (starts with `Error:` or contains `No 
files found in namespaces`) and skip the CouchDB write. Less clean but works 
without touching the bundle agent.
   
   ```python
   async def store_one(section_id, report_text):
       if report_text.startswith("Error:") or "No files found in namespaces" in report_text[:300]:
           print(f"    [{pass_name}] {section_id}: empty bundle, skipping store", flush=True)
           return section_id, None  # don't fail, but don't write
       try:
           key = f"{pass_name}/{section_id}.md"
           reports_ns.set(key, report_text)
           return section_id, None
       except Exception as e:
           ...
   ```
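
The detection condition above could also be factored into a small predicate so it can be unit-tested on its own. This is a sketch only: `is_empty_bundle_stub`, `STUB_MARKER`, and the `scan_chars` parameter are hypothetical names, and the logic simply mirrors the two checks in the snippet above.

```python
STUB_MARKER = "No files found in namespaces"


def is_empty_bundle_stub(report_text: str, scan_chars: int = 300) -> bool:
    """Return True if the text looks like the bundle agent's error stub.

    Mirrors the condition in the snippet above: the text starts with
    "Error:" or the marker appears within the first `scan_chars` characters.
    """
    head = report_text[:scan_chars]
    return report_text.startswith("Error:") or STUB_MARKER in head


stub = "Error: No files found in namespaces ['files:apache/airflow/task-sdk']"
real_report = "# Security Audit Report: ASVS 3.2.1\n"

print(is_empty_bundle_stub(stub))
print(is_empty_bundle_stub(real_report))
```

One caveat of this approach is that any genuine report whose text begins with `Error:` would also be skipped, so the prefix check may be broader than intended.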
   
   Pick whichever you prefer; bundle-side is cleaner architecturally, 
orchestrator-side is the smaller change.
   
   ## Validation
   
   1. Re-run task-sdk audit at L3 (the case where `xml_parsing/1.5.1.md` 
produced the stub).
   2. Run `inspect_audit_findings.py` against the resulting namespace — should 
report `empty/stub: 0`.
   3. Verify the consolidated.md no longer has any `Error: No files found` text 
at its head or anywhere else.
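
Steps 2 and 3 could be spot-checked with something like the sketch below. It assumes the reports namespace has already been loaded into a plain `dict` of key to text, because the real namespace API is not shown in this issue. `find_stub_keys` and `consolidated_is_clean` are hypothetical helpers, not part of `inspect_audit_findings.py`, and the `example/3.2.1.md` key is made up.

```python
STUB_PREFIX = "Error: No files found in namespaces"


def find_stub_keys(reports: dict[str, str]) -> list[str]:
    """Return the sorted keys whose stored value starts with the stub prefix."""
    return sorted(
        key for key, text in reports.items()
        if text.lstrip().startswith(STUB_PREFIX)
    )


def consolidated_is_clean(consolidated_text: str) -> bool:
    """Return True if the error string appears nowhere in the consolidated output."""
    return "Error: No files found" not in consolidated_text


# Illustrative data: one stub (as observed in the issue) and one normal report.
reports = {
    "xml_parsing/1.5.1.md":
        "Error: No files found in namespaces ['files:apache/airflow/task-sdk']",
    "example/3.2.1.md": "# Security Audit Report: ASVS 3.2.1\n",
}

print(find_stub_keys(reports))
print(consolidated_is_clean("# Security Audit Report: ASVS 3.2.1\n"))
```

After the fix, the first call should return an empty list for the re-run namespace and the second should return `True` for the new consolidated.md.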
   
   ## Related
   
   - Inspection script `/inspect_audit_findings.py` flags these stubs by 
detecting the `Error: No files found in namespaces` prefix in stored values.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

