andrewmusselman opened a new issue, #28:
URL: https://github.com/apache/tooling-agents/issues/28
## Summary
When the bundle agent's relevance/file-filter step produces zero matching
files for a section, it writes an error stub to the reports namespace instead
of returning silently. Consolidate Phase 1 reads these stubs as if they were
findings, prepending the error string to the consolidated.md output.
## Symptom
Inspection of the task-sdk run namespace:
```
empty/stub: 1
- xml_parsing/1.5.1.md
```
The file's contents start with:
```
Error: No files found in namespaces ['files:apache/airflow/task-sdk']
```
This stub then ends up at the head of consolidate's input, producing this in
the early consolidated output (before our other fixes scrubbed it):
```
Error: No files found in namespaces ['files:apache/airflow/task-sdk']
---
# Security Audit Report: ASVS 3.2.1
...
```
The impact is cosmetic and doesn't break the audit, but the output is messy
and confusing when debugging.
## Cause
In `asvs_bundle.py`, when the file-filter step returns an empty list, the
agent currently constructs a JSON envelope with the error message and emits it
as a per-section report. The orchestrator's `store_one` then dutifully writes
it to CouchDB under the section's key.
## Proposed Fix
When `all_files` is empty after filtering:
1. Log the condition with the section ID and the namespaces queried.
2. Return a result object indicating "skipped: no files in scope" — distinct
from "audited and found nothing".
3. Do **not** include the section in the per-section JSON envelope passed
back to the orchestrator.
The orchestrator's `_parse_audit_output` already handles missing sections
gracefully — it generates a `_Bundled audit produced no output for this
section..._` stub for any section the bundle didn't return. That existing path
is the right place for this case to land. The CouchDB key simply isn't created.
Approximate change in `asvs_bundle.py`:
```python
if not all_files:
    print(f"  [bundle] no files in scope for sections {asvs_sections}; skipping", flush=True)
    return {"outputText": json.dumps({
        "mode": "bundled",
        "per_section": {},  # empty: orchestrator generates appropriate stubs
        "skipped_no_files": True,
    })}
```
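For context, the consuming side of this envelope might look roughly like the
sketch below. It is illustrative only: the real `_parse_audit_output` is not
shown in this issue, so `fill_missing_sections`, `expected_sections`, and the
`MISSING_STUB` constant are hypothetical names, and the stub wording is only
the truncated text quoted above.

```python
import json

# Hypothetical stand-in for the stub the orchestrator generates; the full
# wording is truncated in this issue, so it is kept truncated here.
MISSING_STUB = "_Bundled audit produced no output for this section..._"


def fill_missing_sections(output_text, expected_sections):
    """Map every expected section to a report, stubbing any the bundle omitted.

    Returns (reports, skipped_no_files).
    """
    envelope = json.loads(output_text)
    per_section = envelope.get("per_section", {})
    reports = {}
    for section_id in expected_sections:
        # A section the bundle skipped is simply absent from per_section.
        reports[section_id] = per_section.get(section_id, MISSING_STUB)
    return reports, bool(envelope.get("skipped_no_files", False))


# Example: the envelope produced when no files were in scope.
envelope_text = json.dumps(
    {"mode": "bundled", "per_section": {}, "skipped_no_files": True}
)
reports, skipped = fill_missing_sections(envelope_text, ["1.5.1"])
```

Keeping `skipped_no_files` as a separate flag lets the caller tell "skipped:
no files in scope" apart from "audited and found nothing" without inspecting
report text.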
## Alternative: Suppress at orchestrator side
If editing `asvs_bundle.py` is undesirable, the `store_one` function in the
orchestrator could detect stub content (starts with `Error:` or contains `No
files found in namespaces`) and skip the CouchDB write. Less clean but works
without touching the bundle agent.
```python
async def store_one(section_id, report_text):
    if report_text.startswith("Error:") or "No files found in namespaces" in report_text[:300]:
        print(f"  [{pass_name}] {section_id}: empty bundle, skipping store", flush=True)
        return section_id, None  # don't fail, but don't write
    try:
        key = f"{pass_name}/{section_id}.md"
        reports_ns.set(key, report_text)
        return section_id, None
    except Exception as e:
        ...
```
Pick whichever you prefer; bundle-side is cleaner architecturally,
orchestrator-side is the smaller change.
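If the orchestrator-side route is taken, the stub check could be pulled into
a small predicate so it can be unit-tested apart from the CouchDB write. This
is a sketch under that assumption; `looks_like_error_stub`, `STUB_MARKER`, and
`scan_chars` are hypothetical names, and the logic simply mirrors the
condition shown in `store_one` above.

```python
STUB_MARKER = "No files found in namespaces"


def looks_like_error_stub(report_text, scan_chars=300):
    """Return True if report_text resembles the bundle agent's error stub.

    Mirrors the store_one condition: the text starts with "Error:" or the
    marker appears within the first scan_chars characters.
    """
    return report_text.startswith("Error:") or STUB_MARKER in report_text[:scan_chars]


stub = "Error: No files found in namespaces ['files:apache/airflow/task-sdk']"
report = "# Security Audit Report: ASVS 3.2.1\n..."
stub_detected = looks_like_error_stub(stub)
report_detected = looks_like_error_stub(report)
```

Note that the `Error:` prefix test is broad: any legitimate report that
happens to begin with `Error:` would also be skipped, which is one reason the
bundle-side fix is the cleaner option.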
## Validation
1. Re-run the task-sdk audit at L3 (the case where `xml_parsing/1.5.1.md`
produced the stub).
2. Run `inspect_audit_findings.py` against the resulting namespace — should
report `empty/stub: 0`.
3. Verify the consolidated.md no longer has any `Error: No files found` text
at its head or anywhere else.
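Steps 2 and 3 could also be checked programmatically. The sketch below
assumes the namespace contents have already been loaded into a plain dict of
key to report text; `find_stub_keys` and `consolidated_is_clean` are
hypothetical helpers, not part of `inspect_audit_findings.py`.

```python
STUB_PREFIX = "Error: No files found in namespaces"


def find_stub_keys(reports):
    """Return the sorted keys whose stored value starts with the stub prefix."""
    return sorted(
        key for key, text in reports.items()
        if text.lstrip().startswith(STUB_PREFIX)
    )


def consolidated_is_clean(consolidated_text):
    """Return True if the stub text appears nowhere in the consolidated output."""
    return "Error: No files found" not in consolidated_text


# Example using the stub observed in the task-sdk run.
reports = {
    "xml_parsing/1.5.1.md":
        "Error: No files found in namespaces ['files:apache/airflow/task-sdk']",
}
stub_keys = find_stub_keys(reports)
```

After the fix, `find_stub_keys` should return an empty list for the re-run
namespace and `consolidated_is_clean` should hold for consolidated.md.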
## Related
- Inspection script `/inspect_audit_findings.py` flags these stubs by
detecting the `Error: No files found in namespaces` prefix in stored values.