asf-tooling commented on issue #1072:
URL:
https://github.com/apache/tooling-trusted-releases/issues/1072#issuecomment-4409807010
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `bug_fix` • **Classification:** `actionable` • **Confidence:**
`high`
**Application domain(s):** `sbom_analysis`
### Summary
The issue identifies two unbounded loops in `atr/sbom/osv.py`: the
pagination loop in `_paginate_query` (no max page limit) and the sequential
detail-fetching loop in `_scan_bundle_populate_vulnerabilities` (no cap on
unique vulnerabilities fetched, no per-request timeout). Both can exhaust
worker resources when a component has hundreds of vulnerabilities. The code
matches exactly what the issue describes and the fix is straightforward — add
configurable limits and warning logs.
### Where this lives in the code today
#### `atr/sbom/osv.py` — `_paginate_query` (lines 232-255)
_needs modification_
This is the unbounded pagination loop with no maximum page limit or
vulnerability count cap.
```python
async def _paginate_query(
session: aiohttp.ClientSession,
query: dict[str, Any],
page_token: str,
) -> list[models.osv.VulnerabilityDetails]:
all_vulns: list[models.osv.VulnerabilityDetails] = []
current_query = query.copy()
current_query["page_token"] = page_token
page = 0
while True:
page += 1
if _DEBUG and (page > 1):
print(f"[DEBUG] Paginating query (page {page})")
results = await _fetch_vulnerabilities_for_batch(session,
[current_query])
if not results:
break
result = results[0]
if result.vulns:
all_vulns.extend(result.vulns)
next_page_token = result.next_page_token
if next_page_token is None:
break
current_query["page_token"] = next_page_token
return all_vulns
```
#### `atr/sbom/osv.py` — `_scan_bundle_populate_vulnerabilities` (lines
308-324)
_needs modification_
This is the unbounded detail-fetching loop with no limit on unique
vulnerabilities fetched and no per-request timeout.
```python
async def _scan_bundle_populate_vulnerabilities(
session: aiohttp.ClientSession,
component_vulns_map: dict[str, list[models.osv.VulnerabilityDetails]],
) -> None:
details_cache: dict[str, models.osv.VulnerabilityDetails] = {}
for vulns in component_vulns_map.values():
for i, vuln in enumerate(vulns):
vuln_id = vuln.id
if not vuln_id:
continue
details = details_cache.get(vuln_id)
if details is None:
details = await _fetch_vulnerability_details(session,
vuln_id)
details_cache[vuln_id] = details
vulns[i] = details
if _DEBUG:
print(f"[DEBUG] Fetched details for {len(details_cache)} unique
vulnerabilities")
```
#### `atr/sbom/osv.py` — `_fetch_vulnerability_details` (lines 208-217)
_needs modification_
Individual vulnerability detail fetch has no per-request timeout beyond the
session-level timeout.
```python
async def _fetch_vulnerability_details(
session: aiohttp.ClientSession,
vuln_id: str,
) -> models.osv.VulnerabilityDetails:
if _DEBUG:
print(f"[DEBUG] Fetching details for {vuln_id}")
async with session.get(f"{_OSV_API_BASE}/vulns/{vuln_id}") as response:
response.raise_for_status()
data = await response.json()
return models.osv.VulnerabilityDetails.model_validate(data)
```
### Proposed approach
Add three constants at the module level: `_MAX_PAGINATION_PAGES` (limits how
many pages the pagination loop fetches), `_MAX_VULNERABILITIES_PER_COMPONENT`
(stops accumulating vulns once a threshold is reached), and
`_MAX_VULNERABILITY_DETAILS` (caps the total number of unique vulnerability
details fetched). In `_paginate_query`, add checks against
`_MAX_PAGINATION_PAGES` and `_MAX_VULNERABILITIES_PER_COMPONENT` with warning
logs when limits are reached. In `_scan_bundle_populate_vulnerabilities`,
collect unique vulnerability IDs first, truncate to
`_MAX_VULNERABILITY_DETAILS` with a warning log, then fetch details with an
`asyncio.wait_for` wrapper providing a per-request timeout. Use the existing
`atr.log` module for warning messages rather than print statements.
### Suggested patches
#### `atr/sbom/osv.py`
Add pagination limits, vulnerability count caps, and per-request timeouts to
prevent worker resource exhaustion.
````diff
--- a/atr/sbom/osv.py
+++ b/atr/sbom/osv.py
@@ -18,7 +18,9 @@
from __future__ import annotations
+import asyncio
import os
from typing import TYPE_CHECKING, Any, Final
import aiohttp
from packageurl import PackageURL
+import atr.log as log
import atr.util as util
from . import models
@@ -35,6 +37,10 @@
_DEBUG: bool = os.environ.get("DEBUG_SBOM_TOOL") == "1"
_OSV_API_BASE: Final[str] = "https://api.osv.dev/v1"
_API_TIMEOUT: Final = aiohttp.ClientTimeout(total=60, connect=10)
+_MAX_PAGINATION_PAGES: Final[int] = 20
+_MAX_VULNERABILITIES_PER_COMPONENT: Final[int] = 500
+_MAX_VULNERABILITY_DETAILS: Final[int] = 200
+_VULNERABILITY_DETAIL_TIMEOUT: Final[int] = 30 # seconds per detail fetch
_SOURCE_DATABASE_NAMES: Final = {
"ASB": "Android Security Bulletin",
"PUB": "Android Security Bulletin",
@@ -209,6 +215,7 @@
async def _paginate_query(
session: aiohttp.ClientSession,
query: dict[str, Any],
page_token: str,
) -> list[models.osv.VulnerabilityDetails]:
all_vulns: list[models.osv.VulnerabilityDetails] = []
current_query = query.copy()
@@ -218,6 +225,16 @@
while True:
page += 1
+ if page > _MAX_PAGINATION_PAGES:
+ log.warning(
+ f"Reached maximum pagination pages
({_MAX_PAGINATION_PAGES}), stopping pagination for component"
+ )
+ break
+ if len(all_vulns) > _MAX_VULNERABILITIES_PER_COMPONENT:
+ log.warning(
+ f"Component has more than
{_MAX_VULNERABILITIES_PER_COMPONENT} vulnerabilities, truncating results"
+ )
+ break
if _DEBUG and (page > 1):
print(f"[DEBUG] Paginating query (page {page})")
results = await _fetch_vulnerabilities_for_batch(session,
[current_query])
@@ -265,14 +282,28 @@
async def _scan_bundle_populate_vulnerabilities(
session: aiohttp.ClientSession,
component_vulns_map: dict[str, list[models.osv.VulnerabilityDetails]],
) -> None:
details_cache: dict[str, models.osv.VulnerabilityDetails] = {}
+ # Collect all unique vuln IDs first to check limits
+ all_vuln_ids: set[str] = set()
+ for vulns in component_vulns_map.values():
+ for vuln in vulns:
+ if vuln.id:
+ all_vuln_ids.add(vuln.id)
+ if len(all_vuln_ids) > _MAX_VULNERABILITY_DETAILS:
+ log.warning(
+ f"Truncating vulnerability detail fetching from
{len(all_vuln_ids)} to {_MAX_VULNERABILITY_DETAILS}"
+ )
+ allowed_ids = set(list(all_vuln_ids)[:_MAX_VULNERABILITY_DETAILS])
+ else:
+ allowed_ids = all_vuln_ids
for vulns in component_vulns_map.values():
for i, vuln in enumerate(vulns):
vuln_id = vuln.id
- if not vuln_id:
+ if not vuln_id or vuln_id not in allowed_ids:
continue
details = details_cache.get(vuln_id)
if details is None:
- details = await _fetch_vulnerability_details(session,
vuln_id)
+ try:
+ details = await asyncio.wait_for(
+ _fetch_vulnerability_details(session, vuln_id),
+ timeout=_VULNERABILITY_DETAIL_TIMEOUT,
+ )
+ except asyncio.TimeoutError:
+ log.warning(f"Timeout fetching vulnerability details
for {vuln_id}")
+ continue
details_cache[vuln_id] = details
vulns[i] = details
if _DEBUG:
print(f"[DEBUG] Fetched details for {len(details_cache)} unique
vulnerabilities")
````
### Open questions
- What are appropriate default values for the limits? The proposed
_MAX_PAGINATION_PAGES=20, _MAX_VULNERABILITIES_PER_COMPONENT=500,
_MAX_VULNERABILITY_DETAILS=200 seem reasonable but may need tuning based on
real-world OSV query patterns.
- Should the per-detail timeout be 10 seconds (as the issue suggests) or 30
seconds (which is more conservative given the session-level 60s timeout)? I
chose 30 to reduce false timeouts but the team should decide.
- When vulnerabilities are truncated due to _MAX_VULNERABILITY_DETAILS, the
remaining vulnerability entries in component_vulns_map retain their stub data
(just ID and modified date from the batch query). Should these be filtered out
of the final results or kept as partial entries?
- Should these limits be configurable via environment variables or
atr.config rather than hard-coded constants?
### Files examined
- `atr/sbom/osv.py`
- `atr/tasks/sbom.py`
- `atr/sbom/models/osv.py`
- `atr/sbom/cli.py`
- `atr/post/sbom.py`
- `atr/sbom/utilities.py`
### Related issues
This issue appears related to: #1062.
_Both address unbounded response sizes and pagination issues on API
endpoints_
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]