ppkarwasz opened a new pull request, #26:
URL: https://github.com/apache/logging-site/pull/26
This change replaces our hand-maintained `src/site/static/cyclonedx/vdr.xml`
with a generated artifact assembled from one source file per `(CVE, component)`
pair under `src/vulnerabilities/`.
To regenerate the VDR after editing any per-CVE file:
```
uv run scripts/vdr_aggregate.py
```
To split an existing monolithic VDR back into per-CVE files (one-time
migration, or recovery):
```
uv run scripts/vdr_split.py
```
## Why
The current hand-edited VDR is becoming hard to maintain reliably:
1. **Timestamps drift.** In the latest release we forgot to bump
`metadata.timestamp` to the most recent `vulnerability.updated` value. The
aggregator now computes this automatically.
2. **Ordering is hard to keep straight.** Vulnerabilities in the file are
not strictly sorted, and components are listed in an ad-hoc order. The
aggregator enforces a deterministic order: vulnerabilities by
`(year DESC, number DESC)`, components alphabetically by `bom-ref`.
3. **Merge conflicts on simultaneous additions.** Adding seven
vulnerabilities to one shared file in a single batch (as in the most recent
disclosure) is error-prone and invites merge conflicts. Per-CVE files let
contributors add or edit vulnerabilities independently.
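For illustration, the two invariants from items 1 and 2 can be sketched in a few lines of Python. This is not the code in `vdr_aggregate.py`: `cve_sort_key` and `latest_timestamp` are hypothetical names, and the sketch assumes ids of the form `CVE-<year>-<number>` and ISO 8601 timestamps that all carry an explicit UTC offset.

```python
import re
from datetime import datetime


def cve_sort_key(cve_id: str) -> tuple[int, int]:
    """Sort key that orders CVE ids by (year DESC, number DESC)."""
    match = re.fullmatch(r"CVE-(\d{4})-(\d+)", cve_id)
    if match is None:
        raise ValueError(f"not a CVE id: {cve_id!r}")
    # Negate both parts so that a plain ascending sort yields newest first.
    return (-int(match.group(1)), -int(match.group(2)))


def latest_timestamp(updated_values: list[str]) -> str:
    """Return the most recent timestamp, comparing parsed datetimes
    rather than raw strings (assumption: every value has a UTC offset)."""
    return max(updated_values, key=datetime.fromisoformat)
```

Comparing numerically rather than lexically matters because CVE numbers vary in length: as plain strings, `CVE-2021-9999` would sort after `CVE-2021-44228`.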
## How it works
Each vulnerability lives in its own file at
`src/vulnerabilities/<CVE-id>/<component>.cdx.xml`: a self-contained CycloneDX
1.7 BOM with the affected component as `metadata.component` and a single
`<vulnerability>` element. `log4cxx-conan` never gets its own file; its
vulnerabilities ride along in the corresponding `log4cxx` file via a
`<components>` entry plus a `<dependencies>` edge.
`vdr_aggregate.py` walks every per-CVE file, dedupes components by
`bom-ref`, dedupes vulnerabilities by CVE id, and emits the monolithic
`vdr.xml`. `vdr_split.py` performs the inverse for migration. Both scripts
share `vdr_common.py` (constants, namespace handling, comparison,
write-if-changed orchestration).
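As a rough sketch of the aggregation step, with plain dictionaries standing in for the parsed BOMs: the function name `aggregate` and the record shape are hypothetical, and merging duplicate CVEs by taking the union of their `affects` entries is an assumption about how duplicates are reconciled, not a description of `vdr_aggregate.py`.

```python
def aggregate(boms: list[dict]) -> dict:
    """Merge per-CVE records, deduplicating components by bom-ref
    and vulnerabilities by CVE id."""
    components: dict[str, dict] = {}
    vulnerabilities: dict[str, dict] = {}
    for bom in boms:
        for component in bom["components"]:
            # The first occurrence of a bom-ref wins; repeats are dropped.
            components.setdefault(component["bom-ref"], component)
        for vuln in bom["vulnerabilities"]:
            merged = vulnerabilities.setdefault(
                vuln["id"], {"id": vuln["id"], "affects": []}
            )
            # Assumption: the same CVE seen in several files is reconciled
            # by unioning the affected refs, preserving first-seen order.
            for ref in vuln["affects"]:
                if ref not in merged["affects"]:
                    merged["affects"].append(ref)

    def newest_first(cve_id: str) -> tuple[int, ...]:
        # "CVE-2017-5645" -> (-2017, -5645), i.e. year DESC, number DESC.
        return tuple(-int(part) for part in cve_id.split("-")[1:])

    return {
        "components": [components[ref] for ref in sorted(components)],
        "vulnerabilities": [
            vulnerabilities[cve] for cve in sorted(vulnerabilities, key=newest_first)
        ],
    }
```

Keying both collections by their identifier makes the result independent of the order in which the per-CVE files are visited, apart from the order of entries inside `affects`.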
### Idempotent writes
Both scripts read the existing output's `serialNumber` and `version`, build
a candidate at the existing version, and compare it to the file on disk via a
structural comparison that ignores comments, inter-element whitespace, and
namespace prefixes. If the candidate is equivalent, the file is left untouched:
no diff, no version churn. If it differs, the version is bumped by one and the
file is rewritten.
This means re-running either script in a clean tree is a no-op, and a
content edit produces exactly one version bump per affected file.
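The compare-then-bump logic might look roughly like the sketch below. It is a simplified stand-in, not the helpers in `vdr_common.py`: `structurally_equal`, `next_output`, and the `build(serial, version)` callback are hypothetical names. It relies on two `xml.etree.ElementTree` behaviours: the default parser drops comments, and tags are stored as `{namespace}local`, so prefixes never take part in the comparison.

```python
import uuid
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def _text(value: Optional[str]) -> str:
    """Treat missing or whitespace-only text as empty."""
    return value if value and value.strip() else ""


def structurally_equal(a: ET.Element, b: ET.Element) -> bool:
    """Compare two trees, ignoring inter-element whitespace."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and _text(a.text) == _text(b.text)
        and _text(a.tail) == _text(b.tail)
        and len(a) == len(b)
        and all(structurally_equal(x, y) for x, y in zip(a, b))
    )


def next_output(
    existing_xml: Optional[str], build: Callable[[str, int], str]
) -> Optional[str]:
    """Return the XML to write, or None when the file on disk is already
    equivalent. `build(serial, version)` renders a candidate document."""
    if existing_xml is None:
        # No previous output: start a fresh document at version 1.
        return build(f"urn:uuid:{uuid.uuid4()}", 1)
    existing = ET.fromstring(existing_xml)
    serial = existing.get("serialNumber")
    version = int(existing.get("version", "1"))
    # Build the candidate at the existing version so that the version
    # attribute itself cannot cause a spurious difference.
    if structurally_equal(existing, ET.fromstring(build(serial, version))):
        return None
    return build(serial, version + 1)
```

A caller would write the returned text to disk only when it is not `None`, which is what makes a re-run in a clean tree a no-op.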
## Why split per (CVE, component), beyond automation
1. **Path to VEX.** A monolithic VDR has no meaningful `metadata.component`,
since it covers many subjects. Per-component files let `metadata.component`
name the analyzing project (e.g. `log4j-core`), with the vulnerable dependency
in `vulnerability.affects` and the dependency path in `<dependencies>`. That's
the shape required for VEX, CSAF, and OpenVEX, so we can grow into those
formats without restructuring our source of truth.
2. **Easier AsciiDoc generation.** Per-CVE files let `_vulnerabilities.adoc`
be assembled from one generated partial per CVE, instead of a single monolithic
AsciiDoc file. We could also decide later to publish a separate page per CVE.
## Repository layout
```
scripts/
  vdr_common.py      # shared helpers (constants, clone, serialize,
                     # equivalent, write_bom_if_changed)
  vdr_aggregate.py   # per-CVE files -> vdr.xml
  vdr_split.py       # vdr.xml -> per-CVE files
src/vulnerabilities/
  CVE-2017-5645/log4j-core.cdx.xml
  CVE-2018-1285/log4net.cdx.xml
  ...
  template.cdx.xml   # editable template for new CVEs
src/site/static/cyclonedx/vdr.xml   # generated output
```
## Notes for reviewers
- The aggregated `vdr.xml` is now CycloneDX 1.7 (was 1.6). The bump is
intentional: CycloneDX 1.x is semantically versioned, and the only structural
change for our file is a namespace rename.
- Component ordering changed: alphabetical by `bom-ref`, so `log4j-1.2-api`
now precedes `log4j-core`.
- The header comment of the aggregated file now warns that it is generated and
points at `uv run scripts/vdr_aggregate.py` for updates.