potiuk opened a new pull request, #499: URL: https://github.com/apache/airflow-steward/pull/499
## Summary

Converts the standalone **`link-check.yml`** workflow into a **`lychee` prek hook**, and broadens the default sandbox network settings enough for that hook to actually run inside the secure sandbox.

**Link check → prek hook**

- `.pre-commit-config.yaml`: new `lychee` hook, **`language: rust`** with `additional_dependencies: ["cli:lychee"]`. prek installs lychee itself (via cargo), so the hook **does not depend on a locally-installed lychee**, and the prek CI job needs no extra install step. Commit-stage, whole-repo scan, gated on `.md`/`.rst`/`.j2`; tracks latest lychee.
- `.lychee.toml`: `include_fragments` `true` → `"anchor-only"` (the lychee v0.24+ enum form; the boolean no longer parses).
- `.asf.yaml`: drops the now-dead `lychee` required status check, since the required **`prek`** context covers it. Deletes `link-check.yml`.
- `pre-commit.yml`: caches prek hook envs (so lychee isn't recompiled every run), restores the lychee result cache, and passes `GITHUB_TOKEN` so GitHub link checks aren't rate-limited.

**Broadened default `sandbox.network` settings** (mirrored into the sandbox-lint baseline per mitigation M.29)

- Curated **wildcard `allowedDomains`** covering the hosts the framework's own docs reach (`*.apache.org`, `*.anthropic.com`, `*.claude.com`, `*.mitre.org`, `*.nist.gov`, `*.github.io`, `astral.sh`, `json.schemastore.org`, `lychee.cli.rs`, `sdkman.io`, `gist.github.com`) + `*.crates.io` for the cargo install.
- **`enableWeakerNetworkIsolation: true`** so native-TLS CLI tools (lychee, and per the schema also `gh`/`gcloud`/`terraform`) can verify TLS through the sandbox's TLS-terminating proxy.

## ⚠️ Security review note

This PR **loosens the default sandbox posture** and should be reviewed as such:

- `enableWeakerNetworkIsolation: true` carries the schema's documented trade-off: it "reduces security — opens a potential data-exfiltration vector through the trustd service." Without it, lychee fails every external link with `failed to verify TLS certificate` in-sandbox. It is a **no-op outside the sandbox** (e.g. CI). The `setup-isolated-setup-update` skill surfaces it **with this caveat** so adopters opt in consciously rather than silently inheriting it.
- The allowlist additions are wildcards over ASF / Anthropic / MITRE / NIST / a few dev-tool hosts. The baseline (`tools/sandbox-lint/expected.json`) is updated so the change is explicit and lint-gated.

## Type of change

- [x] CI / dev loop (prek, workflows)
- [x] Cross-cutting (sandbox / settings.json / threat-model M.29)
- [x] Documentation (`secure-agent-setup.md`, `AGENTS.md`, `CONTRIBUTING.md`)
- [x] Skill change (`setup-isolated-setup-update` drift check)

## Test plan

- [x] `lychee --config .lychee.toml .` whole-repo, online, with the curated allowlist + `enableWeakerNetworkIsolation` active → **0 errors, 4823 OK** (TLS verified through the proxy). With `--insecure` (allowlist only) also 0 errors, confirming the host set is exactly sufficient.
- [x] `sandbox-lint` passes (baseline mirrors live settings; invariants satisfied).
- [x] `prek run --all-files` green for the touched hooks (markdownlint, typos, doctoc, check-placeholders, skill-and-tool-validate).
- [ ] The `lychee` hook itself was **SKIPped on the authoring machine's local commit** (`enableWeakerNetworkIsolation` needs a Claude Code restart to activate, and the sandbox blocks writing the protected `settings.json` during the EOF-fixer hooks). **CI's `prek` job runs the real lychee check**, and that is the authoritative gate.

## Notes for reviewers

- Dropping `link-check.yml` also drops its **daily cron rot sweep**: link rot on files no PR touches is now only caught when a PR next edits them. Called out in `.asf.yaml`.
- The redundant specific apache/mitre entries left in `allowedDomains` (now covered by the wildcards) were kept to minimise the diff; happy to dedupe if preferred.
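
If the dedupe mentioned in the last reviewer note is wanted, the redundant entries could be found mechanically. The following is a minimal sketch, not part of this PR: `covered_by_wildcard` is a hypothetical helper, the sample list is an illustrative excerpt rather than the real `allowedDomains`, and it assumes glob-style matching in which `*.apache.org` covers any subdomain but not the bare `apache.org`. The sandbox's actual matching rules may differ.

```python
from fnmatch import fnmatchcase


def covered_by_wildcard(domains):
    """Return the exact entries that a wildcard entry in the same list already matches.

    Assumption: wildcards behave like shell globs on the full hostname,
    which may not be how the sandbox evaluates `allowedDomains`.
    """
    wildcards = [d for d in domains if d.startswith("*.")]
    exact = [d for d in domains if "*" not in d]
    return [d for d in exact if any(fnmatchcase(d, w) for w in wildcards)]


# Hypothetical excerpt, for illustration only (not the real settings.json content).
allowed = [
    "*.apache.org",
    "*.mitre.org",
    "issues.apache.org",
    "cve.mitre.org",
    "astral.sh",
]

print(covered_by_wildcard(allowed))
```

Under the stated assumption, only the two specific apache/mitre hosts in the sample are reported; `astral.sh` stays because no wildcard covers it.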
