potiuk opened a new pull request, #499:
URL: https://github.com/apache/airflow-steward/pull/499

   ## Summary
   
   Converts the standalone **`link-check.yml`** workflow into a **`lychee` prek 
hook**, and broadens the default sandbox network settings enough for that hook 
to actually run inside the secure sandbox.
   
   **Link check → prek hook**
   - `.pre-commit-config.yaml`: new `lychee` hook, **`language: rust`** with 
`additional_dependencies: ["cli:lychee"]` — prek installs lychee itself 
(cargo), so the hook **does not depend on a locally-installed lychee**, and the 
prek CI job needs no extra install step. Commit-stage, whole-repo scan, gated 
on `.md`/`.rst`/`.j2`; tracks latest lychee.
   - `.lychee.toml`: `include_fragments` `true` → `"anchor-only"` (the lychee 
v0.24+ enum form; the boolean no longer parses).
   - `.asf.yaml`: drops the now-dead `lychee` required status check — the 
required **`prek`** context covers it. Deletes `link-check.yml`.
   - `pre-commit.yml`: caches prek hook envs (so lychee isn't recompiled every 
run), restores the lychee result cache, and passes `GITHUB_TOKEN` so GitHub 
link checks aren't rate-limited.
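   
   For orientation, a minimal sketch of what such a hook entry could look like. Only `language: rust` and the `additional_dependencies` value are taken from this description; the `repo: local` layout, the `name`, the `entry` command (borrowed from the test plan below), the `files` pattern, and the `stages` value are assumptions for illustration and may differ from the actual diff.
   
   ```yaml
   repos:
     - repo: local
       hooks:
         - id: lychee
           name: Check links with lychee   # hypothetical display name
           language: rust
           # prek builds lychee via cargo, so no system-wide lychee is needed.
           # No version is pinned here, which is what "tracks latest" implies.
           additional_dependencies: ["cli:lychee"]
           # Assumed entry: scan the whole repo rather than only changed files.
           entry: lychee --config .lychee.toml .
           pass_filenames: false
           # Assumed gate: run only when a .md, .rst, or .j2 file is staged.
           files: \.(md|rst|j2)$
           stages: [pre-commit]
   ```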
   
   **Broadened default `sandbox.network` settings** (mirrored into the 
sandbox-lint baseline per mitigation M.29)
   - Curated **wildcard `allowedDomains`** covering the hosts the framework's 
own docs reach (`*.apache.org`, `*.anthropic.com`, `*.claude.com`, 
`*.mitre.org`, `*.nist.gov`, `*.github.io`, `astral.sh`, 
`json.schemastore.org`, `lychee.cli.rs`, `sdkman.io`, `gist.github.com`) + 
`*.crates.io` for the cargo install.
   - **`enableWeakerNetworkIsolation: true`** so native-TLS CLI tools (lychee — 
and per the schema also `gh`/`gcloud`/`terraform`) can verify TLS through the 
sandbox's TLS-terminating proxy.
   
   ## ⚠️ Security review note
   
   This PR **loosens the default sandbox posture** and should be reviewed as 
such:
   - `enableWeakerNetworkIsolation: true` carries the schema's documented 
trade-off — it "reduces security — opens a potential data-exfiltration vector 
through the trustd service." Without it, lychee fails every external link with 
`failed to verify TLS certificate` in-sandbox. It is a **no-op outside the 
sandbox** (e.g. CI). The `setup-isolated-setup-update` skill surfaces it **with 
this caveat** so adopters opt in consciously rather than silently inheriting it.
   - The allowlist additions are wildcards over ASF / Anthropic / MITRE / NIST 
/ a few dev-tool hosts. The baseline (`tools/sandbox-lint/expected.json`) is 
updated so the change is explicit and lint-gated.
   
   ## Type of change
   
   - [x] CI / dev loop (prek, workflows)
   - [x] Cross-cutting (sandbox / settings.json / threat-model M.29)
   - [x] Documentation (`secure-agent-setup.md`, `AGENTS.md`, `CONTRIBUTING.md`)
   - [x] Skill change (`setup-isolated-setup-update` drift check)
   
   ## Test plan
   
   - [x] `lychee --config .lychee.toml .` whole-repo, online, with the curated 
allowlist + `enableWeakerNetworkIsolation` active → **0 errors, 4823 OK** (TLS 
verified through the proxy). With `--insecure` (allowlist only) it also reports 0 errors, 
confirming the allowlisted host set is sufficient.
   - [x] `sandbox-lint` passes (baseline mirrors live settings; invariants 
satisfied).
   - [x] `prek run --all-files` green for the touched hooks (markdownlint, 
typos, doctoc, check-placeholders, skill-and-tool-validate).
   - [ ] The `lychee` hook itself was **SKIPped on the authoring machine's 
local commit** (`enableWeakerNetworkIsolation` needs a Claude Code restart to 
activate, and the sandbox blocks writing the protected `settings.json` during 
the EOF-fixer hooks). **CI's `prek` job runs the real lychee check** — that's 
the authoritative gate.
   
   ## Notes for reviewers
   
   - Dropping `link-check.yml` also drops its **daily cron rot sweep** — link 
rot on files no PR touches is now only caught when a PR next edits them. Called 
out in `.asf.yaml`.
   - The host-specific Apache/MITRE entries already in `allowedDomains` are now 
redundant (the wildcards cover them) but were kept to minimise the diff; happy 
to dedupe if preferred.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

