This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-steward.git


The following commit(s) were added to refs/heads/main by this push:
     new fe71a0f3 chore(ci): run lychee link-check offline; drop dead sandbox network domains (#501)
fe71a0f3 is described below

commit fe71a0f3cada16d3e904ec8a835a5b0d6805561a
Author: Jarek Potiuk <[email protected]>
AuthorDate: Thu Jun 11 23:35:00 2026 +0200

    chore(ci): run lychee link-check offline; drop dead sandbox network domains (#501)
    
    * chore(ci): run lychee link-check offline; drop dead sandbox network domains
    
    The `lychee` prek hook links macOS SecureTransport (`native-tls`), whose TLS
    handshake fails through the secure-agent sandbox's CONNECT proxy on macOS 26
    (`OSStatus -26276`) even though the certs are valid, there is no MITM, and
    trustd is reachable, so online external-link checking cannot pass in-sandbox.
    `enableWeakerNetworkIsolation` no longer rescues it on this OS.
    
    Switch the hook to offline mode (`offline = true` in `.lychee.toml`): it now
    validates only local cross-file and anchor references, which is the in-repo
    reference integrity this hook is really for. External-URL liveness was flaky
    and rate-limited anyway (hence the long ASF-infra `exclude` list) and is no
    longer checked anywhere.
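
To make the scope of an offline check concrete, here is a minimal Python sketch of "local references only" validation. It is not lychee's implementation: the helper names (`slugify`, `anchors_in`, `check_local_links`) are hypothetical, only inline markdown links are handled, and the anchor rule is a rough approximation of GitHub-style heading slugs.

```python
import re
from pathlib import Path

# Hypothetical helpers for illustration; lychee's real logic is more thorough.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")             # [text](target)
HEADING_RE = re.compile(r"^#{1,6}\s+(.*)$", re.MULTILINE)    # markdown headings
SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.\-]*:", re.IGNORECASE)


def slugify(heading: str) -> str:
    """Approximate a heading anchor: lowercase, drop punctuation, spaces to hyphens."""
    text = re.sub(r"[^\w\- ]", "", heading.strip().lower())
    return text.replace(" ", "-")


def anchors_in(text: str) -> set[str]:
    """Collect the anchors that the headings of one document would produce."""
    return {slugify(h) for h in HEADING_RE.findall(text)}


def check_local_links(path: Path) -> list[str]:
    """Return broken local references in one markdown file; remote URLs are never fetched."""
    text = path.read_text(encoding="utf-8")
    problems = []
    for target in LINK_RE.findall(text):
        if SCHEME_RE.match(target):
            continue  # offline: skip http(s)://, mailto:, and any other scheme
        file_part, _, fragment = target.partition("#")
        dest = path if not file_part else path.parent / file_part
        if not dest.exists():
            problems.append(f"{path}: missing file {file_part}")
            continue
        if fragment and dest.is_file():
            if fragment not in anchors_in(dest.read_text(encoding="utf-8")):
                problems.append(f"{path}: missing anchor #{fragment} in {dest.name}")
    return problems
```

Calling `check_local_links(Path("docs/setup/secure-agent-setup.md"))` would report missing files and anchors without opening a single network connection, which is the property that lets the hook pass inside the sandbox.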
    
    With external link-checking gone, the wildcard link-target domains that were
    allowlisted purely so lychee could reach them (`*.apache.org`, `*.anthropic.com`,
    `*.claude.com`, `*.mitre.org`, `*.nist.gov`, `*.github.io`, `gist.github.com`,
    `astral.sh`, `json.schemastore.org`, `lychee.cli.rs`, `sdkman.io`) are dead
    weight — drop them from the sandbox allowlist. Kept: `*.crates.io` +
    `static.rust-lang.org` (still needed to build lychee) and
    `enableWeakerNetworkIsolation` (gh / gcloud / Go-tool TLS, per the schema).
    
    - .lychee.toml: offline = true; header rewritten to local-references-only
    - .claude/settings.json: drop the 11 dead lychee link-target domains
    - docs/setup/secure-agent-setup.md (isolation-setup template): same domain
      removal; add ~/.rustup + ~/.cargo write/read and static.rust-lang.org so a
      fresh in-sandbox setup can actually build lychee's toolchain; replace the
      excludedCommands/TLS workaround note with an offline-mode explanation
    
    Generated-by: Claude Code (Opus 4.8)
    
    * fix(sandbox-lint): drop the dead lychee domains from the baseline too
    
    The sandbox-lint M.29 invariant requires `.claude/settings.json` and
    `tools/sandbox-lint/expected.json` to stay in lockstep (two files, two
    edits, one review surface). The prior commit removed the 11 dead lychee
    link-target domains from the live settings but not the baseline, so the
    `sandbox-lint` CLI and its `test_baseline_file_matches_live_settings` /
    `test_main_exits_zero_on_repo` tests failed in CI. Mirror the same
    removal in expected.json.
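
The lockstep requirement described above can be sketched as a small Python comparison of the two parsed files. This is an illustration under assumptions, not the actual `sandbox-lint` code: `find_drift` and `check_lockstep` are hypothetical names, both files are assumed to be plain JSON without comments, and list order is treated as significant.

```python
import json
from pathlib import Path


def find_drift(live, baseline, path="$"):
    """Recursively list the places where two parsed JSON values differ."""
    if isinstance(live, dict) and isinstance(baseline, dict):
        drift = []
        for key in sorted(live.keys() | baseline.keys()):
            if key not in baseline:
                drift.append(f"{path}.{key}: only in live settings")
            elif key not in live:
                drift.append(f"{path}.{key}: only in baseline")
            else:
                drift.extend(find_drift(live[key], baseline[key], f"{path}.{key}"))
        return drift
    if isinstance(live, list) and isinstance(baseline, list):
        if live == baseline:
            return []
        # Name the entries that exist on one side only, e.g. stale domains.
        only_live = [x for x in live if x not in baseline]
        only_base = [x for x in baseline if x not in live]
        return [f"{path}: only in live {only_live}, only in baseline {only_base}"]
    return [] if live == baseline else [f"{path}: {live!r} != {baseline!r}"]


def check_lockstep(live_path: Path, baseline_path: Path) -> list[str]:
    """Load both files and return every divergence; an empty list means lockstep."""
    live = json.loads(live_path.read_text(encoding="utf-8"))
    baseline = json.loads(baseline_path.read_text(encoding="utf-8"))
    return find_drift(live, baseline)
```

Run against `.claude/settings.json` and `tools/sandbox-lint/expected.json` after the first commit alone, a check of this shape would list the 11 removed domains as present only in the baseline, which matches the CI failure this follow-up fixes.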
    
    Generated-by: Claude Code (Opus 4.8)
---
 .claude/settings.json            | 13 +------------
 .lychee.toml                     | 24 ++++++++++++++++++++++--
 docs/setup/secure-agent-setup.md | 34 +++++++++++++++++++++++++++-------
 tools/sandbox-lint/expected.json | 13 +------------
 4 files changed, 51 insertions(+), 33 deletions(-)

diff --git a/.claude/settings.json b/.claude/settings.json
index 63fa217f..91db81e3 100644
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -43,18 +43,7 @@
         "cveawg.mitre.org",
         "oauth2.googleapis.com",
         "gmail.googleapis.com",
-        "*.crates.io",
-        "*.apache.org",
-        "*.anthropic.com",
-        "*.claude.com",
-        "*.mitre.org",
-        "*.nist.gov",
-        "*.github.io",
-        "gist.github.com",
-        "astral.sh",
-        "json.schemastore.org",
-        "lychee.cli.rs",
-        "sdkman.io"
+        "*.crates.io"
       ],
       "enableWeakerNetworkIsolation": true
     }
diff --git a/.lychee.toml b/.lychee.toml
index 98985137..07a5aed2 100644
--- a/.lychee.toml
+++ b/.lychee.toml
@@ -1,9 +1,13 @@
 # Lychee link checker config for apache/airflow-steward.
 #
-# Validates every link in markdown / rst / .md.j2 files:
+# Runs in OFFLINE mode (see `offline` below): validates only *local*
+# references in markdown / rst / .md.j2 files:
 #   * cross-file file existence — `[text](other.md)`
 #   * cross-file fragments     — `[text](other.md#anchor)`
-#   * external URLs            — HTTP 2xx
+#   * same-file fragments      — `[text](#anchor)`
+# External `http(s)://` URLs are intentionally NOT fetched — see the
+# `offline` note below for why. Remote-link liveness is not checked
+# anywhere; the link check exists to keep in-repo references intact.
 #
 # Run via prek (locally and in CI) as the `lychee` hook in
 # `.pre-commit-config.yaml` — prek installs lychee itself, so no local
@@ -21,6 +25,22 @@
 # `lychee-action`. The v0.23.x boolean form (`true`) no longer parses.
 include_fragments = "anchor-only"
 
+# Offline mode — check only local file/anchor references, never fetch
+# remote URLs. Two reasons:
+#   1. Scope: this hook's job is in-repo reference integrity, not
+#      external-link liveness (which is flaky and rate-limited — note
+#      the long `exclude` list of ASF infra hosts below that existed
+#      purely to tame online checking).
+#   2. Sandbox compatibility: the cargo/brew lychee links macOS
+#      SecureTransport (`native-tls`), whose TLS handshake fails
+#      through the secure-agent sandbox's CONNECT proxy on macOS 26
+#      (`OSStatus -26276`) even though certs are valid. Offline mode
+#      makes no network calls, so the hook passes cleanly in-sandbox.
+# The network-related settings below (timeout / retry / accept /
+# exclude / cache) are dormant while offline = true, kept for
+# reference / a future opt-in online check.
+offline = true
+
 # Concurrency cap — kept moderate to avoid being rate-limited by GitHub.
 max_concurrency = 14
 
diff --git a/docs/setup/secure-agent-setup.md b/docs/setup/secure-agent-setup.md
index 07037183..f5277c2c 100644
--- a/docs/setup/secure-agent-setup.md
+++ b/docs/setup/secure-agent-setup.md
@@ -355,6 +355,18 @@ below, annotated.
 {
   "sandbox": {
     "enabled": true,
+    // The `lychee` link-check hook runs in OFFLINE mode (`offline =
+    // true` in `.lychee.toml`): it validates only local cross-file and
+    // anchor references and never fetches remote URLs, so it makes no
+    // network calls and needs no in-sandbox TLS at runtime. This
+    // sidesteps a macOS-26 issue where the sandbox's CONNECT proxy is
+    // incompatible with SecureTransport (the `native-tls` stack the
+    // cargo/brew lychee links): online link checks fail every external
+    // URL with `OSStatus -26276` even though the certs are valid and
+    // `enableWeakerNetworkIsolation` is set. Building lychee still
+    // needs the rust toolchain (see the `~/.rustup`/`~/.cargo` +
+    // `*.crates.io`/`static.rust-lang.org` entries below); only its
+    // *runtime* network use is eliminated.
     "filesystem": {
       "denyRead": ["~/"],          // default-deny the entire home dir for Bash subprocesses
       "allowRead": [
@@ -364,6 +376,8 @@ below, annotated.
         "~/.config/gh/",              // gh CLI auth (token in hosts.yml)
         "~/.cache/",                  // dev tool caches (uv HTTP cache, prek logs, ruff/mypy caches)
         "~/.local/share/uv/",         // uv's tool venvs (prek, etc.)
+        "~/.rustup/",                 // rustup toolchains (the `lychee` rust hook builds against them)
+        "~/.cargo/",                  // cargo registry + the lychee binary the rust hook installs
         "~/.local/bin/",              // uv-installed tool entry points
         "~/.config/apache-magpie/",  // Gmail OAuth refresh token (oauth-draft tool)
         "~/.gnupg/",                  // gpg keys (commit signing)
@@ -371,7 +385,9 @@ below, annotated.
       ],
       "allowWrite": [
         "~/.cache/",                  // uv lock files, prek log + state, ruff/mypy caches
-        "~/.local/share/uv/"          // uv's tool venvs (prek installs new hook envs here)
+        "~/.local/share/uv/",         // uv's tool venvs (prek installs new hook envs here)
+        "~/.rustup/",                 // rustup writes settings.toml + downloaded toolchains (first run of the `lychee` rust hook)
+        "~/.cargo/"                   // cargo registry cache + the compiled lychee binary
       ]
     },
     "network": {
@@ -382,12 +398,16 @@ below, annotated.
         "lists.apache.org", "dist.apache.org", "downloads.apache.org", "archive.apache.org",
         "cveprocess.apache.org", "cve.org", "www.cve.org", "cveawg.mitre.org",
         "oauth2.googleapis.com", "gmail.googleapis.com",
-        // Added with the `lychee` link-check prek hook: the hosts the
-        // framework's own docs link to (so lychee passes in-sandbox)
-        // plus `*.crates.io` (so the rust hook can `cargo install` lychee).
-        "*.crates.io", "*.apache.org", "*.anthropic.com", "*.claude.com",
-        "*.mitre.org", "*.nist.gov", "*.github.io", "gist.github.com",
-        "astral.sh", "json.schemastore.org", "lychee.cli.rs", "sdkman.io"
+        // `*.crates.io` + `static.rust-lang.org` let the `lychee` rust
+        // hook bootstrap a rustup toolchain and `cargo install` lychee
+        // on first run (rustup downloads the toolchain from
+        // static.rust-lang.org; crate deps come from crates.io). These
+        // are the ONLY hosts lychee needs: it runs offline (see
+        // `.lychee.toml`), so it never fetches the external URLs the
+        // docs link to — the wildcard link-target hosts that used to
+        // live here (`*.apache.org`, `*.nist.gov`, `lychee.cli.rs`, …)
+        // were removed when the hook went offline.
+        "*.crates.io", "static.rust-lang.org"
       ],
       // Lets native-TLS CLI tools (lychee — and, per the schema, gh /
       // gcloud / terraform) verify TLS through the sandbox's
diff --git a/tools/sandbox-lint/expected.json b/tools/sandbox-lint/expected.json
index 63fa217f..91db81e3 100644
--- a/tools/sandbox-lint/expected.json
+++ b/tools/sandbox-lint/expected.json
@@ -43,18 +43,7 @@
         "cveawg.mitre.org",
         "oauth2.googleapis.com",
         "gmail.googleapis.com",
-        "*.crates.io",
-        "*.apache.org",
-        "*.anthropic.com",
-        "*.claude.com",
-        "*.mitre.org",
-        "*.nist.gov",
-        "*.github.io",
-        "gist.github.com",
-        "astral.sh",
-        "json.schemastore.org",
-        "lychee.cli.rs",
-        "sdkman.io"
+        "*.crates.io"
       ],
       "enableWeakerNetworkIsolation": true
     }
