andrewmusselman opened a new issue, #10: URL: https://github.com/apache/tooling-gofannon/issues/10
# Gofannon Dependabot remediation - application playbook

With `.asf.yaml` added we got 123 alerts: https://github.com/apache/tooling-gofannon/security/dependabot

Three PRs land in order: **A → B → C**. Each closes a chunk of the 123 open alerts on `apache/tooling-gofannon` without touching production deploy posture or adding new infrastructure.

**Scope explicitly excluded:** publishing the Docusaurus website to `apache.github.io/tooling-gofannon`. Not in this batch.

> **Read this whole document before starting.** The supply-chain caveats in §0 affect every step.

---

## §0 - Supply-chain protections before you start

Two supply-chain incidents are recent: a malicious payload acting as a dead man's switch, reported in a GitHub issue about 2 hours before this playbook was written, and an active compromise in the Ruby gem ecosystem. The protections below assume that **any fresh npm version could be malicious**, and reduce blast radius if one is.

### 0a. Run all `pnpm install` / `npm install` commands inside a throwaway container

Not on your laptop. Each PR's regen step gets a fresh container:

```bash
docker run --rm -it \
  -v "$(pwd)":/work \
  -w /work \
  --network host \
  node:20-alpine sh

# inside the container, install pnpm if needed:
npm install -g pnpm@10.28.2

# ... run the regen commands per the PR below ...

# Exit when done. The container is destroyed. The lockfile changes
# remain in your working tree, but any malicious postinstall payload
# is gone with the container.
exit
```

Use `--network host` only if your network requires it (proxy, VPN); otherwise leave it off so the container does not share the host's network stack.

### 0b. Disable install scripts at the regen step

The pnpm changes in PR A include `"pnpm": { "onlyBuiltDependencies": [] }`, which tells pnpm 10 not to run install scripts for any package. pnpm will *warn* about packages that wanted to run scripts (typically `esbuild`, `core-js`, etc.). If a warning is for a package you trust, add it to the allowlist and re-install.
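To see ahead of time which installed packages declare install-time lifecycle scripts (the ones pnpm would warn about), something like the following may help. It is a minimal sketch, assuming a POSIX shell and an already-populated `node_modules` tree; the function name `list_install_script_pkgs` is hypothetical, and the match is a text heuristic over `package.json` files rather than a real JSON parse, so expect occasional false positives.

```bash
# Hypothetical helper (not part of pnpm or npm): print the package.json
# paths under a directory that mention a preinstall/install/postinstall key.
# Heuristic text match only; a dependency literally named "install" would
# also match.
list_install_script_pkgs() {
  dir="${1:-node_modules}"
  find "$dir" -name package.json -type f \
    -exec grep -lE '"(preinstall|install|postinstall)"[[:space:]]*:' {} + \
    | sort
}
```

Run it as `list_install_script_pkgs webapp/node_modules` inside the throwaway container and review the listed packages before allowlisting any of them.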
For PR C (npm-based website), explicitly pass `--ignore-scripts` to `npm install`. Do not add a persistent `.npmrc` setting, because that risks breaking production builds elsewhere.

### 0c. Freshness check before each regen

Four versions we pin are less than two weeks old. **Before running install, re-verify they haven't been pulled or marked malicious:**

```bash
for pkg_ver in \
  "@babel/plugin-transform-modules-systemjs@7.29.4" \
  "[email protected]" \
  "@protobufjs/utf8@1.1.1" \
  "[email protected]"; do
  pkg="${pkg_ver%@*}"
  ver="${pkg_ver##*@}"
  # If package was unpublished, this returns 404
  curl -sI "https://registry.npmjs.org/${pkg}/${ver}" | head -1
done
```

All should return a `200` status line (for example `HTTP/2 200` or `HTTP/1.1 200 OK`). A 404 means the package was unpublished; back away and revise the override.

Also check the GitHub advisory database and Socket for each, e.g.:

- https://github.com/advisories?query=protobufjs
- https://socket.dev/npm/package/protobufjs

If any of these surface advisories *against the version you're pinning* (as opposed to against older versions), come back here before proceeding.

### 0d. Verify lockfile integrity after regen

After every regen, run a signature audit:

```bash
# pnpm
pnpm audit signatures

# npm
npm audit signatures
```

Any unsigned or invalid-signature package is a red flag.

---

## §1 - PR A: stale-lockfile deletion + pnpm pin bump

**Closes ~50 alerts.
Zero runtime risk.**

### Apply

```bash
# In your tooling-gofannon checkout, on a fresh feature branch off main:
git switch -c security/pr-a-cleanup-pnpm-pin

# Delete the stale npm lockfile (32 alerts gone)
git rm webapp/packages/webui/package-lock.json

# Drop in the modified files
cp path/to/pr-bundles/pr-a/webapp/package.json webapp/package.json
cp path/to/pr-bundles/pr-a/webapp/infra/docker/Dockerfile.webui \
  webapp/infra/docker/Dockerfile.webui
```

### Regen lockfile (in container)

```bash
docker run --rm -it -v "$(pwd)":/work -w /work node:20-alpine sh

# inside:
npm install -g pnpm@10.28.2
cd webapp
pnpm install            # pnpm 10 auto-skips install scripts because onlyBuiltDependencies is []
pnpm audit signatures   # should report "0 issues"
exit
```

If pnpm complains about packages wanting to run scripts, note which ones in case you need to allowlist them for production builds (most common: `esbuild`, `@swc/core`, native build tools). For now, no allowlist is needed: installs succeed without scripts.

### Test

```bash
cd webapp
pnpm --filter webui build   # webui still builds
pnpm --filter webui test    # unit tests pass
```

Build a fresh webui image to confirm the Dockerfile changes work:

```bash
docker build \
  -f webapp/infra/docker/Dockerfile.webui \
  -t gofannon-webui-test \
  webapp/
```

### Commit & PR

```bash
git add -A
git commit -m "chore(security): remove stale package-lock.json, bump pnpm to 10.28.2

- Delete webapp/packages/webui/package-lock.json (workspace uses pnpm;
  npm lockfile was a stale artifact). Closes 32 Dependabot alerts on
  that file.
- Bump pnpm pin from ^8.0.0 to 10.28.2 (exact). Closes 18 alerts across
  webapp/package.json and pnpm-lock.yaml.
- Bump node engine from >=18.0.0 to >=20.0.0 (18 EOL, firebase requires 20).
- Pin pnpm version in Dockerfile.webui for reproducible builds."

git push origin security/pr-a-cleanup-pnpm-pin
gh pr create --base main --title "chore(security): cleanup stale lockfile + bump pnpm"
```

### Verify after merge

GitHub Dependabot rescans on each push to a branch with open alerts. Within a few minutes of merge, the Security tab should show ~50 fewer open alerts. If any of the alerts on the deleted file remain open, dismiss them manually as "fix released" (the file no longer exists).

---

## §2 - PR B: webapp direct dep bumps + transitive overrides

**Closes ~53 more alerts. Low runtime risk: minor-version bumps on axios, react-router-dom, vite. The critical protobufjs fix lands here.**

Apply *after PR A is merged*.

### Apply

```bash
git switch main && git pull
git switch -c security/pr-b-webapp-bumps

cp path/to/pr-bundles/pr-b/webapp/package.json webapp/package.json
cp path/to/pr-bundles/pr-b/webapp/packages/webui/package.json \
  webapp/packages/webui/package.json
```

### Freshness re-check (5 seconds, do this NOW)

```bash
# from §0c: re-verify the four young versions are still up and clean
for v in \
  "@babel/plugin-transform-modules-systemjs@7.29.4" \
  "[email protected]" \
  "@protobufjs/utf8@1.1.1" \
  "[email protected]"; do
  curl -sI "https://registry.npmjs.org/${v%@*}/${v##*@}" | head -1
done
```

All `200`? Proceed. Anything else? Stop and ask.

### Regen lockfile (in container)

```bash
docker run --rm -it -v "$(pwd)":/work -w /work node:20-alpine sh

# inside:
npm install -g pnpm@10.28.2
cd webapp
pnpm install
pnpm audit signatures
exit
```

If pnpm errors on override conflicts (most likely candidates: `yaml` or `minimatch`, where multiple majors are in legitimate use), the error will name the conflicting consumer. Two fixes:

1. **Narrow the override to a specific range.** Edit `webapp/package.json`:

   ```diff
   -    "yaml": "1.10.3",
   +    "yaml@^1": "1.10.3",
   ```

   This pins only v1.x consumers; v2.x consumers keep their existing pin.

2. **Drop the problematic override entirely** and let that package's alerts stay open until upstream fixes them.

### Test

This is the PR where things can break. axios 1.12→1.15 and react-router-dom 7.9→7.12 crossed minor boundaries.

```bash
cd webapp
pnpm --filter webui test     # unit tests
pnpm --filter webui build    # production build
pnpm run test:e2e || true    # E2E (may already be flaky; eyeball the failures)

# Smoke test by running the dev stack and clicking through:
./dev-tail.sh   # or however you launch the local stack
# Open http://localhost:3000, log in, run a tiny agent, hit a deployed endpoint.
```

Specific things to manually verify:

- [ ] Login flow works (axios is involved in auth requests).
- [ ] In-app navigation between routes works (react-router-dom).
- [ ] A pipeline run completes (no protobufjs/firebase regression in auth or storage paths).
- [ ] No console errors that weren't there before.

### Commit & PR

```bash
git add -A
git commit -m "chore(security): bump webapp deps + pin transitive overrides

Direct bumps:
- axios ^1.12.2 -> 1.15.2 (exact pin)
- react-router-dom ^7.9.3 -> 7.12.0 (exact pin)
- vite ^7.1.7 -> 7.3.2 (exact pin)

Pnpm overrides (closes critical protobufjs + transitive ReDoS chains):
- protobufjs 7.5.6 (critical: arbitrary code execution)
- @protobufjs/utf8 1.1.1
- minimatch 9.0.7, picomatch 4.0.4, flatted 3.4.2
- follow-redirects 1.16.0, brace-expansion 2.0.3
- yaml 1.10.3, js-yaml 4.1.1, postcss 8.5.10, ajv 6.14.0
- rollup 4.59.0, esbuild 0.25.0

All versions exact-pinned (no caret) to avoid silent uptake of
potentially-compromised patch releases. Lifecycle scripts disabled
during regen via pnpm.onlyBuiltDependencies=[]."

git push origin security/pr-b-webapp-bumps
gh pr create --base main --title "chore(security): webapp dep bumps + transitive overrides"
```

### Verify after merge

Dependabot rescan should leave only the 20 website alerts open.
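The remaining open alerts can also be tallied per manifest from the command line instead of the Security tab. This is a minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated with permission to read Dependabot alerts on the repository; the helper name `tally_by_manifest` is hypothetical.

```bash
# Hypothetical helper: read one manifest path per line on stdin and print
# "<count> <path>", most frequent first. Blank lines are ignored.
tally_by_manifest() {
  awk 'NF { count[$0]++ } END { for (p in count) print count[p], p }' | sort -rn
}

# Usage (needs network access and a token that can read Dependabot alerts):
#   gh api --paginate \
#     "repos/apache/tooling-gofannon/dependabot/alerts?state=open&per_page=100" \
#     --jq '.[].dependency.manifest_path' | tally_by_manifest
```

After PR B, every remaining path in the tally should be the website lockfile; anything under `webapp/` means an override did not take effect.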
---

## §3 - PR C: website transitive overrides

**Closes the remaining 20 alerts.** Independent of PRs A and B.

### Apply

```bash
git switch main && git pull
git switch -c security/pr-c-website-overrides

cp path/to/pr-bundles/pr-c/website/package.json website/package.json
```

### Freshness re-check

Two of the four young packages from §0c affect this manifest too (fast-uri, @babel/plugin-transform-modules-systemjs):

```bash
for v in "@babel/plugin-transform-modules-systemjs@7.29.4" "fast-uri@3.1.2"; do
  curl -sI "https://registry.npmjs.org/${v%@*}/${v##*@}" | head -1
done
```

### Regen lockfile (in container)

```bash
docker run --rm -it -v "$(pwd)":/work -w /work node:20-alpine sh

# inside:
cd website
npm install --ignore-scripts
npm audit signatures
exit
```

`--ignore-scripts` skips lifecycle hooks during this install only, with no persistent change to behavior.

### Test

```bash
cd website
npm run build   # writes website/build/
npm run serve   # serves at http://localhost:3000; eyeball the rendered docs
```

Manually check:

- [ ] Site builds without errors.
- [ ] Pages render.
- [ ] No broken images or broken internal links (Docusaurus warns about these during build).

### Commit & PR

```bash
git add -A
git commit -m "chore(security): pin website transitive deps via npm overrides

All 20 alerts on website/package-lock.json are transitive deps of
@docusaurus/core / @docusaurus/preset-classic. Pin via npm overrides
(exact versions, no caret) to close them without waiting for a
docusaurus 3.x release that bumps these:

- @babel/plugin-transform-modules-systemjs 7.29.4
- fast-uri 3.1.2, postcss 8.5.10, follow-redirects 1.16.0
- lodash 4.18.0, path-to-regexp 0.1.13, svgo 3.3.3
- serialize-javascript 7.0.5, minimatch 3.1.4, picomatch 2.3.2
- brace-expansion 1.1.13, ajv 8.18.0, qs 6.14.2

Lockfile regen used npm install --ignore-scripts to bound supply-chain
exposure during dependency resolution."

git push origin security/pr-c-website-overrides
gh pr create --base main --title "chore(security): website transitive overrides"
```

### Verify after merge

Dependabot rescan: zero open alerts. Done.

---

## §4 - Followups (not in this batch)

**Pin Python deps.** `webapp/packages/api/user-service/requirements.txt` has unpinned versions for most packages. Dependabot can't compare CVEs against unpinned ranges, so the count of 0 Python alerts is misleading. Recommend a separate PR using `pip-compile` (pip-tools) or `uv lock` to produce a pinned `requirements.txt` from a small `requirements.in`. Once pinned, Dependabot will start flagging Python CVEs the same way it does npm.

**Website publishing.** Held off per your instruction. When ready, that's a separate PR adding:

- `.asf.yaml` ghp settings
- A `.github/workflows/deploy-website.yml`
- `website/docusaurus.config.ts` updates for `apache.github.io/tooling-gofannon`

**Periodic Dependabot review cadence.** With this round done, consider a monthly Dependabot triage on the Security tab. Most alerts are noise that a 5-minute sweep of version bumps clears; the occasional critical (like the protobufjs one this round) wants attention faster.

---

## Appendix - If something goes sideways

**`pnpm install` errors with peer-dep conflict on an override.** Narrow the override range as described in §2. If you can't make it work, drop the override and accept the open alert until upstream fixes it.

**`pnpm audit signatures` reports invalid signatures.** Stop. Don't commit the lockfile. Surface the package to Carlini's GitHub thread, the npm security team (security@npmjs.com), and the gofannon project list.

**A freshness check (§0c) returns 404 or a redirect.** Package was likely unpublished. Drop the override or pin to the next-older patched version. Check the GitHub advisory database for context.
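These outcomes can be told apart mechanically by looking at the HTTP status code alone rather than the whole status line. This is a minimal sketch, assuming `curl` is available; the helper names `classify_status` and `check_version` are hypothetical, and the labels simply mirror the cases described in this appendix.

```bash
# Hypothetical helper: map an HTTP status code to a short label.
classify_status() {
  case "$1" in
    200) echo "ok" ;;
    404) echo "unpublished" ;;
    3??) echo "redirect" ;;
    *)   echo "unexpected" ;;
  esac
}

# Hypothetical wrapper: fetch only the status code for a "name@version" spec
# from the public npm registry and print the label next to the spec.
check_version() {
  spec="$1"
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    "https://registry.npmjs.org/${spec%@*}/${spec##*@}")"
  echo "${spec}: $(classify_status "$code")"
}
```

For example, `check_version "fast-uri@3.1.2"` prints the spec followed by one of the four labels; anything other than `ok` means stop and investigate.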
**Production build breaks because a package needed an install script.** pnpm 10's `onlyBuiltDependencies: []` is too aggressive for a package that needs to compile native code. Add the specific package to the allowlist:

```json
"pnpm": {
  "onlyBuiltDependencies": ["esbuild"]
}
```

Re-run `pnpm install` to apply. Only add packages whose source you trust.

**Tests pass locally but break in CI.** Likely a difference in Node version or pnpm version between your laptop and the CI runner. Make CI use the same exact versions as `engines` and `packageManager` declare.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
