moonming opened a new pull request, #2055: URL: https://github.com/apache/apisix-website/pull/2055
## Summary

Removes long-standing sitemap bloat from outdated sub-project documentation. The EN sitemap currently carries ~800 sub-project doc URLs, including ancient versions (ingress-controller `0.4.0`–`2.0.0`, docker `apisix-2.10.x`) and ~80 thin `/tags/` pages, while the main APISIX docs are clean.

Two root causes:

1. **`scripts/sync-docs.js`** built *every* release of each sub-project. apisix is curated via `config/apisix-versions.js`; the sub-projects weren't. This change keeps only the **newest** released version of each (`SUBPROJECT_VERSIONS_TO_KEEP`, default `1`). The latest version is served unversioned at `/docs/<project>/` and indexed; `next` remains robots-disallowed. Older versions stay available in each project's source repo (git tags).
2. **`scripts/update-sitemap-loc.js`**: the version-exclusion regex only matched 2-part versions (`/docs/apisix/3.14/`). It missed 3-part semver (`/docs/ingress-controller/2.0.0/`) and prefixed versions (`/docs/docker/apisix-2.10.0/`), which is exactly why sub-project versioned docs leaked into the sitemap. Broadened to cover all three forms.

## ⚠️ Behavior change: please confirm

This **removes older sub-project doc versions from the published site** (they remain in each source repo via git tags). Intentional per discussion, but a content-availability change maintainers should sign off on. `SUBPROJECT_VERSIONS_TO_KEEP` can be raised to keep a wider window; the sitemap regex now handles >1 correctly.

## Test plan

- [ ] CI build passes (`yarn build`), including the doc sync step.
- [ ] After build, each sub-project has a single versioned tree + `next`; `/docs/ingress-controller/`, `/docs/docker/`, etc. serve the latest version.
- [ ] `website` / `doc` / `blog` `sitemap.xml` no longer contains `/docs/<project>/<old-version>/` URLs (the "Filtered out N URLs" log should rise).
- [ ] Spot-check: `/docs/ingress-controller/2.0.0/...` and `/docs/docker/apisix-2.10.0/...` are absent from the sitemap.
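
For reviewers who want to reason about the three version shapes named in root cause 2, here is a minimal sketch of a broadened exclusion pattern. It is not the regex from this PR: `VERSIONED_DOC_PATH` and `isVersionedDocPath` are hypothetical names, and the pattern assumes the version is always the path segment directly after `/docs/<project>/`.

```javascript
// Hypothetical sketch, NOT the pattern in scripts/update-sitemap-loc.js.
// Matches /docs/<project>/<version>/ where <version> is one of:
//   2-part:   3.14
//   3-part:   2.0.0
//   prefixed: apisix-2.10.0 or apisix-2.10.x
const VERSIONED_DOC_PATH =
  /\/docs\/[^/]+\/(?:[a-z][a-z0-9]*-)?v?\d+\.\d+(?:\.(?:\d+|x))?(?:\/|$)/;

// Returns true when the path points at a versioned doc tree
// and should therefore be dropped from the sitemap.
function isVersionedDocPath(pathname) {
  return VERSIONED_DOC_PATH.test(pathname);
}

// Illustrative paths built from the shapes quoted in the description.
const samplePaths = [
  '/docs/apisix/3.14/getting-started/',
  '/docs/ingress-controller/2.0.0/deployments/',
  '/docs/docker/apisix-2.10.0/example/',
  '/docs/ingress-controller/getting-started/',
  '/docs/docker/',
];

const keptPaths = samplePaths.filter((p) => !isVersionedDocPath(p));
console.log(`Filtered out ${samplePaths.length - keptPaths.length} URLs`);
```

Requiring the version to end at a `/` or at the end of the string keeps ordinary slugs such as `getting-started` from matching, at the cost of missing any version format not listed in the comment.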
## Verification done locally

The regex was validated against real URL shapes: 2-part, 3-part, and docker-prefixed versions are excluded, while latest/unversioned and non-version doc paths are kept. The version slice and both scripts' syntax were checked. Full doc-sync + build verification is delegated to CI (it clones the sub-project repos).
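
As a rough illustration of the version slice, the sketch below keeps the newest `SUBPROJECT_VERSIONS_TO_KEEP` releases from a list of tags. It is an assumption about the approach, not the code in `scripts/sync-docs.js`: `parseVersion`, `compareDesc`, and `newestVersions` are hypothetical helpers, and the sketch assumes the tag list arrives unsorted.

```javascript
// Hypothetical sketch; the real scripts/sync-docs.js may select versions differently.
const SUBPROJECT_VERSIONS_TO_KEEP = 1;

// Extract numeric parts from tags such as "2.0.0", "v1.8.4" or "apisix-2.10.0".
// A missing third part is treated as 0; tags with no version return null.
function parseVersion(tag) {
  const match = /(\d+)\.(\d+)(?:\.(\d+))?/.exec(tag);
  if (!match) return null;
  return [match[1], match[2], match[3] ?? '0'].map(Number);
}

// Order two parsed versions newest first, comparing part by part as numbers
// so that 2.10.0 sorts ahead of 2.9.0.
function compareDesc(a, b) {
  for (let i = 0; i < 3; i += 1) {
    if (a[i] !== b[i]) return b[i] - a[i];
  }
  return 0;
}

// Return the newest `keep` tags, dropping anything that does not parse.
function newestVersions(tags, keep = SUBPROJECT_VERSIONS_TO_KEEP) {
  return tags
    .map((tag) => ({ tag, parts: parseVersion(tag) }))
    .filter((entry) => entry.parts !== null)
    .sort((x, y) => compareDesc(x.parts, y.parts))
    .slice(0, keep)
    .map((entry) => entry.tag);
}
```

Calling `newestVersions(tags, 2)` corresponds to raising `SUBPROJECT_VERSIONS_TO_KEEP` to keep a wider window.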
