k-krawczyk commented on PR #1666:
URL: https://github.com/apache/camel-website/pull/1666#issuecomment-4730733995

   @davsclaus good call. I measured it against the live site rather than 
guessing:
   
   - ~5,570 doc pages in the sitemap (already spanning multiple versions: 
`next`, `4.18.x`, `4.14.x`, …)
   - average `.md` ~40–86 KB (component pages with big option tables are the 
large ones)
   - zip ratio measured on a 12-page sample: **~26.5%** of the original size
   
   So the uncompressed total is ~250–480 MB (that's the ~500 MB you expected), 
and the **`.zip` lands around ~70–130 MB**. Plain `zip` compresses each file 
independently, so the cross-version duplication doesn't shrink it much. A 
solid archive can be smaller, but only as far as the compressor's window 
reaches the duplicates (gzip's window is 32 KB, so `.tar.gz` sees little of 
it). If the on-disk build keeps more versions than the sitemap exposes, the 
size would scale up proportionally.
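
   As a rough check of the arithmetic above, here is a minimal Python sketch 
that recomputes both ranges from the quoted figures. It assumes 1 MB = 1000 KB 
and treats the 40–86 KB range as low/high per-page averages; the function and 
constant names are illustrative only. The low ends it prints come out slightly 
below the rounded figures quoted above.

```python
# Back-of-envelope size estimate from the figures quoted in this comment.
PAGES = 5_570        # doc pages counted in the sitemap
AVG_KB = (40, 86)    # low/high average .md size in KB (assumed bounds)
ZIP_RATIO = 0.265    # compressed size / original size, from the 12-page sample


def estimate_mb(pages: int, avg_kb: float, ratio: float = 1.0) -> float:
    """Return the estimated total size in MB, assuming 1 MB = 1000 KB."""
    return pages * avg_kb * ratio / 1000


raw = [estimate_mb(PAGES, kb) for kb in AVG_KB]
zipped = [estimate_mb(PAGES, kb, ZIP_RATIO) for kb in AVG_KB]

print(f"uncompressed: {raw[0]:.0f}-{raw[1]:.0f} MB")
print(f"zip:          {zipped[0]:.0f}-{zipped[1]:.0f} MB")
```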
   
   So I agree it's too big to commit into `public/` and redeploy on every 
change. Options:
   1. Build it only on release (or a scheduled job) and publish it as a 
**GitHub Release asset**, with `llms.txt` pointing at that URL. The project 
already consumes release binaries via the `github-release-binary` yarn plugin, 
so this fits the existing distribution model.
   2. Ship a solid archive (`.tar.gz`, or `.tar.xz` for a larger window) 
instead of `.zip`; the saving needs measuring before we rely on it.
   3. Split into smaller per-area bundles.
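
   To compare archive formats before picking one, a sketch like the following 
could be run against a local copy of the generated Markdown. It uses only the 
Python standard library; `archive_sizes` and the directory passed to it are 
hypothetical names, not part of this PR, and the result depends on file order 
and compressor settings.

```python
import io
import tarfile
import zipfile
from pathlib import Path


def archive_sizes(root: Path, pattern: str = "*.md") -> dict[str, int]:
    """Return byte sizes of the raw files and of in-memory archives of them."""
    files = sorted(p for p in root.rglob(pattern) if p.is_file())
    sizes = {"raw": sum(p.stat().st_size for p in files)}

    # zip: every member is deflated on its own, so duplicates across files
    # are not shared.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for p in files:
            zf.write(p, p.relative_to(root).as_posix())
    sizes["zip"] = len(buf.getvalue())

    # Solid archives: one compressed stream over the whole tar.
    for name, mode in (("tar.gz", "w:gz"), ("tar.xz", "w:xz")):
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode=mode) as tf:
            for p in files:
                tf.add(p, arcname=p.relative_to(root).as_posix())
        sizes[name] = len(buf.getvalue())

    return sizes
```

Calling `archive_sizes(Path("some/docs/dir"))` returns a dict keyed by `raw`, 
`zip`, `tar.gz` and `tar.xz`, which makes the trade-off between formats easy 
to read off for a given sample.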
   
   I'm happy to rework this PR towards (1). Which distribution mechanism do you 
prefer?
   
   _Reported by Claude Code on behalf of Karol Krawczyk_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
