He did already; see the preceding thread here on dev@. You can estimate how much would move out of the repo from the per-version docs sizes:

9.9M ./0.6.0
10M ./0.6.1
10M ./0.6.2
15M ./0.7.0
16M ./0.7.2
16M ./0.7.3
20M ./0.8.0
20M ./0.8.1
38M ./0.9.0
38M ./0.9.1
38M ./0.9.2
36M ./1.0.0
38M ./1.0.1
38M ./1.0.2
48M ./1.1.0
48M ./1.1.1
73M ./1.2.0
73M ./1.2.1
74M ./1.2.2
69M ./1.3.0
73M ./1.3.1
68M ./1.4.0
70M ./1.4.1
80M ./1.5.0
78M ./1.5.1
78M ./1.5.2
87M ./1.6.0
87M ./1.6.1
87M ./1.6.2
86M ./1.6.3
117M ./2.0.0
119M ./2.0.0-preview
118M ./2.0.1
118M ./2.0.2
121M ./2.1.0
121M ./2.1.1
122M ./2.1.2
122M ./2.1.3
130M ./2.2.0
131M ./2.2.1
132M ./2.2.2
131M ./2.2.3
141M ./2.3.0
141M ./2.3.1
141M ./2.3.2
142M ./2.3.3
142M ./2.3.4
145M ./2.4.0
146M ./2.4.1
145M ./2.4.2
144M ./2.4.3
145M ./2.4.4
143M ./2.4.5
143M ./2.4.6
143M ./2.4.7
143M ./2.4.8
197M ./3.0.0
185M ./3.0.0-preview
197M ./3.0.0-preview2
198M ./3.0.1
198M ./3.0.2
205M ./3.0.3
239M ./3.1.1
239M ./3.1.2
239M ./3.1.3
840M ./3.2.0
842M ./3.2.1
282M ./3.2.2
244M ./3.2.3
282M ./3.2.4
295M ./3.3.0
297M ./3.3.1
297M ./3.3.2
297M ./3.3.3
297M ./3.3.4
314M ./3.4.0
314M ./3.4.1
328M ./3.4.2
324M ./3.4.3
1.1G ./3.5.0
1.2G ./3.5.1
1.1G ./4.0.0-preview1

On Mon, Aug 12, 2024 at 5:16 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Kent,
>
> Can you if possible provide a heuristic estimate of space reduction your
> proposal is going to achieve?
>
> Thanks
>
> Mich Talebzadeh,
>
> Architect | Data Engineer | Data Science | Financial Crime
> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
> London <https://en.wikipedia.org/wiki/Imperial_College_London>
> London, United Kingdom
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
> Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>
> On Mon, 12 Aug 2024 at 14:55, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hello,
>>
>> On the face of it, this email contains many references, making it
>> difficult to follow. Perhaps, a simpler explanation could improve voting
>> participation.
>>
>> The STAR methodology can be helpful in understanding and evaluating this
>> proposal. STAR stands for Situation, Task, Action, Result.
>>
>> Let us have a look at this
>>
>> *S*ituation:
>>
>>    - The Spark website repository is reaching its storage limit on
>>    GitHub-hosted runners.
>>
>> *T*ask:
>>
>>    - Reduce storage usage without compromising access to documentation.
>>
>> *A*ction: (proposed)
>>
>>    - Move documentation releases from the dev directory to the
>>    release directory within the Apache distribution.
>>    - Leverage the Apache Archives service to create permanent links for
>>    the documentation.
>>    - Upload older website-hosted documentation manually via SVN.
>>    - Optionally, delete old documentation and update links/use
>>    redirection as needed.
>>
>> *R*esult:
>>
>>    - Reduced storage usage on GitHub-hosted runners.
>>    - Permanent, publicly accessible links for Spark documentation via
>>    the Apache Archives.
>>    - Potential need for manual upload of older documentation and link
>>    updates.
>>
>> Consider including an estimated storage reduction achieved through this
>> approach.
>> Overall, the proposal offers a viable solution for managing Spark
>> documentation while reducing storage concerns. However, addressing the
>> potential complexity of managing older documentation versions is crucial.
>>
>> +1 for me
>>
>> Mich Talebzadeh
>>
>> On Mon, 12 Aug 2024 at 10:09, Kent Yao <y...@apache.org> wrote:
>>
>>> Archive Spark Documentations in Apache Archives
>>>
>>> Hi dev,
>>>
>>> To address the issue of the Spark website repository size
>>> reaching the storage limit for GitHub-hosted runners [1], I suggest
>>> enhancing step [2] in our release process by relocating the
>>> documentation releases from the dev [3] directory to the release
>>> directory [4]. Then it would be captured by the Apache Archives
>>> service [5] to create permanent links, which would be alternative
>>> endpoints for our documentation, like
>>>
>>> https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/_site/index.html
>>> for
>>> https://spark.apache.org/docs/3.5.2/index.html
>>>
>>> Note that the previous example still uses the staging repository,
>>> which will become
>>> https://archive.apache.org/dist/spark/docs/3.5.2/index.html.
>>>
>>> For older releases hosted on the Spark website [6], we also need to
>>> upload them via SVN manually.
>>>
>>> After that, when we reach the threshold again, we can delete some of
>>> the old ones on page [6], and update their links on page [7] or use
>>> redirection.
>>>
>>> JIRA ticket: https://issues.apache.org/jira/browse/SPARK-49209
>>>
>>> Please vote on the idea of Archive Spark Documentations in
>>> Apache Archives for the next 72 hours:
>>>
>>> [ ] +1: Accept the proposal
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>> Bests,
>>> Kent Yao
>>>
>>> [1] https://lists.apache.org/thread/o0w4gqoks23xztdmjjj26jkp1yyg2bvq
>>> [2] https://spark.apache.org/release-process.html#upload-to-apache-release-directory
>>> [3] https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/
>>> [4] https://dist.apache.org/repos/dist/release/spark/docs/3.5.2
>>> [5] https://archive.apache.org/dist/spark/
>>> [6] https://github.com/apache/spark-website/tree/asf-site/site/docs
>>> [7] https://spark.apache.org/documentation.html
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
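
To turn the per-version listing at the top of this message into a rough total, a short script can sum the sizes. This is only a minimal sketch and not part of the proposal: it assumes the listing is `du -sh`-style output with K/M/G suffixes in powers of 1024, and the names `parse_du_line`, `total_mib`, and `SAMPLE` are made up for illustration. `SAMPLE` holds just a four-line excerpt, so paste in the full listing to get a real figure.

```python
import re

# Multipliers to MiB, assuming `du -h`-style suffixes (powers of 1024).
_UNIT_TO_MIB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}

# One listing line: a size such as "9.9M" or "1.1G", then a path.
_LINE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)([KMG])\s+(\S+)\s*$")


def parse_du_line(line):
    """Return (path, size_in_mib) for one `du -sh`-style line."""
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized du line: {line!r}")
    number, unit, path = match.groups()
    return path, float(number) * _UNIT_TO_MIB[unit]


def total_mib(listing, keep=()):
    """Sum sizes in MiB, skipping any path named in `keep`."""
    total = 0.0
    for line in listing.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        path, size = parse_du_line(line)
        if path not in keep:
            total += size
    return total


# Excerpt of the listing from this thread (hypothetical variable name).
SAMPLE = """\
9.9M ./0.6.0
10M ./0.6.1
1.1G ./3.5.0
1.2G ./3.5.1
"""

if __name__ == "__main__":
    everything = total_mib(SAMPLE)
    without_latest = total_mib(SAMPLE, keep={"./3.5.1"})
    print(f"{everything:.1f} MiB across all listed versions")
    print(f"{without_latest:.1f} MiB could move if ./3.5.1 stays in the repo")
```

The `keep` argument is there because the thread does not say which versions would stay on the website; pass whichever paths are expected to remain. Since `du -h` rounds each entry, the sum is an approximation rather than an exact on-disk figure.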