He did already; see the preceding thread here on dev@.

You can estimate how much would move out of the repo from the per-version docs sizes:

9.9M ./0.6.0
 10M ./0.6.1
 10M ./0.6.2
 15M ./0.7.0
 16M ./0.7.2
 16M ./0.7.3
 20M ./0.8.0
 20M ./0.8.1
 38M ./0.9.0
 38M ./0.9.1
 38M ./0.9.2
 36M ./1.0.0
 38M ./1.0.1
 38M ./1.0.2
 48M ./1.1.0
 48M ./1.1.1
 73M ./1.2.0
 73M ./1.2.1
 74M ./1.2.2
 69M ./1.3.0
 73M ./1.3.1
 68M ./1.4.0
 70M ./1.4.1
 80M ./1.5.0
 78M ./1.5.1
 78M ./1.5.2
 87M ./1.6.0
 87M ./1.6.1
 87M ./1.6.2
 86M ./1.6.3
117M ./2.0.0
119M ./2.0.0-preview
118M ./2.0.1
118M ./2.0.2
121M ./2.1.0
121M ./2.1.1
122M ./2.1.2
122M ./2.1.3
130M ./2.2.0
131M ./2.2.1
132M ./2.2.2
131M ./2.2.3
141M ./2.3.0
141M ./2.3.1
141M ./2.3.2
142M ./2.3.3
142M ./2.3.4
145M ./2.4.0
146M ./2.4.1
145M ./2.4.2
144M ./2.4.3
145M ./2.4.4
143M ./2.4.5
143M ./2.4.6
143M ./2.4.7
143M ./2.4.8
197M ./3.0.0
185M ./3.0.0-preview
197M ./3.0.0-preview2
198M ./3.0.1
198M ./3.0.2
205M ./3.0.3
239M ./3.1.1
239M ./3.1.2
239M ./3.1.3
840M ./3.2.0
842M ./3.2.1
282M ./3.2.2
244M ./3.2.3
282M ./3.2.4
295M ./3.3.0
297M ./3.3.1
297M ./3.3.2
297M ./3.3.3
297M ./3.3.4
314M ./3.4.0
314M ./3.4.1
328M ./3.4.2
324M ./3.4.3
1.1G ./3.5.0
1.2G ./3.5.1
1.1G ./4.0.0-preview1
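
To turn a listing like the one above into a single number, here is a minimal sketch. It assumes the sizes use the binary K/M/G suffixes that `du -h` prints; the helper names (`parse_size`, `total_size`, `to_gib`) and the three-line sample are only for illustration and are not part of any Spark tooling.

```python
import re

# Binary multipliers for the unit suffixes printed by `du -h` (assumption).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> float:
    """Convert a du-style size such as '9.9M' or '1.1G' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return float(number) * UNITS[unit]


def total_size(listing: str) -> float:
    """Sum the first column of every non-empty line in a du listing."""
    return sum(
        parse_size(line.split()[0])
        for line in listing.splitlines()
        if line.strip()
    )


def to_gib(num_bytes: float) -> float:
    """Express a byte count in GiB."""
    return num_bytes / 1024**3


# Three lines copied from the listing above; paste the full listing to get
# the overall figure.
sample = """\
9.9M ./0.6.0
 10M ./0.6.1
1.1G ./3.5.0
"""

print(f"{to_gib(total_size(sample)):.2f} GiB")
```

Since `du -h` rounds each entry, the sum is only an approximation of the on-disk total.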

On Mon, Aug 12, 2024 at 5:16 PM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi Kent,
>
> Can you, if possible, provide a heuristic estimate of the space reduction
> your proposal is going to achieve?
>
> Thanks
>
> Mich Talebzadeh,
>
> Architect | Data Engineer | Data Science | Financial Crime
> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
> London <https://en.wikipedia.org/wiki/Imperial_College_London>
> London, United Kingdom
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, "one test result is worth one-thousand
> expert opinions" (Wernher von Braun
> <https://en.wikipedia.org/wiki/Wernher_von_Braun>).
>
>
> On Mon, 12 Aug 2024 at 14:55, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hello,
>>
>> On the face of it, this email contains many references, making it
>> difficult to follow. Perhaps a simpler explanation could improve voting
>> participation.
>>
>> The STAR methodology can be helpful in understanding and evaluating this
>> proposal. STAR stands for Situation, Task, Action, Result.
>>
>> Let us have a look at this
>>
>> *S*ituation:
>>
>>    - The Spark website repository is reaching its storage limit on
>>    GitHub-hosted runners.
>>
>> *T*ask:
>>
>>    - Reduce storage usage without compromising access to documentation.
>>
>> *A*ction (proposed):
>>
>>    - Move documentation releases from the dev directory to the
>>    release directory within the Apache distribution.
>>    - Leverage the Apache Archives service to create permanent links for
>>    the documentation.
>>    - Upload older website-hosted documentation manually via SVN.
>>    - Optionally, delete old documentation and update links/use
>>    redirection as needed.
>>
>> *R*esult:
>>
>>    - Reduced storage usage on GitHub-hosted runners.
>>    - Permanent, publicly accessible links for Spark documentation via
>>    the Apache Archives.
>>    - Potential need for manual upload of older documentation and link
>>    updates.
>>
>>
>> Consider including an estimated storage reduction achieved through this
>> approach.
>> Overall, the proposal offers a viable solution for managing Spark
>> documentation while reducing storage concerns. However, addressing the
>> potential complexity of managing older documentation versions is crucial.
>>
>> +1 for me
>>
>> Mich Talebzadeh,
>>
>>
>> On Mon, 12 Aug 2024 at 10:09, Kent Yao <y...@apache.org> wrote:
>>
>>> Archive Spark Documentations in Apache Archives
>>>
>>> Hi dev,
>>>
>>> To address the issue of the Spark website repository size
>>> reaching the storage limit for GitHub-hosted runners [1], I suggest
>>> enhancing step [2] in our release process by relocating the
>>> documentation releases from the dev[3] directory to the release
>>> directory[4]. Then it would be captured by the Apache Archives
>>> service[5] to create permanent links, which would be alternative
>>> endpoints for our documentation, like
>>>
>>>
>>> https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/_site/index.html
>>> for
>>> https://spark.apache.org/docs/3.5.2/index.html
>>>
>>> Note that the previous example still uses the staging repository,
>>> which will become
>>> https://archive.apache.org/dist/spark/docs/3.5.2/index.html.
>>>
>>> For older releases hosted on the Spark website [6], we also need to
>>> upload them via SVN manually.
>>>
>>> After that, when we reach the threshold again, we can delete some of
>>> the old ones on page [6], and update their links on page [7] or use
>>> redirection.
>>>
>>> JIRA ticket: https://issues.apache.org/jira/browse/SPARK-49209
>>>
>>> Please vote on the idea of archiving Spark documentation in the
>>> Apache Archives for the next 72 hours:
>>>
>>> [ ] +1: Accept the proposal
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>> Bests,
>>> Kent Yao
>>>
>>> [1] https://lists.apache.org/thread/o0w4gqoks23xztdmjjj26jkp1yyg2bvq
>>> [2]
>>> https://spark.apache.org/release-process.html#upload-to-apache-release-directory
>>> [3] https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/
>>> [4] https://dist.apache.org/repos/dist/release/spark/docs/3.5.2
>>> [5] https://archive.apache.org/dist/spark/
>>> [6] https://github.com/apache/spark-website/tree/asf-site/site/docs
>>> [7] https://spark.apache.org/documentation.html
>>>
>>>
>>>
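
The URL mapping in Kent's proposal (a page under spark.apache.org/docs/ getting an alternative endpoint under archive.apache.org/dist/spark/docs/) can be sketched as below. This is only an illustration of the pattern given in the quoted example; the function name `archive_url` and the constants are hypothetical, and whether every version ends up under the same path layout is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts and prefix taken from the example URLs in the proposal.
SITE_HOST = "spark.apache.org"
ARCHIVE_HOST = "archive.apache.org"
ARCHIVE_PREFIX = "/dist/spark"  # archived docs are assumed to keep the /docs/<version>/ layout


def archive_url(site_url: str) -> str:
    """Map a Spark website docs URL to its Apache Archives counterpart."""
    parts = urlsplit(site_url)
    if parts.netloc != SITE_HOST or not parts.path.startswith("/docs/"):
        raise ValueError(f"not a Spark docs URL: {site_url!r}")
    return urlunsplit(
        (parts.scheme, ARCHIVE_HOST, ARCHIVE_PREFIX + parts.path,
         parts.query, parts.fragment)
    )


print(archive_url("https://spark.apache.org/docs/3.5.2/index.html"))
# prints https://archive.apache.org/dist/spark/docs/3.5.2/index.html
```

A redirect rule for deleted versions could apply the same rewrite, provided the archived copies really are published at those paths.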
