Re: svn.haxx.se is going away

Daniel Sahlberg Thu, 24 Dec 2020 11:38:42 -0800

On Tue, 22 Dec 2020 at 02:08, Greg Stein <[email protected]> wrote:

> On Mon, Dec 21, 2020 at 4:03 AM Daniel Shahaf <[email protected]>
> wrote:
>
>> Daniel Sahlberg wrote on Mon, 21 Dec 2020 08:55 +0100:
>> > On Fri, 27 Nov 2020 at 19:26, Daniel Shahaf <[email protected]> wrote:
>> >
>> > > Sounds good.  Nathan, Daniel Sahlberg — could you work with Infra on
>> > > getting the data over to ASF hardware?
>> >
>> > I have been given access to svn-qavm and uploaded a tarball of the
>> > website (including mboxes). I'm a bit reluctant to unpack it since it
>> > takes almost 7 GB, and there is only 14 GB disk space remaining. Is it ok
>> > to unpack or should we ask Infra for more disk space?
>>
>> I vote to ask for more disk space, especially considering that some
>> percentage is reserved for uid=0's use.
>>
>
> DSahlberg hit up Infra on #asfinfra on the-asf.slack.com, and asked for
> more space. That's been provisioned now.
>


I've unpacked the tarball in /home/dsahlberg/svnhaxx.


> >...
>
>> > The mboxes will be preserved but I don't plan to make them available for
>> > download (since they are not available from lists.a.o or
>> > mail-archives.a.o).
>>
>> Please do make them available for download.  Being able to download the
>> raw data is useful for both backup and perusal purposes, and I doubt
>> the bandwidth requirements would be a problem.  (Might want
>> a robots.txt entry, though?)
>>
>
> Bandwidth should not be a problem for the mboxes, but yes: a robots.txt
> would be nice. I think search engines spidering the static email pages
> might be useful to the community, but the spiders really shouldn't need/use
> the mboxes.
>

I'll figure out a way to make the mboxes downloadable. If I understand
Google's documentation of robots.txt correctly, a robots.txt entry does not
keep a URL out of the index: if a specific URL is linked from somewhere
indexable, they will index it anyway.
Maybe just make one big tarball of everything?
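
To make the robots.txt idea concrete, here is a minimal sketch using
Python's standard urllib.robotparser. It assumes the mboxes end up under
a hypothetical /mboxes/ path and the static email pages elsewhere; none
of these paths are decided yet. It only shows what a well-behaved crawler
would be allowed to fetch, and says nothing about whether a linked URL
still ends up indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep spiders away from the raw mboxes only.
# The /mboxes/ prefix is an assumption, not the real layout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mboxes/
"""


def is_crawlable(path, agent="*"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Both paths below are made-up examples.
    for path in ("/mboxes/dev-2020-12.mbox", "/dev/archive-2020-12/0001.shtml"):
        print(path, is_crawlable(path))
```

With a rule like this, the static pages stay open to spiders while the
mboxes are excluded from crawling.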


> I think the first thing is to get httpd up and running with the desired
> configuration. Then step two will be to memorialize that into puppet. Infra
> can assist with the latter. I saw on Slack that Humbedooh gave you a link
> to explore.
>

Since I haven't got root, I can't get any further with installing httpd on
my own. I couldn't figure out puppet; the link returned a 404 for me. I've
created a request in Jira and I hope someone will take a look:
https://issues.apache.org/jira/browse/INFRA-21230

Kind regards,
Daniel
