On Tue, 22 Dec 2020 at 02:08, Greg Stein <[email protected]> wrote:

> On Mon, Dec 21, 2020 at 4:03 AM Daniel Shahaf <[email protected]>
> wrote:
>
>> Daniel Sahlberg wrote on Mon, 21 Dec 2020 08:55 +0100:
>> > On Fri, 27 Nov 2020 at 19:26, Daniel Shahaf <[email protected]> wrote:
>> >
>> > > Sounds good. Nathan, Daniel Sahlberg: could you work with Infra on
>> > > getting the data over to ASF hardware?
>> >
>> > I have been given access to svn-qavm and uploaded a tarball of the
>> > website (including mboxes). I'm a bit reluctant to unpack it since it
>> > takes almost 7 GB, and there is only 14 GB disk space remaining. Is it
>> > ok to unpack or should we ask Infra for more disk space?
>>
>> I vote to ask for more disk space, especially considering that some
>> percentage is reserved for uid=0's use.
>>
>
> DSahlberg hit up Infra on #asfinfra on the-asf.slack.com, and asked for
> more space. That's been provisioned now.
>
I've unpacked in /home/dsahlberg/svnhaxx

>
>...
>
>> > The mboxes will be preserved but I don't plan to make them available for
>> > download (since they are not available from lists.a.o or
>> > mail-archives.a.o).
>>
>> Please do make them available for download. Being able to download the
>> raw data is useful for both backup and perusal purposes, and I doubt
>> the bandwidth requirements would be a problem. (Might want
>> a robots.txt entry, though?)
>>
>
> Bandwidth should not be a problem for the mboxes, but yes: a robots.txt
> would be nice. I think search engines spidering the static email pages
> might be useful to the community, but the spiders really shouldn't need/use
> the mboxes.
>

I'll figure out a way to have the mboxes downloadable. If I understand
Google's robots.txt documentation correctly, a URL that is linked from
an indexable page can still be indexed even when robots.txt disallows
it. Maybe just make one big tarball of everything?

> I think the first thing is to get httpd up and running with the desired
> configuration. Then step two will be to memorialize that into puppet. Infra
> can assist with the latter. I saw on Slack that Humbedooh gave you a link
> to explore.
>

Since I haven't got root, I can't install httpd on my own. I couldn't
figure out puppet either; the link returned a 404 for me. I've created
a request in Jira and I hope someone will take a look:
https://issues.apache.org/jira/browse/INFRA-21230

Kind regards,
Daniel
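As a rough illustration of the robots.txt idea discussed above, here is a minimal sketch that checks which URLs a well-behaved crawler would be allowed to fetch. The `/mboxes/` path, the example.org host and the file names are assumptions made up for the example, not the actual layout of the site, and `is_crawl_allowed` is a hypothetical helper. Note that this only models crawling; it says nothing about whether a search engine might still list a disallowed URL it found through a link.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the /mboxes/ path is an assumption,
# not the real directory layout of the archive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mboxes/
"""


def is_crawl_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if ROBOTS_TXT permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Raw mbox download (hypothetical URL): should be disallowed.
    print(is_crawl_allowed("https://example.org/mboxes/dev-2020-12.mbox"))
    # Static archive page (hypothetical URL): should stay crawlable.
    print(is_crawl_allowed("https://example.org/dev/archive-2020-12/0001.shtml"))
```

With a rule like this the static email pages remain open to spiders while the raw mboxes are excluded from crawling, which matches the split suggested in the thread.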

