Nathan Hartman wrote on Tue, 24 Nov 2020 21:27 +00:00:
> On Tue, Nov 24, 2020 at 2:56 AM Daniel Sahlberg 
> <daniel.l.sahlb...@gmail.com> wrote:
> > Den tors 12 nov. 2020 kl 17:46 skrev Daniel Sahlberg 
> > <daniel.l.sahlb...@gmail.com>:
> >> Could ASF provide this server space (basically a VirtualHost)? The archive 
> >> is about 6.5 GB so it is not a huge amount.
> > 
> > Any thoughts on this?
> 
> I am looking into this; waiting for a reply...

In the circumstances — it's Nov 25 and the site says it'll be taken down
"in November 2020", not specifying a date — I'd say, better ask
forgiveness than permission.  Let's go ahead and grab all the data we
need to stand up the site (we have the mboxes, but not the mapping of
*.shtml files to message-id's, nor any of the HTML/CSS/images), and if
possible, also set it up (on svn-qavm.a.o or wherever) to ensure we've
got everything and to prepare for a DNS repointing, if Daniel agrees.
We can figure out the "paperwork", Puppet PRs, etc., later.

I'd say the highest priority is to save the mapping of .shtml URLs to
message-id's (which are available as comments in the source HTML),
whether via a recursive wget(1) invocation, or by asking Daniel to run
an appropriate grep, or however else.  Without that info, we won't be
able to preserve old URLs.
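
For the extraction step, something along these lines might do once a
local mirror of the pages exists.  It's only a sketch: it assumes the
message-id sits in a hypermail-style comment of the form
<!-- id="..." -->, which I haven't checked against the live pages, and
the names extract_message_id and build_mapping are made up for the
illustration.

```python
import re
from pathlib import Path

# ASSUMPTION: each archived page carries its message-id in a comment that
# looks like <!-- id="some-id@example.org" -->.  Adjust the pattern if the
# real pages use a different comment format.
ID_COMMENT = re.compile(r'<!--\s*id="([^"]+)"\s*-->')


def extract_message_id(html):
    """Return the message-id found in the page source, or None."""
    match = ID_COMMENT.search(html)
    return match.group(1) if match else None


def build_mapping(mirror_root):
    """Map each *.shtml path (relative to mirror_root) to its message-id.

    Pages without a recognisable id comment (index pages, for example)
    are skipped.
    """
    root = Path(mirror_root)
    mapping = {}
    for page in sorted(root.rglob("*.shtml")):
        # The pages' encoding is unknown here, so undecodable bytes are
        # replaced rather than allowed to abort the run.
        text = page.read_text(encoding="utf-8", errors="replace")
        message_id = extract_message_id(text)
        if message_id is not None:
            mapping[page.relative_to(root).as_posix()] = message_id
    return mapping
```

Calling build_mapping() on the top-level directory of the mirror gives
a dict that can be dumped to a flat file and kept next to the mboxes.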

Maybe there's also a button we can press to sic the archive.org spider
on svn.haxx.se.

(We can't derive the message<->.shtml mapping from the mboxes we have.
I only grabbed mboxes through the transition to ASF; for anything after
that point, the order of .shtml files would be the order in which list
mails reached haxx.se's MX, and we have no backups of that info.)

Cheers,

Daniel

P.S.  Yes, it's a bit https://m.xkcd.com/2337/ of me to refer to both
      Daniel and Daniel as "Daniel". :)
