Jon Lang wrote:
Please do.  Anything that makes the list archives more accessible is a
good thing IMHO.

Hello dataweaver :)

I managed to recreate the mbox format ... at least partly.

About recreating the mailbox format for the old digest files:

There are additional separator lines (empty ones and ones filled with "-" signs) plus a header and a footer part from the digest. Those things can be removed easily:

grep -v '^------------------------------$' <INPUTFILE> | awk '/^In this issue:/,/^or GURPSnet-Digest mailing lists./{next}/^End of GURPSnet-Digest/,/in the commands above with/{next}/^$/{D=1}/^----------------------------------------------------------------------$/{D=2}/^$/{D=0; next}{if (D == 0) print}' > <OUTPUTFILE>
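For readers who want to follow what the one-liner does, here is a simplified sketch of the same idea, spread over several lines. It is not a drop-in replacement: it only drops the digest preamble, the digest footer and the two kinds of dash rules, and it keeps blank lines. The function name strip_digest_wrapper is made up for this example, and the rule lengths (30 and 70 dashes) are an assumption taken from the patterns above.

```shell
# Hypothetical helper: read one digest on stdin, write the messages to stdout.
strip_digest_wrapper() {
    awk '
        # Build a line of n dashes (assumed rule lengths: 30 and 70).
        function dashes(n,    s) { s = ""; while (n-- > 0) s = s "-"; return s }
        BEGIN { rule30 = dashes(30); rule70 = dashes(70) }

        # Range patterns: skip from the first matching line through the
        # second one, inclusive. If the end never matches, the range
        # runs to the end of the input.
        /^In this issue:/, /^or GURPSnet-Digest mailing lists\./ { next }
        /^End of GURPSnet-Digest/, /in the commands above with/ { next }

        # Drop the separator rules between messages.
        $0 == rule30 || $0 == rule70 { next }

        # Everything else is message text.
        { print }
    '
}

# Usage (file names are placeholders):
#   strip_digest_wrapper < digest.txt > messages.txt
```

Comparing against generated dash strings instead of typing the rules out avoids miscounting the dashes.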


But the "From " separator line that starts each message in an mbox file is missing ...

It could be recreated from the "From:" line and the "Date:" line, but neither is in the right format :-(

The "From " line requires the *pure* mail address, without any full name or similar, and also a date string in a different format than the one in the "Date:" line :-/
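To make the target format concrete, here is a small sketch that builds such a "From " line from the values of the two headers. The function name mbox_from_line and the example addresses are hypothetical. It assumes the address is either inside <...> or is the first token containing "@", and that the date looks like "Mon, 3 Mar 2003 12:34:56 +0100" with an optional weekday. When the weekday is missing it falls back to "Mon", the same placeholder used in this mail.

```shell
# Hypothetical helper.
# usage: mbox_from_line "VALUE OF From: HEADER" "VALUE OF Date: HEADER"
mbox_from_line() {
    awk -v from="$1" -v date="$2" 'BEGIN {
        # Address: prefer the part inside <...>, else the first token with "@".
        addr = ""
        if (match(from, /<[^>]+>/)) {
            addr = substr(from, RSTART + 1, RLENGTH - 2)
        } else {
            n = split(from, tok, " ")
            for (i = 1; i <= n; i++)
                if (tok[i] ~ /@/) { addr = tok[i]; break }
        }
        if (addr == "") exit 1

        # Date: drop the comma, then reorder the fields.
        gsub(/,/, "", date)
        split(date, f, " ")
        if (f[1] ~ /^[A-Za-z]/) {
            wd = f[1]; day = f[2]; mon = f[3]; year = f[4]; time = f[5]
        } else {
            wd = "Mon"      # placeholder, weekday not given
            day = f[1]; mon = f[2]; year = f[3]; time = f[4]
        }

        # mbox style: From ADDRESS Www Mmm dd hh:mm:ss yyyy
        printf "From %s %s %s %2d %s %s\n", addr, wd, mon, day, time, year
    }'
}

# mbox_from_line 'Some Name <someone@example.org>' 'Mon, 3 Mar 2003 12:34:56 +0100'
# prints: From someone@example.org Mon Mar  3 12:34:56 2003
```

The time zone is dropped here, and a time without seconds would be passed through unchanged, so real data may need more care.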

Also, the "Date:" line sometimes does not contain all the required information: the name of the weekday can be missing. This data cannot be restored correctly without consulting a calendar application first.
Too much work, so I just used "Mon" for all those mails.
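As an aside, the weekday can be derived from day, month and year alone, without an external calendar. Below is a minimal sketch using Sakamoto's day-of-week method in awk. This is not what the commands in this mail do, and the function name weekday_name is made up for the example. It assumes English three-letter month names as they appear in "Date:" lines.

```shell
# Hypothetical helper.
# usage: weekday_name DAY MONTH_NAME YEAR     e.g. weekday_name 1 Mar 2003
weekday_name() {
    awk -v d="$1" -v mon="$2" -v y="$3" 'BEGIN {
        # Map "Jan".."Dec" to 1..12 via the position in a lookup string.
        pos = index("JanFebMarAprMayJunJulAugSepOctNovDec", mon)
        if (length(mon) != 3 || pos == 0 || (pos - 1) % 3 != 0) exit 1
        m = (pos + 2) / 3

        # Sakamoto month offsets (1-based here) and weekday names.
        split("0 3 2 5 0 3 5 1 4 6 2 4", t, " ")
        split("Sun Mon Tue Wed Thu Fri Sat", name, " ")

        d += 0; y += 0          # force numeric, so "03" means 3
        if (m < 3) y -= 1       # January and February count with the previous year
        w = (y + int(y / 4) - int(y / 100) + int(y / 400) + t[m] + d) % 7
        print name[w + 1]
    }'
}

# weekday_name 1 Mar 2003    prints: Sat
# weekday_name 3 Jan 2010    prints: Sun
```

The formula is valid for Gregorian calendar dates, which covers the whole range of the archives.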

So, the very crude solution:

egrep -v '(^------------------------------$|^GURPSnet-Digest)' <INPUT> | awk '/^Date: /{DATUM=$0; if (DATUM ~ /,/) {DATE=$0;gsub(",","",DATE);split(DATE,DATES);DATE=DATES[2]" "DATES[4]" "DATES[3]" "DATES[6]" "DATES[5]" "DATES[7]" "DATES[8]" "DATES[9]} else {DATE=$0;split(DATE,DATES);DATE=DATES[3]" "DATES[2]" "DATES[5]" "DATES[4]" "DATES[6]" "DATES[7]" "DATES[8]" "DATES[9]}}/^From: /{FROM=$0; NEW_FROM=$0; gsub("From:","From", NEW_FROM); print NEW_FROM, DATE; print DATUM}/^In this issue:/,/^or GURPSnet-Digest mailing lists./{next}/^End of GURPSnet-Digest/,/in the commands above with/{next}/^$/{D=1}/^----------------------------------------------------------------------$/{D=2}/^$/{D=0; next}{if (D == 0 && $0 !~ /^Date:/) print}' | sed "s/^From \(.*\) \(....@.*\)/From \2 /g" > <OUTPUT>

Together:

#! /bin/bash

for y in * ; do
(
        cd $y && for d in * ; do
                cp -p ${d} tmp
egrep -v '(^------------------------------$|^GURPSnet-Digest)' tmp | awk '/^Date: /{DATUM=$0; if (DATUM ~ /,/) {DATE=$0;gsub(",","",DATE);split(DATE,DATES);DATE=DATES[2]" "DATES[4]" "DATES[3]" "DATES[6]" "DATES[5]" "DATES[7]" "DATES[8]" "DATES[9]} else {DATE=$0;split(DATE,DATES);DATE=DATES[3]" "DATES[2]" "DATES[5]" "DATES[4]" "DATES[6]" "DATES[7]" "DATES[8]" "DATES[9]}}/^From: /{FROM=$0; NEW_FROM=$0; gsub("From:","From", NEW_FROM); print NEW_FROM, DATE; print DATUM}/^In this issue:/,/^or GURPSnet-Digest mailing lists./{next}/^End of GURPSnet-Digest/,/in the commands above with/{next}/^$/{D=1}/^----------------------------------------------------------------------$/{D=2}/^$/{D=0; next}{if (D == 0 && $0 !~ /^Date:/) print}' | sed "s/^From \(.*\) \(....@.*\)/From \2 /g" > ${d}
                rm tmp
        done

)
done


Ugly, but it seemed to work!

120697 mails in the old archives (till 2003, 219MB), and
20574 mails in the newer archives (from 2003, 36MB).
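A count like this can be cross-checked by counting the "From " separator lines in the resulting mbox file. A minimal sketch follows; the function name count_messages is made up for the example. It assumes that body lines starting with "From " have been escaped (usually as ">From "), otherwise the number comes out too high.

```shell
# Hypothetical helper.
# usage: count_messages MBOX_FILE
count_messages() {
    # Every message in an mbox file starts with a line beginning "From ".
    grep -c '^From ' "$1"
}

# Usage (file name is a placeholder):
#   count_messages archive.mbox
```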

If anybody is interested in the complete archives in mailbox format:

http://www.tja-server.de/GURPSnet/OLD_till-2003-03-01.bz2 (8MB bzipped)
http://www.tja-server.de/GURPSnet/NEW_till_2010-01-03.bz2 (51MB bzipped)

I will leave these online for some time, so please download :)


I'd also be interested in any information about hacking the MailMan
list archives so as to "back edit" older messages into it.  Of course,
to do that I'll probably have to move the list onto gurpsnet.org so
that I _can_ hack it.

What exactly do you mean by "back edit"?
Is the archive incomplete? Or what is the reason for this?


I'm also interested in help with setting up a private nntp server; I'd
like to set one up at gurpsnet.org with the intent of synchronizing it
to the mailing list.

I have never done that so far ...
Maybe I can find some information.

Greetings :)
_______________________________________________
GurpsNet-L mailing list <[email protected]>
http://mail.sjgames.com/mailman/listinfo/gurpsnet-l
