I was having trouble performing searches on the list's web archives using
Google, so I decided to import it all into my Gmail mailbox. It turned out
not to be entirely trivial, so I'm sharing my steps in case others find
them useful. This should work for importing any Pipermail archive into any
IMAP account.

I would have used J, but I'm still learning it ;-)

1. Make a new temporary directory and change into it.

mkdir j_mail
cd j_mail

2. Download all the archive files, with a delay between requests to avoid
being impolite and getting your IP banned. Ignore the errors for months
that don't exist: those before October 2005 and those after the current
date. If you're reading this far in the future, extend the list of years as
needed. If you only want to import the messages up to the date you joined
the list, stop at that date.

(this should be on one line, with no blanks inside the URL)

wget -nv -w5 http://jsoftware.com/pipermail/programming/20{05,06,07,08,09,10,11,12,13,14,15}-{January,February,March,April,May,June,July,August,September,October,November,December}.txt.gz
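
If your shell doesn't do brace expansion, here is a rough Python sketch of
the same download. The function names (archive_urls, download_all) are my
own invention, and I'm assuming the archive URLs keep the pattern shown
above. It is only meant to show what the wget line does: build one URL per
month, fetch each one, and wait between requests.

```python
import time
import urllib.error
import urllib.request

# Assumed URL pattern, copied from the wget command above.
BASE = "http://jsoftware.com/pipermail/programming"
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def archive_urls(first_year, last_year):
    """Return one archive URL per month, in chronological order."""
    return [f"{BASE}/{year}-{month}.txt.gz"
            for year in range(first_year, last_year + 1)
            for month in MONTHS]


def download_all(urls, delay=5.0):
    """Fetch each URL into the current directory, pausing between requests."""
    for url in urls:
        name = url.rsplit("/", 1)[1]
        try:
            urllib.request.urlretrieve(url, name)
        except urllib.error.URLError as exc:
            # Months that don't exist come back as HTTP errors; skip them.
            print(f"skipped {name}: {exc}")
        time.sleep(delay)
```

Calling download_all(archive_urls(2005, 2015)) should be roughly equivalent
to the wget command, including the 5 second pause.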

3. Decompress all the files. If you want to import the messages up to the
time you joined, open the last file with a text editor and delete the
messages you already have from the bottom of the file.

gunzip *.gz

4. Concatenate the files you're importing, with additional blank lines in
between, and fix some problematic headers.

(this should also be on one line)

perl -lpe 'if (/^From(:| \S+ at \S+ {2}\w{3} \w{3} [ \d]{2} \d\d:\d\d:\d\d \d{4}$)?/) {if ($1) {if ($1 ne ":") {s/^/\n/} s/ at /@/g} else {s/^/ /}}' *.txt > j.mbox

(It turns " at " back into @ on the "From " and "From:" lines, inserts a
blank line before each "From " separator line, and prefixes a space to
bogus lines starting with "From" in the mail body.)
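
For anyone who finds the one-liner hard to read, here is my attempt at the
same logic in Python. Treat it as a sketch: fix_line and build_mbox are
names I made up, and I'm assuming the files can be read as Latin-1 so that
every byte survives the round trip.

```python
import glob
import re

# Same pattern as the Perl one-liner: "From:" headers, or Pipermail
# separator lines like "From user at host  Mon Oct  3 12:34:56 2005".
FROM_RE = re.compile(
    r"^From(:| \S+ at \S+ {2}\w{3} \w{3} [ \d]{2} \d\d:\d\d:\d\d \d{4}$)?")


def fix_line(line):
    """Apply the header fixes to one line (without its trailing newline)."""
    match = FROM_RE.match(line)
    if match is None:
        return line                      # does not start with "From"
    tag = match.group(1)
    if tag is None:
        return " " + line                # bogus "From..." line in a body
    fixed = line.replace(" at ", "@")    # restore the obfuscated address
    if tag != ":":
        fixed = "\n" + fixed             # blank line before a separator
    return fixed


def build_mbox(pattern="*.txt", out_path="j.mbox"):
    """Concatenate the matching files into one mbox, fixing each line."""
    with open(out_path, "w", encoding="latin-1", newline="\n") as out:
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="latin-1", newline="\n") as src:
                for line in src:
                    out.write(fix_line(line.rstrip("\n")) + "\n")
```

Running build_mbox() in the download directory should give a j.mbox
comparable to the one produced by the Perl command.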

5. Get imap_upload.py and place it in the same directory (or in your path):

http://imap-upload.sourceforge.net/

6. If you are using Gmail, you need to temporarily enable "insecure sign-in
technology", so that imap_upload.py can actually connect to your account
using standard and perfectly secure IMAP over SSL. Don't ask me.

https://www.google.com/settings/security/lesssecureapps

7. Launch the upload, which may take a long time. The following options
will work for a Gmail account, placing the messages in the J folder / label
(which must already exist) and saving any messages that couldn't get
through into err.mbox:

python imap_upload.py --gmail --box J --error err.mbox j.mbox
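
Since the upload is slow, it may be worth checking first that j.mbox parses
into a sensible number of messages (and, afterwards, how many ended up in
err.mbox). A small sketch using Python's standard mailbox module;
count_messages is just a helper name I picked.

```python
import mailbox


def count_messages(path):
    """Return how many messages the mbox file at `path` contains."""
    # create=False makes a missing file an error instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()
```

For example, print(count_messages("j.mbox")) before uploading, and
print(count_messages("err.mbox")) once it's done.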
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
