Louis Proyect <[EMAIL PROTECTED]> writes:

> Has anybody written a perl script to convert mhonarc msg html to
> standard Internet RFC mailbox format?  I want to add old archives to
> the mail-archive website, but neglected to save the mailbox data that
> created them originally.

I wrote the following script for one particular archive, but since MHonArc
is incredibly configurable, there is little chance that the script will
work in general.  But it might help you get started, who knows...

To use it, I ran a recursive `wget' on the archives, and from within
the download directory, did `unmhonarc * > ../FOLDER' to produce a single
big FOLDER containing all the archives.  Then, I digested that folder from
within Gnus, and had fun for a good while, sorting out all the information!
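
Before digesting such a FOLDER in a mail reader, it may be worth checking
that it splits into the expected number of messages.  The following is only
a sketch, assuming Python 3 and its standard `mailbox' module; the helper
name `summarize_folder' is made up for illustration and is not part of
`unmhonarc'.

```python
import mailbox

def summarize_folder(path):
    """Return (message count, list of Subject headers) for an mbox file."""
    # create=False: fail instead of silently creating an empty folder.
    box = mailbox.mbox(path, create=False)
    try:
        # Messages are split on lines starting with 'From '.
        subjects = [message['Subject'] for message in box]
        return len(subjects), subjects
    finally:
        box.close()
```

Calling `summarize_folder('../FOLDER')' should then report one message per
HTML file given to `unmhonarc', if the conversion went well.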

The following script is put in an executable file named `unmhonarc',
as you guessed already :-).


#!/usr/bin/env python
# Rebuild simple messages from their HTML expression.

import string, sys

def main(*arguments):
    for file in arguments:
        sys.stderr.write("Processing %s ...\n" % file)
        lines = open(file).readlines()
        sys.stdout.write('From nobody@nowhere  Sun Feb 13 06:46:37 2000\n')
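        # Find the first '<li>' line; in this archive, it and the two
        # lines after it hold the header fields.
        for counter in range(len(lines)):
            if lines[counter][0:4] == '<li>':
                break
        # Copy the three header lines, dropping the 4-character '<li>' prefix.
        write_clean(lines[counter][4:])
        counter = counter + 1
        write_clean(lines[counter][4:])
        counter = counter + 1
        write_clean(lines[counter][4:])
        counter = counter + 1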
        sys.stdout.write('Message-Id: <[EMAIL PROTECTED]>\n' % file)
        sys.stdout.write('\n')
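        # Skip everything up to and including the opening '<PRE>' line.
        while counter < len(lines):
            if lines[counter] == '<PRE>\n':
                break
            counter = counter + 1
        counter = counter + 1
        # Copy the message body until the closing '</PRE>' line.
        while counter < len(lines):
            if lines[counter] == '</PRE>\n':
                break
            write_clean(lines[counter])
            counter = counter + 1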
        sys.stdout.write('\n')
        sys.stdout.write('\n')

def write_clean(line):
    # Undo the HTML escaping; '&amp;' goes last so '&amp;lt;' yields '&lt;'.
    line = line.replace('&lt;', '<')
    line = line.replace('&gt;', '>')
    line = line.replace('&amp;', '&')
    sys.stdout.write(line)

if __name__ == '__main__':
    main(*sys.argv[1:])
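
Two details the script leaves aside may matter for other archives: HTML
entities other than the three handled by `write_clean', and body lines
starting with `From ', which many mbox readers take for the start of a new
message.  Here is a rough sketch of a more defensive cleaner, assuming
Python 3 and the common convention of prefixing such lines with `>'.  The
name `clean_body_line' is hypothetical, and it is meant for body lines only,
not for the header lines.

```python
import html

def clean_body_line(line):
    """Decode HTML entities and quote a leading 'From ' in a body line."""
    # html.unescape handles named and numeric entities in a single pass.
    line = html.unescape(line)
    # Quote lines that would otherwise look like an mbox message separator.
    if line.startswith('From '):
        line = '>' + line
    return line
```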

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard

