I'm trying to use MHonArc to convert a bunch of old
Listserv archives to a Web-based format. Working backwards
through the years, I've had really good luck, but I've just
hit a stumbling block with early 1996 and earlier. From
1996-1999, the Listserv we used was set to archive weekly,
but prior to that it was monthly. I have no way of knowing
what else was changed at that time, but the weekly archives
are parsable, while the older ones take forever and lump
ALL the subjects into the SUBJECTNA variable. (If I'm only
processing a couple of thousand lines, I get one massive SUBJECTNA
line, but if I process more, MHonArc just takes over the machine
until I kill it.)
I'm pre-processing the archives with a sed script to insert
message separators (and using the exact same script on the
weekly and monthly archives), but it's choking somewhere.
Here's the beginning of one of the problematic archives:
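One check I can think of is to count the separators in each pre-processed file and see how long the longest "message" between them is. Below is a minimal Python sketch of that idea, not something I've run against the real archives. It assumes the separators are mbox-style lines beginning with "From " at the start of the line; the summarize_archive name and the pattern are my own placeholders and would need adjusting if the script emits something different.

```python
import re
import sys

# Assumption: separators are mbox-style lines that begin with "From ".
SEPARATOR = re.compile(r"^From \S+ ")


def summarize_archive(lines):
    """Return (separator_count, longest_run_of_lines_between_separators)."""
    count = 0
    longest = 0
    current = 0
    for line in lines:
        if SEPARATOR.match(line):
            # A new message starts here; close out the previous run.
            count += 1
            longest = max(longest, current)
            current = 0
        else:
            current += 1
    # Account for the final message (or a file with no separators at all).
    longest = max(longest, current)
    return count, longest


if __name__ == "__main__":
    # Usage: python check_separators.py TECHWR-L.LOG9512 [more files...]
    for path in sys.argv[1:]:
        # latin-1 never fails to decode, which is convenient for old archives.
        with open(path, encoding="latin-1") as handle:
            count, longest = summarize_archive(handle)
        print(f"{path}: {count} separators, longest message {longest} lines")
```

If a monthly file showed far fewer separators than a weekly one of similar size, or one enormous "message", that would suggest the problem is in the pre-processing rather than in MHonArc itself.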
<START PROBLEM>
[ejray@www TEMP01124535]$ more TECHWR-L.LOG9512
>From ???@??? Sun Jan 00 00:00:00 0000
Date: Fri, 1 Dec 1995 16:07:00 +1100
Reply-To: "Colleen Dancer (02) 333-1862" <[EMAIL PROTECTED]>
Sender: "Technical Writers List; for all Technical Communication issues"
<[EMAIL PROTECTED]>
From: "Colleen Dancer (02) 333-1862" <[EMAIL PROTECTED]>
Subject: Re: "Proper use of commas in England?"
I agree that in Australia like England we don't use the serial comma
unless necessary to remove ambiguity. However what will annoy your
audience far far more is the use of American spelling. I know I detest
it in manuals that I buy in Australia. I feel that if the product is
going to be sold in Australia / England they can use the Queen's
English. I would suggest that you can probably use your discretion for
the comma, but DEFINITELY use the correct spelling for the audience.
</START PROBLEM>
<START FUNCTIONAL>
[ejray@www TEMP30053907]$ more *B
>From ???@??? Sun Jan 00 00:00:00 0000
Date: Sat, 13 Jan 1996 18:59:51 -0500
Reply-To: [EMAIL PROTECTED]
Sender: "Technical Writers List; for all Technical Communication issues"
<[EMAIL PROTECTED]>
From: [EMAIL PROTECTED]
Subject: Tip of the Day
Tiffany Haley asked whether the Microsoft Tip of the Day concept is
copyrighted. I believe it's part of the new Microsoft Office "look and feel"
that Microsoft is trying to promote throughout Windows products. Delrina's
new WinFax Pro for Windows 95 utilizes the Tip of the Day just like
Microsoft's suite.
--George Hayhoe ([EMAIL PROTECTED])
>From ???@??? Sun Jan 00 00:00:00 0000
Date: Sat, 13 Jan 1996 18:59:55 -0500
Reply-To: [EMAIL PROTECTED]
Sender: "Technical Writers List; for all Technical Communication issues"
<[EMAIL PROTECTED]>
From: [EMAIL PROTECTED]
Subject: Alan Cooper's _About_Face_
</START FUNCTIONAL>
As far as I can tell, they're identical in all the significant ways. The
Sender is a .bitnet address in the non-functional ones, but ...
Any suggestions for troubleshooting this?
Version info:
MHonArc v2.4.5 (Perl 5.00503)
Linux www.raycomm.com 2.2.13-7mdk #1 Wed Sep 15 18:02:18 CEST 1999 i586 unknown
Any help would be appreciated!
Eric
[EMAIL PROTECTED]