On Sunday 02 June 2002 10:00 am, LuKreme wrote:
> On Saturday, June 1, 2002, at 09:21 PM, Scott Courtney wrote:
> > On Saturday 01 June 2002 10:59 pm, LuKreme wrote:
> >> Out of curiosity, how did you split the mbox? I have about 1200 emails
> >> I want to add to the archive.
> >
> > I wrote a little "awk" program to split them into 80-message chunks. Here
> > is the source code:
>
> Ah.. awk. I hate awk. Good thing you wrote it. :)

I love awk! You probably would use <gasp!> <ugh!> .... Perl.

(I mean that humorously, by the way. Not trying to start a flamewar or
anything. Linux has lots of good tools, and it's great that each of us can
choose the ones we like best. <GRIN>)

[...]

> All the emails got loaded (thanks!) but I'm still getting errors when it's
> trying to finish.
>
> ******
> Updating index files for archive [2002-June]
>   Date
>   Subject
>   Author
>   Thread
> Computing threaded index
> Updating HTML for article 52
> article file /Users/mailman/archives/private/list/2002-June/000052.html is
> missing!

My suggestion now is to do the following:

1. Fix up the "From " --> "rom " errors, since that is a known, obvious, and
   severe problem. You've probably already done that.

2. Read my later emails. I found a better way to deal with the archives at
   my end, namely by fixing the data so that "arch" doesn't fall out due to
   excessive errors. It appears that was the root cause of my problem -- bad
   input, and "arch" not having enough error diagnostics inside. Once I
   added some new error reports to "arch", I started getting answers that
   led me to the problem.

3. Consider using my *other* awk program, goodheaders.awk, to filter a copy
   of your data, then try the import as one single file. Steps for this:

      cd /users/mailman
      cp archives/private/mylist.mbox/mylist.mbox mylist.mbox.original
      ./goodheaders.awk < mylist.mbox.original > mylist.mbox.filtered
      cp mylist.mbox.filtered archives/private/mylist.mbox/mylist.mbox
      rm -r archives/private/mylist/*
      bin/arch mylist
      cron/nightly_gzip mylist

   Do these things with your qrunner and cron tasks temporarily halted. The
   "rm" command will zap all the old HTML files so you can rebuild from
   scratch. It also zaps the stateful information from previous runs.

This worked quite well for me. I'm now mostly done transferring my lists,
and the remainder is just mechanics, not troubleshooting. It was 0500 here
and I was ready to get some sleep. ;-)

Good luck!
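
For step 1, here is a minimal sketch of one way the "rom " lines might be repaired. It is not goodheaders.awk or the splitter mentioned above. The function names fix_from_lines and count_messages are made up for illustration. It assumes a damaged separator is a line starting with "rom " that sits at the top of the file or right after a blank line, which may not match every kind of damage in your mbox.

```sh
# Hypothetical helper (not part of Mailman): put the leading "F" back on
# mbox separator lines that were truncated to "rom ...".
fix_from_lines() {
    awk '
        # Only repair where a separator is expected: at the first line
        # of the file or directly after a blank line.
        (NR == 1 || prev == "") && /^rom / { $0 = "F" $0 }
        { print; prev = $0 }
    '
}

# Hypothetical helper: count messages by counting "From " separator lines.
count_messages() {
    grep -c '^From '
}
```

Something like "fix_from_lines < mylist.mbox.original > mylist.mbox.fixed" followed by "count_messages < mylist.mbox.fixed" should give a count close to the number of messages you expect. A body line that starts with "rom " right after a blank line would also be changed, so work on a copy and compare the two files before running bin/arch.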

Scott

--
-----------------------+------------------------------------------------------
Scott Courtney         | "I don't mind Microsoft making money. I mind them
[EMAIL PROTECTED]      | having a bad operating system." -- Linus Torvalds
http://www.4th.com/    | ("The Rebel Code," NY Times, 21 February 1999)

------------------------------------------------------
Mailman-Users mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py