Hello Mark,

First, sorry about the 2 null replies. Somehow, the letter "y" got
mapped to "send message", which makes writing a message really tough.

On Sun, 17 Apr 2005 09:46:06 -0700
Mark Sapiro wrote:

> David Relson wrote:
> >
> >I think I've encountered a gotcha with mm.arch.
>
> I'm confused. do you mean bin/arch or are you talking about some older
> Mailman that I don't know about?

My mistake. For convenience, I've got a symlink from bin/arch to
mm.arch.

> >In May 2004 I brought up mailman with list archives (from another
> >program). All went well, AFAICT. A few days ago the listserver's
> >hard drive crashed and I rebuilt the list archives from the monthly
> >mbox files. I was very surprised to see that the newly created
> >archives had zillions of messages in the 2004-May directories and
> >nothing for prior months.
> >
> >Looking at the YYYY-Month.txt mbox files, I saw that all messages
> >earlier than May 2004 had "Date: Mon May 3 hh:mm:dd 2004" lines,
> >i.e. the Date: line shows when mm.arch is run.
>
>
> Yes. It appears that when an archive is built with bin/arch, the Date:
> header in the YYYY-Month.txt files is the date that bin/arch is run.
> That surprised me too.

It makes sense for mailman to record when it first sees a message, and
Date: is the logical place for that.

> >This seems wrong. From mm.arch's help message and the mbox file
> >format, it seems that rebuilding with:
> >
> >   cat mylist/*.txt > mylist.mbox/mylist.mbox
> >   mm.arch --wipe --quiet mylist mylist.mbox/mylist.mbox
> >
> >should produce the same archive as when you start.
>
> My question is why are you using the above process to create a global
> mylist.mbox file instead of just using the existing one.

The help message for arch suggests that arch --wipe
mylist.mbox/mylist.mbox can be used to rebuild the mailing list
archive. Since I was rebuilding the server and wasn't sure of the
state of things, this seemed like the right thing to do.

> If you built your archive initially by creating a
> mylist.mbox/mylist.mbox file with your imported archives and then
> running bin/arch.
> Then the archiver would continue to append new
> messages to mylist.mbox/mylist.mbox and this will always be a complete
> archive that can be used as input to bin/arch --wipe

Complete, except that messages before day 1 (of the mailman era) will
be put into directory "month 1" (of the mailman era).

> >It seems that my ideas for rebuilding don't quite fit with reality.
> >What am I overlooking?
>
>
> Actually, I am surprised that what you did had this result. I thought
> the bin/arch process used the date from the "From " line and not the
> Date: header. I thought I remembered from my archive import struggles
> that the date had to be correct in the "From " line.

Using the date in the "From " line would have done what I want. I'd be
happy with an option to enable that behavior. As that line is known as
the "envelope", perhaps option "--envelope-date" or "--envelope" would
be appropriate.

Perhaps at one time, the envelope date _was_ used by arch and a "bug
fix" changed the behavior.

> >P.S. My solution to this problem was a perl script that extracts the
> >date from the "^From " line and puts it in the Date: line. It's not a
> >wonderful solution, but it works. (If I knew python better, I'd write
> >fix.archive.dates.py to handle this situation).
>
>
> That would probably be useful, and one to put the address and date from
> the From: and Date: into the "From " would help with some archive
> import situations.

Indeed, those options would make for a useful utility.

Regards,

David

------------------------------------------------------
Mailman-Users mailing list
http://mail.python.org/mailman/listinfo/mailman-users
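
For illustration, here is a minimal Python sketch of the kind of fix discussed above: copying the date from each message's "From " line into its Date: header before running bin/arch. It is not the perl script mentioned in the P.S., and the names envelope_date and fix_archive_dates are hypothetical. It assumes a well-formed mbox in which body lines beginning with "From " are already escaped, and an asctime-style date at the end of each "From " line. Since that line carries no time zone, the generated header uses "-0000".

```python
import re
from datetime import datetime
from email.utils import format_datetime

# Assumed envelope shape: "From sender@example.com Mon May  3 10:15:00 2004"
FROM_LINE = re.compile(
    r"^From \S+\s+(\w{3}\s+\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d\s+\d{4})\s*$"
)


def envelope_date(from_line):
    """Return the "From " line's date as an RFC 2822 string, or None."""
    match = FROM_LINE.match(from_line)
    if match is None:
        return None
    # split/join collapses the double space asctime uses for one-digit days
    stamp = " ".join(match.group(1).split())
    return format_datetime(datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"))


def fix_archive_dates(mbox_text):
    """Replace (or add) each message's Date: header using its envelope date."""
    out = []
    new_date = None      # envelope date of the current message, if parseable
    in_headers = False   # True between a "From " line and the first blank line
    seen_date = False    # True once a Date: header has been rewritten
    skipping = False     # True while dropping a folded old Date: header
    for line in mbox_text.splitlines(keepends=True):
        if line.startswith("From "):
            new_date = envelope_date(line.rstrip("\r\n"))
            in_headers, seen_date, skipping = True, False, False
            out.append(line)
            continue
        if in_headers:
            if skipping and line[:1] in (" ", "\t"):
                continue  # continuation line of the old Date: header
            skipping = False
            if new_date and line.lower().startswith("date:"):
                out.append("Date: " + new_date + "\n")
                seen_date = skipping = True
                continue
            if line.strip() == "":
                if new_date and not seen_date:
                    out.append("Date: " + new_date + "\n")
                in_headers = False
        out.append(line)
    return "".join(out)
```

Messages whose "From " line does not match the assumed shape are passed through unchanged. One way to use the sketch would be to read the concatenated mbox, pass its text through fix_archive_dates, write the result back, and only then run bin/arch --wipe on it.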
