I have two discussion lists on Argentine Tango that will probably be suspended owing to lack of activity, given the many competing technologies of recent years. They hold a treasure of information dating back to 1994. I would like to get these onto mail-archive, but the existing archives have some peculiarities that I have questions about.

Here are the questions:

1. First the easy one. From April 2006 to the present, the lists were hosted using Mailman, so I have the complete raw Mailman archives, which I've downloaded. They are in one big mbox-format file (about 50 MB). (a) I suppose this is the most straightforward case, since I just send a pointer to these files and the mail-archive staff will do the rest, correct? (b) Am I correct that the single file is best (rather than the monthly gzipped files)? And (c) will the mail-archive software recreate threads as necessary?

2. Now the harder one. From Sep 1994 (inception) to Apr 2006, the lists were hosted using L-Soft's LISTSERV software, which did not keep archives. However, I have a complete set of all traffic from that period, all in Daily Digest format, i.e., with a "Table of Contents" at the front followed by several emails. I have MOST (but not all) of these available as MIME digests, with each message in a separate MIME multipart segment. I also have ALL of them available as non-MIME digests, with a fixed text separator (like a row of ----) between messages. I propose to send these as an mbox file of digest messages, but each email in each digest would still need to be separated out. (a) Can mail-archive do this digest parsing, or do I need to find or write a script to do it myself? (b) If mail-archive can do it, do you have a preference for MIME vs. non-MIME digests? (c) And if MIME, can you handle the few for which I only have non-MIME digests?

3. Must these old archives be processed by mail-archive in chronological order for threading to work properly? Or, if I provide older ones later, are they automatically inserted and rethreaded appropriately?

4. The FAQ says that only the latest 3000 messages are kept live and the rest are in "cold storage" and can be retrieved only via matching searches. Some questions on this: (a) Is "latest" based on when the messages were processed by the archive software (e.g., old archives processed recently would count as new)? Or (b) is "latest" based on the Date: field of the post in question? (c) Is there any way to get ALL messages live on mail-archive rather than only 3000, so they can be browsed by month and year, for example (e.g., by requesting an exception considering the list will be mothballed and won't be expanding, or by paying a donation/fee)? There is about 100 MB total of data per list, I'd guess. (d) If not, is there a way I can get a full mirror download that includes the "cold storage" older archives (after processing by mail-archive's scripts) for me to install live on my own server (which may or may not disappear), while mail-archive still keeps it more permanently in its live-plus-cold-storage arrangement?
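
Regarding question 2, in case the digest splitting has to be done on my side, here is a minimal Python sketch of what such a script might look like, using only the standard library. It is an illustration under assumptions, not a tested converter for real LISTSERV digests: the function names are hypothetical, the 30-hyphen separator is a guess at the "row of ----" and would need to be checked against the actual files, and the plain-text splitter assumes that everything before the first separator is the Table of Contents.

```python
import email
import mailbox


def split_mime_digest(digest_bytes):
    """Return the individual messages embedded in one MIME digest."""
    digest = email.message_from_bytes(digest_bytes)

    def extract(part):
        # Each embedded email is a message/rfc822 part whose payload is a
        # one-element list holding the inner message. Do not descend into
        # it, so forwarded attachments are not split off by mistake.
        if part.get_content_type() == "message/rfc822":
            return [part.get_payload(0)]
        if part.is_multipart():
            found = []
            for sub in part.get_payload():
                found.extend(extract(sub))
            return found
        return []  # e.g. the text/plain table of contents

    return extract(digest)


def split_plain_digest(text, separator="-" * 30):
    """Split a non-MIME digest body on a fixed separator line.

    The separator value is an assumption. A message body that happens to
    contain the same row of hyphens would be split incorrectly.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if line.strip() == separator:
            chunks.append("\n".join(current))
            current = []
        else:
            current.append(line)
    chunks.append("\n".join(current))
    # chunks[0] is assumed to be the table of contents and is dropped.
    return [
        email.message_from_string(chunk.strip("\n") + "\n")
        for chunk in chunks[1:]
        if chunk.strip()
    ]


def write_mbox(messages, path):
    """Append the given messages to an mbox file, creating it if needed."""
    box = mailbox.mbox(path)
    try:
        for message in messages:
            box.add(message)
        box.flush()
    finally:
        box.close()
```

Each digest would be passed to one of the two splitters, depending on whether a MIME version exists for it, and the results appended to a single mbox per list with `write_mbox`.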

OK, I think that's it for now.

Regards,

Shahrukh

_______________________________________________
Gossip mailing list
https://www.mail-archive.com/gossip@mail-archive.com
https://www.mail-archive.com/cgi-bin/mailman/options/gossip
