I have two discussion lists on Argentine Tango that will probably be
suspended owing to lack of activity in the face of many competing
technologies in recent years, but that hold a treasure of information
dating back to 1994. I would like to get these onto mail-archive, but
the existing archives have some peculiarities that I have questions
about.
Here are the questions:
1. First the easy one. From April 2006 to the present, the lists were
hosted using mailman, so I have the complete raw mailman archives that
I've downloaded. They are in one big mbox-format file (about 50 MB). (a)
This is I suppose the most straightforward since I just send a pointer
to these files and the mail-archive staff will do the rest, correct? (b)
And am I correct that the single file is best (rather than the
monthly gzipped files)? And (c) that the mail-archive software will
recreate threads as necessary?
2. Now the harder one. From Sep 1994 (inception) to Apr 2006, the lists
were hosted using L-Soft's LISTSERV software, which did not keep
archives. However, I have a complete set of all traffic from that time
period, but it is all in Daily Digest format, i.e., with a "Table of
Contents" in the front and several emails afterwards. I have MOST (but
not all) of these available as MIME digests with each message in a
different MIME multipart segment. I also have ALL of them available as a
non-MIME digest, with a fixed text separator (like a row of ----)
between messages. I would propose to send these as an mbox file of
digest messages, but each email in each digest would still need to
be separated out. (a) Can mail-archive do this digest parsing, or do I
need to find or write a script to do this myself? (b) If mail-archive
can do it, do you have a preference for MIME vs. non-MIME digest? (c)
And if MIME, can you handle the few for which I only have non-MIME digests?
3. Must these old archives be processed by mail-archive in chronological
order for threading to work properly? Or if I provide older ones later,
are they automatically inserted and rethreaded appropriately?
4. The FAQ says that only the latest 3000 messages are kept live and the
rest are in "cold storage" and can be retrieved only via matching
searches. Some questions on this: (a) Are the "latest" based on when
they were processed by the archive software (e.g., old archives
processed recently would count as new)? Or (b) Are the "latest" based on
the Date: field of the post in question? (c) Is there any way to get ALL
messages live on mail-archive rather than only 3000, so they can be
browsed by month and year, for example (e.g., by requesting an
exception considering the list will be mothballed and won't be
expanding, or by paying a donation/fee)? There is about 100 MB total of
data per list, I'd guess. (d) If not, is there a way I can get a full
mirror download that includes the "cold storage" older archives (after
processing by mail-archive's scripts) for me to install live on my own
server (which may or may not disappear) while mail-archive still keeps
it more permanently in their live+cold way?
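On question 2(a), in case I end up having to split the digests myself,
here is a rough Python sketch of what I have in mind, using only the
standard library. It is a sketch under assumptions, not a tested
converter: I am assuming the MIME digests carry each post as a
message/rfc822 part, that the non-MIME separator is a line of 20 or
more hyphens, and that every post begins with a Date:, From:, Subject:
or To: header. The function names (split_mime_digest,
split_plain_digest, append_to_mbox) are my own, and none of this
reflects how mail-archive itself processes mail.

```python
import email
import mailbox
import re
from email import policy

# Assumption: a message separator is a line made only of 20+ hyphens.
SEPARATOR = re.compile(rb"^-{20,}\s*$")
# Assumption: every enclosed message begins with one of these headers.
HEADER = re.compile(rb"^(Date|From|Subject|To):", re.IGNORECASE)


def split_mime_digest(raw_bytes):
    """Return the messages enclosed in one MIME digest, in order."""
    digest = email.message_from_bytes(raw_bytes, policy=policy.default)
    # Only top-level parts are inspected, so a forwarded message nested
    # inside a post is not mistaken for a separate post.
    return [
        part.get_payload(0)
        for part in digest.iter_parts()
        if part.get_content_type() == "message/rfc822"
    ]


def split_plain_digest(raw_bytes):
    """Return the messages enclosed in one non-MIME (plain text) digest."""
    lines = raw_bytes.splitlines()
    starts = []  # (separator line index, first header line index)
    for i, line in enumerate(lines):
        if not SEPARATOR.match(line):
            continue
        j = i + 1
        while j < len(lines) and not lines[j].strip():
            j += 1  # skip blank lines after the separator
        # A hyphen row counts as a boundary only if headers follow it,
        # which keeps hyphen rows inside signatures from splitting a post.
        if j < len(lines) and HEADER.match(lines[j]):
            starts.append((i, j))
    messages = []
    for k, (_, first) in enumerate(starts):
        end = starts[k + 1][0] if k + 1 < len(starts) else len(lines)
        raw = b"\n".join(lines[first:end]).rstrip() + b"\n"
        messages.append(email.message_from_bytes(raw, policy=policy.default))
    return messages


def append_to_mbox(messages, mbox_path):
    """Append the given messages to an mbox file, creating it if needed."""
    box = mailbox.mbox(mbox_path)
    try:
        for msg in messages:
            box.add(msg)
    finally:
        box.close()
```

I would run this once per digest, oldest first, appending to one mbox
per list. Two limitations I am aware of: the last post of a plain
digest keeps whatever trailer follows it (the header check only
decides where a post starts), and everything before the first
separator, i.e. the Table of Contents, is dropped.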
OK, I think that's it for now.
Regards,
Shahrukh
_______________________________________________
Gossip mailing list
https://www.mail-archive.com/gossip@mail-archive.com
https://www.mail-archive.com/cgi-bin/mailman/options/gossip