Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-18 Thread Shahrukh Merchant
Jeff, Thanks again--more things are clear now. But your response raises more questions in my mind as well. Please bear with me, we're almost there I think. Keeping in mind that I am splitting Digests into individual messages, I have to fake whatever headers are not already there within the

Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-18 Thread Jeff Breidenbach
Yes, you can safely leave out To, Message-id, and Received. Consequences are what you'd expect, like the inability to do a message-id search and find that particular message. You are correct. Posting address is manually assigned during the bulk import process, and automatically determined from
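
As a rough illustration of the point above, here is a minimal sketch of writing a message that carries only From, Subject, and Date into an mbox file with Python's standard library. The helper names (`make_minimal_message`, `append_to_mbox`) and the sample values are hypothetical, and nothing here reflects how mail-archive.com itself performs imports.

```python
import mailbox
from email.message import EmailMessage


def make_minimal_message(sender, subject, date, body):
    """Build a message with only From, Subject and Date headers.

    To, Message-ID and Received are deliberately left out, which the
    thread says is safe (at the cost of message-id search).
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["Subject"] = subject
    msg["Date"] = date  # an RFC 5322 date string
    msg.set_content(body)
    return msg


def append_to_mbox(path, messages):
    """Append messages to an mbox file, creating it if needed."""
    box = mailbox.mbox(path)
    box.lock()
    try:
        for message in messages:
            box.add(message)
        box.flush()
    finally:
        box.unlock()
        box.close()
```

A caller would build one message per digest entry and pass the list to `append_to_mbox("archive.mbox", messages)`; the file name is only an example.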

Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-17 Thread Jeff Breidenbach
The only things indexed for search are: message-id, subject, date (usually extracted from the Received: header), sender name (extracted from From: header), posting address (for example, gossip@mail-archive.com), archival message number, and message body. Every message is sorted and organized
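
To make the list of indexed fields more concrete, the sketch below pulls the message-derived ones out of a raw message using Python's `email` package. It is an illustration only: the function name `extract_index_fields` is hypothetical, the choice of the topmost Received: header and the fallback to Date: are assumptions, and the posting address and archival message number are omitted because they are assigned by the archive rather than read from the message.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def _parse_date(text):
    """Return a datetime for an RFC 5322 date string, or None."""
    try:
        return parsedate_to_datetime(text.strip())
    except (TypeError, ValueError):
        return None


def extract_index_fields(raw):
    """Collect the searchable fields named in the thread.

    Assumption: the date is the timestamp after the last ';' in the
    topmost Received: header, falling back to the Date: header.
    """
    msg = message_from_string(raw)

    date = None
    received = msg.get_all("Received") or []
    if received:
        _, _, stamp = received[0].rpartition(";")
        date = _parse_date(stamp)
    if date is None and msg["Date"]:
        date = _parse_date(msg["Date"])

    sender_name, _address = parseaddr(msg.get("From", ""))

    return {
        "message_id": msg.get("Message-ID"),
        "subject": msg.get("Subject"),
        "date": date,
        "sender_name": sender_name,
        # Only meaningful for non-multipart messages in this sketch.
        "body": msg.get_payload() if not msg.is_multipart() else None,
    }
```

A message without Message-ID or Received simply yields `None` for those entries.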

Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-14 Thread Shahrukh Merchant
Thanks Matt and Jeff for your answers--they were very helpful. So, as I understand it: - Sending link to mailman raw archives will take care of all posts from 2006 to present (for which the mailman archives exist). - Listserv digest format to individual email mbox format conversion (for the

Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-14 Thread Earl Hood
On Mon, Apr 13, 2015 at 11:19 AM, Matt Morgan wrote: 2. Now the harder one. From Sep 1994 (inception) to Apr 2006, the lists were hosted using L-Soft's LISTSERV software, which did not keep archives. However, I have a complete set of all traffic from that time period, but they are all in

Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-14 Thread Jeff Breidenbach
Statute of limitations is typically 3 kilomessages on a normal non-import list, but should (I think) be unlimited on bulk import. Conversion to unix newlines is required and is manual; doesn't matter who does it. Still prefer to do whole import at once especially if tricky; less labor, also less
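
The newline conversion mentioned above can be done with many tools; one minimal sketch in Python follows. The function name `to_unix_newlines` is hypothetical, and treating lone CR characters as line endings is an assumption that may not suit every archive.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert CRLF (and, by assumption, lone CR) line endings to LF."""
    # Replace CRLF first so each pair becomes a single LF.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Working on bytes avoids guessing the character encoding of old messages: read the archive file in binary mode, pass the contents through the function, and write the result back in binary mode.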

Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-14 Thread Shahrukh Merchant
On 4/14/2015 9:25 PM, Jeff Breidenbach wrote: * I recommend doing the import all at once, rather than in stages. Not for technical reasons, it just saves manual labor. OK, I may do it in 2 stages, since 1/2 the archives are in mbox format that can be imported instantly. The other half are in

Re: [Gossip] Porting digested new list archives to mail-archive

2015-04-13 Thread Matt Morgan
On 04/13/2015 12:57 AM, Shahrukh Merchant wrote: I have two discussion lists on the Argentine Tango that are probably going to be suspended going forward owing to lack of activity in the face of many competing technologies in recent years, but that have a treasure of information dating back