Re: [Gossip] Porting digested new list archives to mail-archive

Shahrukh Merchant Sat, 18 Apr 2015 10:16:49 -0700

Jeff,

Thanks again--more things are clear now. But your response raises more questions in my mind as well. Please bear with me, we're almost there I think.

Keeping in mind that I am splitting Digests into individual messages, I have to fake whatever headers are not already there within the individual messages. In the case of my digests, the existing headers are only:


Date:
From:
Subject:

To this I add a ^From_ line just before the Date: line to make it mbox format (this is taken from the Digest header and copied in front of each email in the digest, so it's identical for several emails).


An example of this ^From_ line is this:
        From - Mon Nov 28 11:00:05 2005
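To make the conversion concrete, here is a minimal sketch of that step in Python. It is an illustration under assumptions, not the actual script: `build_mbox` is a hypothetical name, each message is assumed to be already split out of the digest as "headers, blank line, body", and body lines beginning with "From " are escaped as ">From " (the usual mbox convention) so they are not mistaken for a new message.

```python
from typing import Iterable


def build_mbox(from_line: str, messages: Iterable[str]) -> str:
    """Join already-split digest messages into one mbox-format string.

    from_line is the separator copied from the digest header; the same
    line is reused for every message taken from that digest.
    """
    if not from_line.startswith("From "):
        raise ValueError("an mbox separator line must start with 'From '")

    parts = []
    for msg in messages:
        # Headers end at the first blank line; the rest is the body.
        headers, sep, body = msg.strip("\n").partition("\n\n")
        # Escape body lines that would otherwise look like a separator.
        safe_body = "\n".join(
            ">" + line if line.startswith("From ") else line
            for line in body.split("\n")
        )
        parts.append(from_line + "\n" + headers + sep + safe_body + "\n")

    # Each part ends in a newline, so this leaves one blank line between messages.
    return "\n".join(parts)
```

Calling `build_mbox("From - Mon Nov 28 11:00:05 2005", messages)` on the messages of one digest gives text that can be written straight to an mbox file.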

Now, with this background, here are my further questions:

1. I thought I would have to add a "To:" line to my "faked" headers, but you are saying that it is never used. Is it OK if the To: line is completely absent?

> The only things indexed for search are: message-id,

2. Will it choke if there is no Message-ID? I know this is used in threading but in an earlier email you said that in the absence of Message-ID it would thread on Subject, but just want to make sure it won't reject the email without a Message-ID field.

> ... subject ... sender name (extracted from From: header)


No issue on these.

> ... date (usually extracted from the Received: header)

3. Hmm, again, there is no Received header. Will it properly take the date from the Date: field in that case and not choke on the absence of any Received headers? The date field format (from an actual example) is:

        Date:    Mon, 28 Nov 2005 17:55:14 +1100
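As a sanity check on my side, a Date: line in this form parses cleanly with Python's standard library. This is only a sketch showing that the header is a well-formed RFC 2822-style date; it says nothing about how mail-archive itself reads dates, and `parse_date_header` is a hypothetical helper name.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_date_header(line: str) -> datetime:
    """Parse a raw 'Date: ...' header line into a timezone-aware datetime."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "date":
        raise ValueError("not a Date: header line")
    # Extra whitespace after the colon is harmless once stripped.
    return parsedate_to_datetime(value.strip())


when = parse_date_header("Date:    Mon, 28 Nov 2005 17:55:14 +1100")
print(when.isoformat())
```

The result keeps the +11:00 offset from the header, so it can be converted to UTC or compared against other messages without guessing the time zone.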

> ... posting address (for example, [email protected] ... Every message is

> sorted and organized according to posting address ... but the To: header
> is never indexed for search, never used during import, and there is no
> benefit for you to adjust it.

4. This is where I'm most confused now. Where *is* the posting address extracted from if not from the To: header? Is it an internal field in your archive message database that is (a) predetermined manually in an import, and (b) mapped to a fixed internally stored name for new incoming email (based on headers including To:), and nothing else? In your earlier response, where I had asked about the varying forms of To: addresses in my old archives that needed to be imported (e.g., [email protected], [email protected], [email protected]) in terms of confusing search (since I incorrectly imagined that the To: line would be looked for in the search), you had replied:

> Search will have no concept of alternative list names. There is no reasonable
> way to overcome this.

but now you say search never looks at the To: lines and they aren't used in imports either. So in light of your latest response I don't understand now why search would have an issue of "alternative list names"--they are alternative To: lines but the same list, and the variations exist only in archives--new email would have a consistent To: line reflecting the current posting address.

> A merged archive will have the same posting
> address for every message, with no memory about what life was like
> before the merge.

OK, this is consistent with the "To:" line never being used for search. So the "l=" parameter in the search would always have to be the new list name following a merge, correct? That shouldn't be an issue.


Shahrukh

_______________________________________________
Gossip mailing list
https://www.mail-archive.com/[email protected]
https://www.mail-archive.com/cgi-bin/mailman/options/gossip
