On 03/08/2012 12:04 PM, James Vasile wrote:
> On Thu, 08 Mar 2012 11:37:09 -0500, Daniel Kahn Gillmor <d...@fifthhorseman.net> wrote:
>> Any ideas on how to approach this?
>
> Treat messages with the same ID but different hashes as different?
Given that a message hash would include all headers, including Received:
and other MTA-added stuff, I think that would remove all relevance of
the Message-ID field. In particular, it seems like we would just be
identifying messages by their digest.
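
To make the point concrete, here is a rough sketch (not notmuch code) of two copies of one message that differ only in a Received: header. The addresses, hostnames, and the helper names full_digest and message_id are made up for illustration, and SHA-256 is just an assumed choice of digest.

```python
import hashlib
from email import message_from_string

# Headers and body shared by both copies (all values are hypothetical).
COMMON = (
    "Message-ID: <example-1@example.org>\r\n"
    "From: alice@example.org\r\n"
    "Subject: hello\r\n"
    "\r\n"
    "Same body text.\r\n"
)

# Each delivery path prepends its own Received: header.
copy_a = "Received: from mx1.example.org; Thu, 08 Mar 2012 11:37:09 -0500\r\n" + COMMON
copy_b = "Received: from mx2.example.org; Thu, 08 Mar 2012 11:37:12 -0500\r\n" + COMMON


def full_digest(raw: str) -> str:
    """SHA-256 over the entire raw message, headers included."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def message_id(raw: str) -> str:
    """Value of the Message-ID header of a raw message."""
    return message_from_string(raw)["Message-ID"]


if __name__ == "__main__":
    print("same Message-ID:", message_id(copy_a) == message_id(copy_b))
    print("same full digest:", full_digest(copy_a) == full_digest(copy_b))
```

The two copies share a Message-ID but never share a full digest, so keying on (Message-ID, digest) behaves the same as keying on the digest alone.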
If you're willing to ignore the headers and just look at a digest of the
body, that still doesn't provide any help for the common (legitimate)
case of a message jointly-delivered to a mailing list and to a specific
(already-subscribed) user.
That user will get two copies of the message, and since most mailing
lists modify the body of the message (usually by adding a footer section
with mailing list info) their bodies will also have different digests.
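
The same kind of sketch for the body-only variant, again with invented message content and a hypothetical helper body_digest: one copy is delivered directly, the other has a list footer appended by the list software.

```python
import hashlib
from email import message_from_string

HEADERS = "Message-ID: <example-2@example.org>\nSubject: hello\n\n"
BODY = "Same body text.\n"
# Hypothetical footer of the kind a mailing list manager appends.
FOOTER = "_______________________________________________\nexample mailing list\n"

direct_copy = HEADERS + BODY          # delivered straight to the user
list_copy = HEADERS + BODY + FOOTER   # delivered via the list


def body_digest(raw: str) -> str:
    """SHA-256 over the body only, ignoring all headers."""
    body = message_from_string(raw).get_payload()
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    same_id = (
        message_from_string(direct_copy)["Message-ID"]
        == message_from_string(list_copy)["Message-ID"]
    )
    print("same Message-ID:", same_id)
    print("same body digest:", body_digest(direct_copy) == body_digest(list_copy))
```

Even with every header ignored, the footer alone is enough to give the two copies different digests.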
So I don't see how to make this suggestion work without giving up on
Message-IDs as the identifier entirely (and therefore accepting many
more spurious duplicates than users currently need to tolerate).
Any other suggestions or ideas?
--dkg
_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch