Re: [patch v3 06/12] lib: index message files with duplicate message-ids

2017-06-09 Thread David Bremner
David Bremner  writes:

> Daniel Kahn Gillmor  writes:

>> for example, i could follow up on the current message with another
>> message with Message-Id: 20170604123235.24466-7-da...@tethera.net and
>> give it a subject "Re: [patch v3 06/12] lib: do *not* index message
>> files with duplicate message-ids".  that's a bit odd, no?
>
> Yes, I agree that's a bit strange.  We should make some effort to
> display the subject that belongs with a given message body. I think it's
> not too hard [1] to preserve the old behaviour of keeping the first
> subject, date, and from. This leaves us with a version of the original
> hiding message attack, but only for the special case of regex searches,
> since those rely exclusively on the value slots.

I had a slightly radical idea for how to deal with that. Subject/from
from extra files could be appended to the value slot (e.g. separated by
newlines). Then regexp searches would behave similarly to term based
searches in that matching any file would match the message. We'd have to
be slightly careful about what anchors meant.  A further enhancement
would be to expose the search result as an array. This kind of approach
doesn't really make sense for dates, as we essentially search for those
as numbers, and such a hack would break the built-in xapian range
search.  
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [patch v3 06/12] lib: index message files with duplicate message-ids

2017-06-05 Thread David Bremner
Daniel Kahn Gillmor  writes:

> On Sun 2017-06-04 09:32:29 -0300, David Bremner wrote:
>> The corresponding xapian document just gets more terms added to it,
>> but this doesn't seem to break anything. Values on the other hand get
>> overwritten, which is a bit annoying, but arguably it is not worse to
>> take the values (from, subject, date) from the last file indexed
>> rather than the first.
[snip]
> for example, i could follow up on the current message with another
> message with Message-Id: 20170604123235.24466-7-da...@tethera.net and
> give it a subject "Re: [patch v3 06/12] lib: do *not* index message
> files with duplicate message-ids".  that's a bit odd, no?

Yes, I agree that's a bit strange.  We should make some effort to
display the subject that belongs with a given message body. I think it's
not too hard [1] to preserve the old behaviour of keeping the first
subject, date, and from. This leaves us with a version of the original
hiding message attack, but only for the special case of regex searches,
since those rely exclusively on the value slots.

[1]: should be just a matter of guarding the call to
_notmuch_message_set_header_values() with if (is_new || is_ghost), but
that needs testing.
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [patch v3 06/12] lib: index message files with duplicate message-ids

2017-06-05 Thread Daniel Kahn Gillmor
On Sun 2017-06-04 09:32:29 -0300, David Bremner wrote:
> The corresponding xapian document just gets more terms added to it,
> but this doesn't seem to break anything. Values on the other hand get
> overwritten, which is a bit annoying, but arguably it is not worse to
> take the values (from, subject, date) from the last file indexed
> rather than the first.

It's certainly a change in behavior, though.  This suggests that i can
send you mail and have it change how an existing message shows up in the
summary view, for example.

for example, i could follow up on the current message with another
message with Message-Id: 20170604123235.24466-7-da...@tethera.net and
give it a subject "Re: [patch v3 06/12] lib: do *not* index message
files with duplicate message-ids".  that's a bit odd, no?

I'm not saying it's wrong, i'm just hoping to acknowledge that this
might be a controversial change.

this whole series is pretty elaborate, of course, but so far it's
looking like reasonable steps toward a clearer and more hackable
codebase.

  --dkg


signature.asc
Description: PGP signature
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch