notmuch-mutt: support for duplicate message removal

2012-08-02 Thread Stefano Zacchiroli
On Thu, Aug 02, 2012 at 11:03:44AM -0700, Jameson Graef Rollins wrote:
> > I didn't find the reference above but,
> http://mid.gmane.org/87k42vrqve.fsf@pip.fifthhorseman.net

Thanks. I confirm my previous comments after "but," then :-)

-- 
Stefano Zacchiroli zack@{upsilon.cc,pps.jussieu.fr,debian.org} . o .
Maître de conférences   ..   http://upsilon.cc/zack   ..   . . o
Debian Project Leader...   @zack on identi.ca   ...o o o
« the first rule of tautology club is the first rule of tautology club »


notmuch-mutt: support for duplicate message removal

2012-08-02 Thread Jameson Graef Rollins
On Thu, Aug 02 2012, Stefano Zacchiroli wrote:
> On Wed, Aug 01, 2012 at 01:20:08PM -0400, Daniel Kahn Gillmor wrote:
>> The proposed feature could also exacerbate the previously-discussed
>> attack vector [0] whereby a malicious Message-ID collision can be used
>> to hide messages from the victim's mailstore.
>> 
>> [0] id:87k42vrqve.fsf@pip.fifthhorseman.net
>
> I didn't find the reference above but,

http://mid.gmane.org/87k42vrqve.fsf@pip.fifthhorseman.net

jamie.



notmuch-mutt: support for duplicate message removal

2012-08-02 Thread Stefano Zacchiroli
On Wed, Aug 01, 2012 at 01:20:08PM -0400, Daniel Kahn Gillmor wrote:
> The proposed feature could also exacerbate the previously-discussed
> attack vector [0] whereby a malicious Message-ID collision can be used
> to hide messages from the victim's mailstore.
> 
> [0] id:87k42vrqve.fsf@pip.fifthhorseman.net

I didn't find the reference above but, if you're speaking about the
proposed patch only, I don't think it's the case. The proposed patch
only deduplicates file-identical (up to checksums, that is) messages in
maildirs: a Message-ID collision is not enough to hide a message.
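
To make the distinction concrete, here is a minimal Python sketch of checksum-based deduplication. It is not the actual notmuch-mutt patch (notmuch-mutt is written in Perl); the function names and the choice of SHA-256 are assumptions made purely for illustration.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large messages are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unique_by_content(paths):
    """Keep the first path for each distinct file content, preserving order."""
    seen = set()
    kept = []
    for path in paths:
        checksum = file_digest(path)
        if checksum not in seen:
            seen.add(checksum)
            kept.append(path)
    return kept
```

Under this scheme two files are collapsed only when every byte matches, so a message that merely reuses another message's Message-ID, but differs anywhere else, is still kept.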

But your comment is very interesting anyhow, as deduplicating on the
basis of Message-ID is indeed something I've discussed with Kevin as
future work. You just provided an extra argument not to enable that by
default.

Cheers.
-- 
Stefano Zacchiroli zack@{upsilon.cc,pps.jussieu.fr,debian.org} . o .
Maître de conférences   ..   http://upsilon.cc/zack   ..   . . o
Debian Project Leader...   @zack on identi.ca   ...o o o
« the first rule of tautology club is the first rule of tautology club »


notmuch-mutt: support for duplicate message removal

2012-08-01 Thread Jani Nikula
On Wed, 01 Aug 2012, "Kevin J. McCarthy" wrote:
> Jani Nikula wrote:
>> I'm guessing you get the duplicates because you have dupes in the mail
>> store, and 'notmuch search --output=files' prints all the filenames
>> associated with each matching message, rather than any other reason. The
>> presented approach will only remove identical files, and will leave
>> behind files that are basically the same message, but have differing
>> headers, e.g. due to being received through different channels. Is this
>> what you want?
>
> This method was something we felt comfortable enabling by default.
>
> Stefano and I discussed adding a (by default off) option to remove
> duplicates by message-id, but wanted to get this patch merged first and
> then think about it.

Sounds reasonable, especially considering [1]. I have no comments on the
patches; I'm not a notmuch-mutt (or perl) user.

>> Perhaps an option to 'notmuch search --output=files' to print only one
>> filename (even if there are many) per message would be interesting.
>
> This would be useful for the second approach.  If it's easy to do, that
> would be great.

Apart from [1], the hardest part will be bikeshedding about the option
name. ;)

BR,
Jani.

[1] id:"87d33av2sg.fsf@nikula.org"


notmuch-mutt: support for duplicate message removal

2012-08-01 Thread Jani Nikula
On Wed, 01 Aug 2012, Daniel Kahn Gillmor wrote:
> On 08/01/2012 12:26 PM, Andrei POPESCU wrote:
>> I'm at least one user that cares enough about the distinction to have 
>> all list mails received via a different address, just to avoid Gmail's 
>> "feature" of silently dropping my own messages received via a list. 
>> IMVHO it should at least be configurable...
>
> The proposed feature could also exacerbate the previously-discussed
> attack vector [0] whereby a malicious Message-ID collision can be used
> to hide messages from the victim's mailstore.

Just to clarify, the feature proposed in this patch series does not make
the problem worse (as it would hide only fully identical messages, which
is not useful for the malicious purpose).

What I suggested [1] could indeed make notmuch-mutt as vulnerable to the
attack vector as notmuch show, and the emacs ui, currently are (but not
worse than that).
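
As a hedged illustration of why Message-ID-based deduplication is the risky variant, the following Python sketch keeps only the first message seen for each Message-ID. It is not code from notmuch or notmuch-mutt; the function name is hypothetical and the messages are passed in as raw bytes purely to keep the example self-contained.

```python
from email import message_from_bytes


def unique_by_message_id(raw_messages):
    """Keep the first message for each Message-ID, preserving order.

    Messages without a Message-ID header are always kept.
    """
    seen = set()
    kept = []
    for raw in raw_messages:
        message_id = message_from_bytes(raw)["Message-ID"]
        if message_id is None:
            kept.append(raw)
            continue
        message_id = message_id.strip()
        if message_id not in seen:
            seen.add(message_id)
            kept.append(raw)
    return kept
```

If a forged message carrying a colliding Message-ID is seen first, the genuine message with the same ID is silently dropped from the result, which is the hiding effect described above. Checksum-based deduplication does not have this property, because the forged and genuine files differ in content.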

BR,
Jani.

[1] id:"87pq7aam8n.fsf@nikula.org"

>
>   --dkg
>
> [0] id:87k42vrqve.fsf@pip.fifthhorseman.net


notmuch-mutt: support for duplicate message removal

2012-08-01 Thread Andrei POPESCU
On Mi, 01 aug 12, 13:24:24, Jani Nikula wrote:
> On Wed, 01 Aug 2012, Stefano Zacchiroli wrote:
> >
> > Also, if you've further comments on the patch, do not hesitate!
> 
> I'm guessing you get the duplicates because you have dupes in the mail
> store, and 'notmuch search --output=files' prints all the filenames
> associated with each matching message, rather than any other reason. The
> presented approach will only remove identical files, and will leave
> behind files that are basically the same message, but have differing
> headers, e.g. due to being received through different channels. Is this
> what you want?

I'm at least one user that cares enough about the distinction to have 
all list mails received via a different address, just to avoid Gmail's 
"feature" of silently dropping my own messages received via a list. 
IMVHO it should at least be configurable...

Kind regards,
Andrei
-- 
If you can't explain it simply, you don't understand it well enough.
(Albert Einstein)



notmuch-mutt: support for duplicate message removal

2012-08-01 Thread Daniel Kahn Gillmor
On 08/01/2012 12:26 PM, Andrei POPESCU wrote:
> I'm at least one user that cares enough about the distinction to have 
> all list mails received via a different address, just to avoid Gmail's 
> "feature" of silently dropping my own messages received via a list. 
> IMVHO it should at least be configurable...

The proposed feature could also exacerbate the previously-discussed
attack vector [0] whereby a malicious Message-ID collision can be used
to hide messages from the victim's mailstore.

--dkg

[0] id:87k42vrqve.fsf@pip.fifthhorseman.net




notmuch-mutt: support for duplicate message removal

2012-08-01 Thread Kevin J. McCarthy
Jani Nikula wrote:
> I'm guessing you get the duplicates because you have dupes in the mail
> store, and 'notmuch search --output=files' prints all the filenames
> associated with each matching message, rather than any other reason. The
> presented approach will only remove identical files, and will leave
> behind files that are basically the same message, but have differing
> headers, e.g. due to being received through different channels. Is this
> what you want?

This method was something we felt comfortable enabling by default.

Stefano and I discussed adding a (by default off) option to remove
duplicates by message-id, but wanted to get this patch merged first and
then think about it.

> Perhaps an option to 'notmuch search --output=files' to print only one
> filename (even if there are many) per message would be interesting.

This would be useful for the second approach.  If it's easy to do, that
would be great.

-Kevin



notmuch-mutt: support for duplicate message removal

2012-08-01 Thread Stefano Zacchiroli
Heya,
  here is a patchset originating from a feature contributed by Kevin
J. McCarthy: duplicate message removal for notmuch-mutt searches.

I've reviewed the main patch and gone through various iterations of it
with Kevin. I consider it suitable for application in its present
form, and I've added a subsequent patch to fix the Debian packaging
accordingly.

Can someone with commit access be so kind as to apply this patchset to
the master branch?

Also, if you've further comments on the patch, do not hesitate!



Re: notmuch-mutt: support for duplicate message removal

2012-08-01 Thread Jani Nikula
On Wed, 01 Aug 2012, Stefano Zacchiroli z...@upsilon.cc wrote:
> Heya,
>   here is a patchset originating from a feature contributed by Kevin
> J. McCarthy: duplicate message removal for notmuch-mutt searches.
>
> I've reviewed the main patch and gone through various iterations of it
> with Kevin. I consider it suitable for application in its present
> form, and I've added a subsequent patch to fix the Debian packaging
> accordingly.
>
> Can someone with commit access be so kind as to apply this patchset to
> the master branch?
>
> Also, if you've further comments on the patch, do not hesitate!

I'm guessing you get the duplicates because you have dupes in the mail
store, and 'notmuch search --output=files' prints all the filenames
associated with each matching message, rather than any other reason. The
presented approach will only remove identical files, and will leave
behind files that are basically the same message, but have differing
headers, e.g. due to being received through different channels. Is this
what you want?

Perhaps an option to 'notmuch search --output=files' to print only one
filename (even if there are many) per message would be interesting. IIRC
the first filename is used by 'notmuch show' to display the message
anyway. At a glance, this should be trivial to implement, but would it
cover your needs?


BR,
Jani.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch

