[PATCH 1/5] test: two new messages for the 'broken' corpus

2018-04-13 Thread David Bremner
These have an 'In-Reply-To' loop, which currently confuses "notmuch
new".

Courtesy of anarcat.
---
 ...521463752.R13151765805797588408.curie:2,FS | 173 +
 ...1521463753.R9368947314807690338.curie:2,FS | 180 ++
 2 files changed, 353 insertions(+)
 create mode 100644 
test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS
 create mode 100644 
test/corpora/broken/gitlab/cur/1521463753.R9368947314807690338.curie:2,FS

diff --git 
a/test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS 
b/test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS
new file mode 100644
index ..0dc4ecf8
--- /dev/null
+++ b/test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS
@@ -0,0 +1,173 @@
+Return-Path: 
+X-Original-To: anarcat+git...@anarc.at
+Delivered-To: anarcat+git...@anarc.at
+Received: from marcos.anarc.at (localhost [127.0.0.1])
+   by delivery.anarc.at (Postfix) with ESMTP id 99EA510E050
+   for ; Tue, 13 Mar 2018 20:30:59 -0400 (EDT)
+X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on marcos.anarc.at
+X-Spam-Level: 
+X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,
+   DKIM_VALID,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,RCVD_IN_DNSWL_MED
+   autolearn=ham autolearn_force=no version=3.4.1
+Received: from deferred1.pod6.iad1.zdsys.com (deferred1.pod6.iad1.zdsys.com 
[192.161.153.116])
+   by mx.anarc.at (Postfix) with ESMTPS id 83B8F10E04F
+   for ; Tue, 13 Mar 2018 20:30:59 -0400 (EDT)
+Received: from out4.pod6.iad1.zdsys.com (out4.pod6.iad1.zdsys.com 
[192.161.153.103])
+   (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
+   (No client certificate requested)
+   by deferred1.pod6.iad1.zdsys.com (Postfix) with ESMTPS id D31684091244
+   for ; Wed, 14 Mar 2018 00:21:40 + (UTC)
+Received: from out4.pod6.iad1.zdsys.com (localhost.localdomain [127.0.0.1])
+   by out4.pod6.iad1.zdsys.com (Postfix) with ESMTP id CEBCE21CBE26
+   for ; Wed, 14 Mar 2018 00:21:39 + (UTC)
+Received: from zendesk.com (work11.pod6.iad1.zdsys.com [10.112.38.50])
+   by out4.pod6.iad1.zdsys.com (Postfix) with ESMTP id AFEB621CBE2F
+   for ; Wed, 14 Mar 2018 00:21:39 + (UTC)
+Date: Wed, 14 Mar 2018 00:21:39 +
+From: GitLab Support 
+Reply-To: GitLab Support 
+To: Anarcat+gitlab 
+Message-ID: <9379qm5z39_5aa86b1350504_174eb3fc97a2cb98c71674_sp...@zendesk.com>
+In-Reply-To: <9379qm5...@zendesk.com>
+ <9379qm5z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sp...@zendesk.com>
+Subject: comments not showing up?
+Mime-Version: 1.0
+Content-Type: multipart/alternative;
+ boundary="--==_mimepart_5aa86b13a739d_174eb3fc97a2cb98c71811";
+ charset=utf-8
+Content-Transfer-Encoding: 7bit
+X-Auto-Response-Suppress: All
+Auto-Submitted: auto-generated
+X-Mailer: Zendesk Mailer
+X-Delivery-Context: event-id-485367902188
+X-Zendesk-From-Account-Id: 2cbf2e0
+DKIM-Signature:  v=1; a=rsa-sha256; c=relaxed/relaxed; d=zendesk.com;
+ q=dns/txt; s=zendesk1; t=1520986899;
+ bh=ef6zoCM91gZVUja0Us+jy2UVcPK+sNhZocvn333kfzE=;
+ 
h=date:from:reply-to:to:message-id:in-reply-to:subject:mime-version:content-type:content-transfer-encoding;
+ 
b=tvPyIz3Cw61n5u2siOiyRiEMAKeDmEu2DMg1Ss534+0PPvbTgruWrWbZklJzy56RDIPi4hoK+Ui6gz0/ih6TyQXG6tpFMeZ4xI49Gqypu1Q2Xo1Uvu6WPYDe8n7D2BJ/8wP6+uqZ+DpAa7ldNi2opHVvmd6GKCuL0fN8lWvdDm4=
+DKIM-Signature:  v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=gmailmarkup.zendesk.com; q=dns/txt; s=zendesk1; t=1520986899;
+ bh=ef6zoCM91gZVUja0Us+jy2UVcPK+sNhZocvn333kfzE=;
+ 
h=date:from:reply-to:to:message-id:in-reply-to:subject:mime-version:content-type:content-transfer-encoding;
+ 
b=sqLp5VpKfrylgT2N7zbweDs3dccEXM44wokM/rxnZ49p9/wYDJNMbffB8yXXZa1BJ0KRfl/UFqoP8YZPYr72a+E291Ug+zq12UJi5MW2VnwMPJxAp+X9hQe90AzNecBDjOUn95qiCKnvVjhtT/LVePm9BbNh8UwC5W3qh/qFjVk=
+X-CMAE-Score: 0
+X-CMAE-Analysis: v=2.3 cv=RMOd4bq+ c=1 sm=1 tr=0
+   a=PnllObM1nxnwd3s59rm/oQ==:117 a=KiCxJD0x+Pe5VASQKmYoJrcyuOo=:19
+   a=IkcTkHD0fZMA:10 a=ZZnuYtJkoWoA:10 a=p0WdMEaf:8
+   a=mvNKaxnlhnHt-7JDRIsA:9 a=QEXdDO2ut3YA:10 a=iCiO2nKxGhgA:10
+   a=AxTCswu-:8 a=A-Ay9Xv3:8 a=TeuQqM9s:8 a=SSmOFEAC:8
+   a=9TiCOqs2P4ysbtUUcEZcEYoys8A=:19 a=DNiwoKe6oyfyDjaX:21 
a=_W_S_7VecoQA:10
+   a=frz4AuCg-hUA:10 a=grImUnVaQDeKGm2TToIA:22
+X-AV-Checked: Zendesk using ClamAV - CLEAN
+X-CMAE-Score: 0
+X-CMAE-Analysis: v=2.3 cv=RMOd4bq+ c=1 sm=1 tr=0
+   a=WkljmVdYkabdwxfqvArNOQ==:117 a=KiCxJD0x+Pe5VASQKmYoJrcyuOo=:19
+   a=IkcTkHD0fZMA:10 a=v2DPQv5-lfwA:10 a=ZZnuYtJkoWoA:10 a=p0WdMEaf:8
+   a=mvNKaxnlhnHt-7JDRIsA:9 a=QEXdDO2ut3YA:10 a=iCiO2nKxGhgA:10
+   a=AxTCswu-:8 a=A-Ay9Xv3:8 a=TeuQqM9s:8 a=SSmOFEAC:8
+   

reference loop fix and tests, v2

2018-04-13 Thread David Bremner
Here's a tidied version of the reference loop fix. There's actually
only one non-comment line of code changed. As you can guess from the
NEWS patch, I'm planning a point release to include this bugfix. I'd
also like to include the mset fix [1], so it would be great if someone
could test that.

[1]: id:20180406114307.7067-1-da...@tethera.net

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 4/5] test: add second reference loop test

2018-04-13 Thread David Bremner
Guard against regressions where there is no crash, but output is
wrong.
---
 test/T050-new.sh | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/test/T050-new.sh b/test/T050-new.sh
index 222c341e..91722e24 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -355,7 +355,16 @@ EOF
 test_expect_equal_file EXPECTED OUTPUT
 
 add_email_corpus broken
-test_begin_subtest "reference loop"
+test_begin_subtest "reference loop does not crash"
 test_expect_code 0 "notmuch show --format=json 
id:9379qm5z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sp...@zendesk.com > OUTPUT"
 
+test_begin_subtest "reference loop outputs both messages"
+threadid=$(notmuch search --output=threads  
id:9379qm5z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sp...@zendesk.com)
+notmuch show --format=mbox $threadid | grep '^Message-ID' | sort > OUTPUT
+cat <
+Message-ID: <9379qm5z39_5aa86b1350504_174eb3fc97a2cb98c71674_sp...@zendesk.com>
+EOF
+test_expect_equal_file /dev/null OUTPUT
+
 test_done
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 2/5] test: add known broken test for indexing an In-Reply-To loop.

2018-04-13 Thread David Bremner
This documents the bug discussed in

 id:87d10042pu@curie.anarc.at
---
 test/T050-new.sh | 5 +
 1 file changed, 5 insertions(+)

diff --git a/test/T050-new.sh b/test/T050-new.sh
index cd522364..c55a2d97 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -354,4 +354,9 @@ exit status: 75
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+add_email_corpus broken
+test_begin_subtest "reference loop"
+test_subtest_known_broken
+test_expect_code 0 "notmuch show --format=json 
id:9379qm5z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sp...@zendesk.com > OUTPUT"
+
 test_done
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 3/5] lib: break reference loop by choosing arbitrary top level msg

2018-04-13 Thread David Bremner
Other parts of notmuch (e.g. notmuch show) expect each thread to
contain at least one top level message, and crash if this expectation
is not met.
---
 lib/thread.cc| 8 +++-
 test/T050-new.sh | 1 -
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/thread.cc b/lib/thread.cc
index 3561b27f..dbac002f 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -397,7 +397,13 @@ _resolve_thread_relationships (notmuch_thread_t *thread)
 for (node = thread->message_list->head; node; node = node->next) {
message = node->message;
in_reply_to = _notmuch_message_get_in_reply_to (message);
-   if (in_reply_to && strlen (in_reply_to) &&
+   /*
+* if we reach the end of the list without finding a top-level
+* message, that means the thread is a cycle (or set of
+* cycles) and any message can be considered top-level
+*/
+   if ((thread->toplevel_list->head || node->next) &&
+in_reply_to && strlen (in_reply_to) &&
g_hash_table_lookup_extended (thread->message_hash,
  in_reply_to, NULL,
  (void **) ))
diff --git a/test/T050-new.sh b/test/T050-new.sh
index c55a2d97..222c341e 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -356,7 +356,6 @@ test_expect_equal_file EXPECTED OUTPUT
 
 add_email_corpus broken
 test_begin_subtest "reference loop"
-test_subtest_known_broken
 test_expect_code 0 "notmuch show --format=json 
id:9379qm5z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sp...@zendesk.com > OUTPUT"
 
 test_done
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 5/5] NEWS: add item for reference loop fix.

2018-04-13 Thread David Bremner
---
 NEWS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/NEWS b/NEWS
index 39ce7707..e9b1dcd9 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,12 @@
+Notmuch 0.26.2 (UNRELEASED)
+===
+
+Library Changes
+---
+
+Make thread indexing more robust against reference loops. Fix a
+related abort in "notmuch show".
+
 Notmuch 0.26.1 (2018-04-02)
 ===
 
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 2/2] test: pytest runner for the test suite

2018-04-13 Thread Floris Bruynooghe
On Sun 08 Apr 2018 at 19:14 -0300, David Bremner wrote:

> Floris Bruynooghe  writes:
>
>> This series looks good to me, would be great to have!  Do you want to
>> commit them this or should I just incorporate it and submit together
>> with tests once actual tests exist.  You could always commit with a ``def
>> test_dummy(): assert True`` or something if you like.
>>
>
> For now, why don't you just incorporate them. Maybe you'll discover some
> issues with them as you work up some real tests.

Sure, that works.

> BTW, it seems like a
> reasonable plan to get a set of unit tests for the existing bindings
> first to help with migration. Is that what you had in mind?

Yes that's what I had in mind.  To safely swap out ctypes for cffi
adding tests to ensure the existing API behaviour remains is a must.

Which reminds me that it's probably worth calling this out explicitly
again: this works implies that cffi will become an external dependency
[0] for the existing bindings.  Just want to make sure this doesn't
become an issue further down the line.

Cheers,
Floris

[0] Technically only on cpython, it's bundled with pypy.
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] WIP: test patch for reference loop problem

2018-04-13 Thread David Bremner
Antoine Beaupré  writes:

> Hi!
>
> So I've tried the patch and it seems to fix the bug. I'll run with a
> patch version for a while to see if anything's off, but so far so good
> I'd say.
>
> Furthermore, it's not possible for me to reproduce the bug in my regular
> mailbox anymore. I suspect this is because new mail came in and the file
> order in the directories changed, so the bug isn't triggered anymore.
>
> I was able to trigger the bug with the reproducer with an older build of
> the code though, so don't worry about that part. :)

Thanks for testing!

> Let me know if you need anything else from me before this gets merged.
>

There was also a test patch, basically adding your reproducer to the
test suite. It would be good to know if that test still reproduces the
problem, before the fix is applied.  Tomi mentioned a more reduced test
set. That could reduce the privacy loss a bit, but as far as file size
these messages are pretty small, so I'm not sure if it's worth the
trouble/risk of breaking the reproducer.

I want to refactor the code a bit and hopefully cut down on the
copy-pasta, so I will probably ask you check a second version.

s


___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] WIP: add attachment checks

2018-04-13 Thread Antoine Beaupré
This implements basic attachment checks like those present in other
MUAs (e.g. Thunderbird, IIRC). A hook watches for keywords, which are
implemented using a customizable regex, that indicate the user might
have wanted to include an attachement while writing the email, but has
forgotten.

We currently check for words in english and french and some care was
taken to avoid false positive, but those are bound to happen and we
embrace them with open arms: it's better to warn by mistake than
forget an attachment ever again. New languages can be added through
customization, which help string suggests contributing it back to the
community here. I am not sure this is the best way, but I'm unfamiliar
with Notmuch's translation system (if any).

This code is working: I use it daily and never worry about missing
attachments anymore (mostly because I'm always *thinking* about
putting the attachment because I want this script to trigger, so I
never forget attachments anymore). However there are no unit tests:
bremner suggested I add some in test/T310-emacs.sh and I looked there
briefly, but couldn't figure out where to add a blurb.

I did some summary tests using `re-builder' to see the effect the
regex has on random strings. I tested the possible false positive "a
joint attaché presse" which doesn't match and "here is an attachment,
attached in the pièce jointe or piece jointe" which does. I am not
sure how to implement this in a unit test: should we try to send an
email and check if it aborts like the "Sending a message via (fake)
SMTP" test? Or should I just check that `notmuch-message-check-attach'
works against a temporary buffer? Is there a hardness that allows
matching against messages like that already?

I'd also welcome comments on the approach in general. Another user
came up on IRC recently (impatkor) with the same need and used the
following snippet instead:

https://pastebin.com/N9ku3DBD

It is a very similar implementation although it checks for
`Content-Disposition: attachment` instead of `<#part>`: not sure which
one is actually accurate. It also adds the word `bifoga` as a pattern,
but I haven't verified that in swedish and would wait for feedback on
the multilingual approach before adding new words here.

Finally, note that an earlier version of this used
`save-mark-and-excursion` but I had to revert back to `save-excursion`
because the function was missing in some other version of Emacs I was
testing. That part does not work anyways: something else is moving the
mark around when sending, but I figured I would keep this hook
well-behaving even if others screw that up.

Oh, and one last thing: this commit log is longer than the patch,
which is probably wrong. Sorry about that. :p
---
 emacs/notmuch-message.el | 37 +
 1 file changed, 37 insertions(+)

diff --git a/emacs/notmuch-message.el b/emacs/notmuch-message.el
index 55e4cfee..c4e301cc 100644
--- a/emacs/notmuch-message.el
+++ b/emacs/notmuch-message.el
@@ -47,6 +47,43 @@ the \"inbox\" and \"todo\" tags, you would set:
 
 (add-hook 'message-send-hook 'notmuch-message-mark-replied)
 
+;; attachment checks
+;;
+;; should be sent upstream, but needs unit tests in test/T310-emacs.sh
+(defcustom notmuch-message-attach-regex
+  "\\b\\(attache\?ment\\|attached\\|attach\\|pi[èe]ce\s+jointe?\\)\\b"
+  "Pattern of text announcing there should be an attachment.
+
+This is used by `notmuch-message-check-attach' to check email
+bodies for words that might indicate the email should have an
+attachement. If the pattern matches and there is no attachment (a
+`<#part ...>' magic block), notmuch will show a confirmation
+prompt before sending the email.
+
+The default regular expression is deliberately liberal: we prefer
+false positive than forgotten attachments. This should be
+customized for non-english languages and notmuch welcomes
+additions to the pattern for your native language, unless it
+conflicts with common words in other languages."
+  :type '(regexp)
+  :group 'notmuch-send)
+
+(defun notmuch-message-check-attach ()
+  """Check for missing attachments.
+
+This is normally added to `message-send-hook' and is configured
+through `notmuch-message-attach-regex'."""
+  (save-excursion ;; XXX: this fails somehow: point is at the end of the 
buffer on error
+(goto-char (point-min))
+(if (re-search-forward notmuch-attach-regex nil t)
+(progn
+  (goto-char (point-min))
+  (unless (re-search-forward "<#part [^>]*filename=[^>]*>" nil t)
+(or (y-or-n-p "Email seem to refer to attachment, but nothing 
attached, send anyways?")
+(error "No attachment found, aborting")))
+
+(add-hook 'message-send-hook 'notmuch-message-check-attach)
+
 (provide 'notmuch-message)
 
 ;;; notmuch-message.el ends here
-- 
2.11.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch