[PATCH 1/6] test: two new messages for the 'broken' corpus

2018-04-20 Thread David Bremner
These have an 'In-Reply-To' loop, which currently confuses "notmuch
new".
---
 test/corpora/broken/loop/loop-12 | 8 
 test/corpora/broken/loop/loop-21 | 8 
 2 files changed, 16 insertions(+)
 create mode 100644 test/corpora/broken/loop/loop-12
 create mode 100644 test/corpora/broken/loop/loop-21

diff --git a/test/corpora/broken/loop/loop-12 b/test/corpora/broken/loop/loop-12
new file mode 100644
index ..b5c3af7e
--- /dev/null
+++ b/test/corpora/broken/loop/loop-12
@@ -0,0 +1,8 @@
+From: Alice 
+To: Daniel 
+Subject: referencing in-reply-to-loop-21
+Message-ID: 
+In-Reply-To: 
+Date: Thu, 16 Jun 2016 22:14:41 -0400
+
+Note Message-ID and In-Reply-To: in file in-reply-to-loop-21
diff --git a/test/corpora/broken/loop/loop-21 b/test/corpora/broken/loop/loop-21
new file mode 100644
index ..234f0323
--- /dev/null
+++ b/test/corpora/broken/loop/loop-21
@@ -0,0 +1,8 @@
+From: Alice 
+To: Daniel 
+Subject: referencing in-reply-to-loop-12
+Message-ID: 
+In-Reply-To: 
+Date: Fri, 17 Jun 2016 22:14:41 -0400
+
+Note Message-ID and In-Reply-To: in file in-reply-to-loop-12
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 6/6] lib: choose oldest message when breaking reference loops

2018-04-20 Thread David Bremner
This preserves a sensible thread order
---
 lib/thread.cc| 33 -
 test/T050-new.sh |  1 -
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/lib/thread.cc b/lib/thread.cc
index dbac002f..a6dc4e5a 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -390,20 +390,18 @@ _thread_add_matched_message (notmuch_thread_t *thread,
 static void
 _resolve_thread_relationships (notmuch_thread_t *thread)
 {
-notmuch_message_node_t *node;
+notmuch_message_node_t *node, *first_node;
 notmuch_message_t *message, *parent;
 const char *in_reply_to;
 
-for (node = thread->message_list->head; node; node = node->next) {
+first_node = thread->message_list->head;
+if (!first_node)
+   return;
+
+for (node = first_node->next; node; node = node->next) {
message = node->message;
in_reply_to = _notmuch_message_get_in_reply_to (message);
-   /*
-* if we reach the end of the list without finding a top-level
-* message, that means the thread is a cycle (or set of
-* cycles) and any message can be considered top-level
-*/
-   if ((thread->toplevel_list->head || node->next) &&
-in_reply_to && strlen (in_reply_to) &&
+   if (in_reply_to && strlen (in_reply_to) &&
g_hash_table_lookup_extended (thread->message_hash,
  in_reply_to, NULL,
  (void **) &parent))
@@ -412,6 +410,23 @@ _resolve_thread_relationships (notmuch_thread_t *thread)
_notmuch_message_list_add_message (thread->toplevel_list, message);
 }
 
+/*
+ * if we reach the end of the list without finding a top-level
+ * message, that means the thread is a cycle (or set of cycles)
+ * and any message can be considered top-level.  Choose the oldest
+ * message, which happens to be first in our list.
+ */
+message=first_node->message;
+in_reply_to = _notmuch_message_get_in_reply_to (message);
+if (thread->toplevel_list->head &&
+   in_reply_to && strlen (in_reply_to) &&
+   g_hash_table_lookup_extended (thread->message_hash,
+ in_reply_to, NULL,
+ (void **) &parent))
+   _notmuch_message_add_reply (parent, message);
+else
+   _notmuch_message_list_add_message (thread->toplevel_list, message);
+
 /* XXX: After scanning through the entire list looking for parents
  * via "In-Reply-To", we should do a second pass that looks at the
  * list of messages IDs in the "References" header instead. (And
diff --git a/test/T050-new.sh b/test/T050-new.sh
index 320a7646..9025fa7a 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -359,7 +359,6 @@ test_begin_subtest "reference loop does not crash"
 test_expect_code 0 "notmuch show --format=json id:mid-loop...@example.org 
id:mid-loop...@example.org > OUTPUT"
 
 test_begin_subtest "reference loop ordered by date"
-test_subtest_known_broken
 threadid=$(notmuch search --output=threads  id:mid-loop...@example.org)
 notmuch show --format=mbox $threadid | grep '^Date'  > OUTPUT
 cat < EXPECTED
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 4/6] NEWS: add item for reference loop fix.

2018-04-20 Thread David Bremner
---
 NEWS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/NEWS b/NEWS
index 39ce7707..e9b1dcd9 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,12 @@
+Notmuch 0.26.2 (UNRELEASED)
+===
+
+Library Changes
+---
+
+Make thread indexing more robust against reference loops. Fix a
+related abort in "notmuch show".
+
 Notmuch 0.26.1 (2018-04-02)
 ===
 
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 5/6] test: add known broken test for thread ordering from a loop

2018-04-20 Thread David Bremner
The previous loop handling code chooses the last message in the
message list, which turns out to be the last in date order.
See the comment in _notmuch_thread_create.
---
 test/T050-new.sh | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/test/T050-new.sh b/test/T050-new.sh
index f3bfe7b1..320a7646 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -358,4 +358,14 @@ add_email_corpus broken
 test_begin_subtest "reference loop does not crash"
 test_expect_code 0 "notmuch show --format=json id:mid-loop...@example.org 
id:mid-loop...@example.org > OUTPUT"
 
+test_begin_subtest "reference loop ordered by date"
+test_subtest_known_broken
+threadid=$(notmuch search --output=threads  id:mid-loop...@example.org)
+notmuch show --format=mbox $threadid | grep '^Date'  > OUTPUT
+cat < EXPECTED
+Date: Thu, 16 Jun 2016 22:14:41 -0400
+Date: Fri, 17 Jun 2016 22:14:41 -0400
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
 test_done
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 3/6] lib: break reference loop by choosing arbitrary top level msg

2018-04-20 Thread David Bremner
Other parts of notmuch (e.g. notmuch show) expect each thread to
contain at least one top level message, and crash if this expectation
is not met.
---
 lib/thread.cc| 8 +++-
 test/T050-new.sh | 1 -
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/thread.cc b/lib/thread.cc
index 3561b27f..dbac002f 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -397,7 +397,13 @@ _resolve_thread_relationships (notmuch_thread_t *thread)
 for (node = thread->message_list->head; node; node = node->next) {
message = node->message;
in_reply_to = _notmuch_message_get_in_reply_to (message);
-   if (in_reply_to && strlen (in_reply_to) &&
+   /*
+* if we reach the end of the list without finding a top-level
+* message, that means the thread is a cycle (or set of
+* cycles) and any message can be considered top-level
+*/
+   if ((thread->toplevel_list->head || node->next) &&
+in_reply_to && strlen (in_reply_to) &&
g_hash_table_lookup_extended (thread->message_hash,
  in_reply_to, NULL,
  (void **) &parent))
diff --git a/test/T050-new.sh b/test/T050-new.sh
index b9854978..f3bfe7b1 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -356,7 +356,6 @@ test_expect_equal_file EXPECTED OUTPUT
 
 add_email_corpus broken
 test_begin_subtest "reference loop does not crash"
-test_subtest_known_broken
 test_expect_code 0 "notmuch show --format=json id:mid-loop...@example.org 
id:mid-loop...@example.org > OUTPUT"
 
 test_done
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 2/6] test: add known broken test for indexing an In-Reply-To loop.

2018-04-20 Thread David Bremner
This documents the bug discussed in

 id:87d10042pu@curie.anarc.at
---
 test/T050-new.sh | 5 +
 1 file changed, 5 insertions(+)

diff --git a/test/T050-new.sh b/test/T050-new.sh
index cd522364..b9854978 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -354,4 +354,9 @@ exit status: 75
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+add_email_corpus broken
+test_begin_subtest "reference loop does not crash"
+test_subtest_known_broken
+test_expect_code 0 "notmuch show --format=json id:mid-loop...@example.org 
id:mid-loop...@example.org > OUTPUT"
+
 test_done
-- 
2.17.0

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


v3 reference loop fix

2018-04-20 Thread David Bremner
this version obsoletes

 id:20180414014610.15438-1-da...@tethera.net

changes:

- simplified test messages from Tomi
- deterministic thread order suggested by dkg



___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Bug? notmuch-emacs: "All tags" in notmuch-hello does not show all tags

2018-04-20 Thread Gregor Zattler
Dear notmuch-emacs developers,

I use a certain tag which until now is only used with regards to
one single message which is also tagged "spam".  This certain tag
is among the output of notmuch search --output=tags '*'

The part of notmuch-hello which is supposed to show "All tags"
does not show this certain tag.  But it shows the certain tag with
count = 1 if I remove the tag "spam" from this message.  The same
happens if this message is tagged "deleted" instead of "spam".


My ~/.notmuch-config contains:

[search]
exclude_tags=deleted;spam;


The relevant section of my customzation.el contains:

 '(notmuch-hello-sections
   '(notmuch-hello-insert-search notmuch-hello-insert-saved-searches 
notmuch-hello-insert-recent-searches
 (notmuch-hello-insert-tags-section "All tags" 
:initially-hidden t nil nil)))


I can show the message in question with a search for
(is:spam OR NOT is:spam OR is:deleted OR NOT is:deleted) AND is:certaintag
notmuch-emacs then shows the single message and its tags.  Among
those is the certain tag.


I expected the certain tag to be shown in the notmuch-hello
section "All tags" and consider it a bug that this is not the case.

My guess is that somewhere in the code notmuch count is called
without --exclude=false.


Ciao; Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: Database error

2018-04-20 Thread Varac
Quoting David Bremner (2018-04-20 14:44:25)
> >   » xapian-check .notmuch/xapian docdata:
> >   blocksize=8K items=564 firstunused=7 revision=7 levels=1 root=2
> >   B-tree checked okay
> >   docdata table structure checked OK
> >
> >   termlist:
> >   blocksize=8K items=305882 firstunused=64506 revision=7 levels=2 root=63491
> >   xapian-check: DatabaseError: 1 unused block(s) missing from the free 
> > list, first is 0
> 
> 
> This is almost certainly a known and recently fixed xapian bug. notmuch
> compact is the recommended workaround (or just ignore the message).
Ok, I'll ignore it for now.

> 
> >
> > After this I try to tun "afew --tag --new -v" but it fails with (I don't 
> > know if 
> > this is related though):
> >
> >   Traceback (most recent call last):
> > File "/usr/bin/afew", line 11, in 
> >   load_entry_point('afew==1.3.0', 'console_scripts', 'afew')()
> > File "/usr/lib/python3/dist-packages/afew/commands.py", line 159, in 
> > main
> >   inner_main(args, database, query_string)
> > File "/usr/lib/python3/dist-packages/afew/main.py", line 23, in main
> >   filter_.run(query_string)
> > File "/usr/lib/python3/dist-packages/afew/filters/BaseFilter.py", line 
> > 59, in run
> >   for message in self.database.get_messages(query):
> > File "/usr/lib/python3/dist-packages/afew/Database.py", line 93, in 
> > get_messages
> >   for message in self.do_query(query).search_messages():
> > File "/usr/lib/python3/dist-packages/notmuch/query.py", line 182, in 
> > search_messages
> >   raise NotmuchError(status)
> >   notmuch.errors.XapianError
> >
> 
> It's hard to say for sure, but it's possible this would be fixed by the
> patch
> 
> 20180414014610.15438-4-david@tethera.net">https://mid.mail-archive.com/20180414014610.15438-4-david@tethera.net
> 
> That should be part of a bug-fix release of notmuch fairly soon. You can
> also try to find a subset of mail reproducing the problem. I don't
> know what that afew invokation is trying to do. Can you reproduce the
> problem using the notmuch CLI?

It turned out that this was a bad syntax in afew, where I quoted a query in 
single quotes.

Greetings, Varac


signature.asc
Description: signature
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] lib: work around xapian bug with get_mset(0,0, x)

2018-04-20 Thread David Bremner
Daniel Kahn Gillmor  writes:


> But if we want a compile-time check, i don't think we'd need anything
> fancier than something like (potentially even directly in query.cc):
>
> #if XAPIAN_AT_LEAST(1,4,6)
> #define MSET_GET_MIN_COUNT 0
> #else
> #define MSET_GET_MIN_COUNT 1
> #endif
>
> what else were you thinking we'd need?
>
>  --dkg

I forgot about the XAPIAN_AT_LEAST macro. Nonetheless, I'm not keen to
ship untested/uncompiled code (no matter how "obviously correct" it is),
so I'd prefer to wait until Xapian 1.4.6 actually exists before adding
an ifdef like that. FWIW I doubt that the performance impact of passing
1 instead of 0 here is measurable. For me, getting a bug fix release out
is a matter of some urgency; fine tuning and testing against xapian
1.4.6 can wait.

d
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: Database error

2018-04-20 Thread David Bremner
Varac  writes:

> Hi,
>
> I get a DatabaseError right after initial indexing.
> I can reproduce this with my mails every time.
> First, I remove the whole .notmuch/xapian directory and
> run "notmuch new", which results in a DatabaseError when
> using "xapian-check":
>
>
>   » notmuch new
>   Welcome to a new version of notmuch! Your database will now be upgraded.
>   This process is safe to interrupt.
>   Backing up tags to 
> /home/varac/.offlineimap/Mail/.notmuch/dump-20180419T160138.gz...
>   Your notmuch database has now been upgraded.
>   Processed 148440 total files in 13m 38s (181 files/sec.).
>   Added 140351 new messages to the database.
>
>   » xapian-check .notmuch/xapian 
>   docdata:
>   blocksize=8K items=564 firstunused=7 revision=7 levels=1 root=2
>   B-tree checked okay
>   docdata table structure checked OK
>
>   termlist:
>   blocksize=8K items=305882 firstunused=64506 revision=7 levels=2 root=63491
>   xapian-check: DatabaseError: 1 unused block(s) missing from the free list, 
> first is 0


This is almost certainly a known and recently fixed xapian bug. notmuch
compact is the recommended workaround (or just ignore the message).

>
> After this I try to tun "afew --tag --new -v" but it fails with (I don't know 
> if 
> this is related though):
>
>   Traceback (most recent call last):
> File "/usr/bin/afew", line 11, in 
>   load_entry_point('afew==1.3.0', 'console_scripts', 'afew')()
> File "/usr/lib/python3/dist-packages/afew/commands.py", line 159, in main
>   inner_main(args, database, query_string)
> File "/usr/lib/python3/dist-packages/afew/main.py", line 23, in main
>   filter_.run(query_string)
> File "/usr/lib/python3/dist-packages/afew/filters/BaseFilter.py", line 
> 59, in run
>   for message in self.database.get_messages(query):
> File "/usr/lib/python3/dist-packages/afew/Database.py", line 93, in 
> get_messages
>   for message in self.do_query(query).search_messages():
> File "/usr/lib/python3/dist-packages/notmuch/query.py", line 182, in 
> search_messages
>   raise NotmuchError(status)
>   notmuch.errors.XapianError
>

It's hard to say for sure, but it's possible this would be fixed by the
patch

20180414014610.15438-4-david@tethera.net">https://mid.mail-archive.com/20180414014610.15438-4-david@tethera.net

That should be part of a bug-fix release of notmuch fairly soon. You can
also try to find a subset of mail reproducing the problem. I don't
know what that afew invokation is trying to do. Can you reproduce the
problem using the notmuch CLI?

d
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch