notmuch modifies DB while iterating?

2018-02-27 Thread Eric Wong
Hello, I'm neither a notmuch user nor proficient in C++.

However, I noticed a bug while working on public-inbox (in Perl)
which shares Xapian thread linking logic with notmuch, and I
suspect notmuch is affected by the same problem as public-inbox.

The problem is in the _merge_threads function in add-message.cc.
While the Xapian::PostingIterator for loser is iterating, the
Xapian DB is being modified by replace_document via
_notmuch_message_sync.

This was causing DatabaseCorruptError exceptions in public-inbox
with my dataset.

I fixed it in public-inbox by stashing docid scalars into a
Perl array while iterating with the PostingIterator, and then
doing lookups + replacements independently of the
PostingIterator by iterating through the Perl array:

https://public-inbox.org/meta/20180227221302.7308-...@80x24.org/raw
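
For illustration only, the stash-then-replace pattern described above
can be sketched in Python. This is an analogy, not the public-inbox or
notmuch code: a plain dict stands in for the Xapian DB, and the names
merge_threads, db, winner and loser are hypothetical.

```python
def merge_threads(db, winner, loser):
    """Move every document in thread `loser` into thread `winner`.

    `db` is a hypothetical stand-in for the database: docid -> thread id.
    """
    # Pass 1: iterate only to collect docids (the "stash" step).
    # Nothing is written while the iteration is in progress.
    loser_docids = [docid for docid, tid in db.items() if tid == loser]

    # Pass 2: the iteration is finished, so replacing documents can no
    # longer disturb it.
    for docid in loser_docids:
        db[docid] = winner
    return loser_docids


db = {1: "A", 2: "B", 3: "B", 4: "C"}
moved = merge_threads(db, "A", "B")
```

The cost is holding the docid list in memory for the duration of the
merge, which is small compared with the documents themselves.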


I initially thought this was a bug in the glass backend, but
I've also hit it with chert.

I have a standalone Perl script to reproduce the bug at
https://yhbt.net/skel.bug.perl and an 81M gzipped dataset which
reproduces the problem at https://yhbt.net/skel.bug.gz

(each line is: MID [REFERENCES-SEPARATED-BY-SPACES])

Usage:

For failure:
  curl https://yhbt.net/skel.bug.gz | zcat | perl -w /path/to/skel.bug.perl

For success:
  curl https://yhbt.net/skel.bug.gz | zcat | \
    BATCH_SIZE=1000 perl -w /path/to/skel.bug.perl
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] notmuch-mutt: use --format=text0 and xargs -0

2018-02-27 Thread Jani Nikula
On Tue, 27 Feb 2018, Jani Nikula wrote:
> notmuch-mutt fails for message files with special characters such as
> single quote in their filename. Use notmuch search --format=text0 and
> xargs -0 combo to handle them.
>
> Reported and tested by "dob1" on IRC.
> ---
>  contrib/notmuch-mutt/notmuch-mutt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/contrib/notmuch-mutt/notmuch-mutt b/contrib/notmuch-mutt/notmuch-mutt
> index 0e46a8c1b95e..57f13075aa22 100755
> --- a/contrib/notmuch-mutt/notmuch-mutt
> +++ b/contrib/notmuch-mutt/notmuch-mutt
> @@ -48,9 +48,9 @@ sub search($$$) {
>  }
>  
>  empty_maildir($maildir);
> -system("notmuch search --output=files $dup_option $query"
> +system("notmuch search --format=text0 --output=files $dup_option $query"
>    . " | sed -e 's: : :g'"

Come to think of it, does this need sed -z too?

> -  . " | xargs -r -I searchoutput ln -s searchoutput $maildir/cur/");
> +  . " | xargs -0 -r -I searchoutput ln -s searchoutput $maildir/cur/");
>  }
>  
>  sub prompt($$) {
> -- 
> 2.11.0


[PATCH] notmuch-mutt: use --format=text0 and xargs -0

2018-02-27 Thread Jani Nikula
notmuch-mutt fails for message files with special characters such as
single quote in their filename. Use notmuch search --format=text0 and
xargs -0 combo to handle them.

Reported and tested by "dob1" on IRC.
---
 contrib/notmuch-mutt/notmuch-mutt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/notmuch-mutt/notmuch-mutt b/contrib/notmuch-mutt/notmuch-mutt
index 0e46a8c1b95e..57f13075aa22 100755
--- a/contrib/notmuch-mutt/notmuch-mutt
+++ b/contrib/notmuch-mutt/notmuch-mutt
@@ -48,9 +48,9 @@ sub search($$$) {
 }
 
 empty_maildir($maildir);
-system("notmuch search --output=files $dup_option $query"
+system("notmuch search --format=text0 --output=files $dup_option $query"
   . " | sed -e 's: : :g'"
-  . " | xargs -r -I searchoutput ln -s searchoutput $maildir/cur/");
+  . " | xargs -0 -r -I searchoutput ln -s searchoutput $maildir/cur/");
 }
 
 sub prompt($$) {
-- 
2.11.0
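
As a rough illustration of why NUL-delimited output helps here, the
sketch below splits a text0-style byte stream into filenames in Python.
The helper split_text0 and the sample paths are hypothetical and are not
part of notmuch-mutt; the point is only that quotes and spaces in a
filename need no escaping when NUL is the separator.

```python
import os


def split_text0(data: bytes) -> list:
    """Split NUL-delimited output into a list of filenames."""
    # NUL cannot occur inside a path, so it is an unambiguous separator;
    # quotes, spaces and even newlines in filenames pass through intact.
    return [os.fsdecode(part) for part in data.split(b"\0") if part]


# Made-up sample imitating two NUL-terminated search results.
sample = b"/mail/cur/it's here:2,S\0/mail/cur/plain:2,S\0"
paths = split_text0(sample)
```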
