[PATCH] lib: add 'body:' field, stop indexing headers twice.

2019-03-03 Thread David Bremner
The new `body:` field (in Xapian terms) or prefix (in slightly
sloppier notmuch terms) allows matching terms that occur only in the
body.

Unprefixed query terms should continue to match anywhere (header or
body) in the message.

This follows a suggestion of Olly Betts to use the facility (since
Xapian 1.0.4) to add the same field with multiple prefixes. The double
indexing of previous versions is thus replaced with a query time
expansion of unprefixed query terms to the various prefixed
equivalents.

Reindexing will be needed for negated 'body:' searches to work
correctly.
---
 doc/man7/notmuch-search-terms.rst |  5 +++-
 lib/database.cc   |  6 +
 lib/message.cc| 10 +++
 test/T730-body.sh | 43 +++
 4 files changed, 58 insertions(+), 6 deletions(-)
 create mode 100755 test/T730-body.sh

diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index f7a39ceb..fd8bf634 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -44,6 +44,9 @@ results to those whose value matches a regular expression (see
 
notmuch search 'from:"/bob@.*[.]example[.]com/"'
 
+body:
+Match terms in the body of messages.
+
 from: or from://
 The **from:** prefix is used to match the name or address of
 the sender of an email message.
@@ -249,7 +252,7 @@ follows.
 Boolean
**tag:**, **id:**, **thread:**, **folder:**, **path:**, **property:**
 Probabilistic
-  **to:**, **attachment:**, **mimetype:**
+  **body:**, **to:**, **attachment:**, **mimetype:**
 Special
**from:**, **query:**, **subject:**
 
diff --git a/lib/database.cc b/lib/database.cc
index 9cf8062c..27c2d042 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -259,6 +259,8 @@ prefix_t prefix_table[] = {
 { "directory", "XDIRECTORY",   NOTMUCH_FIELD_NO_FLAGS },
 { "file-direntry", "XFDIRENTRY",   NOTMUCH_FIELD_NO_FLAGS },
 { "directory-direntry","XDDIRENTRY",   NOTMUCH_FIELD_NO_FLAGS },
+{ "body",  "", NOTMUCH_FIELD_EXTERNAL |
+   NOTMUCH_FIELD_PROBABILISTIC},
 { "thread","G",NOTMUCH_FIELD_EXTERNAL |
NOTMUCH_FIELD_PROCESSOR },
 { "tag",   "K",NOTMUCH_FIELD_EXTERNAL |
@@ -302,6 +304,8 @@ prefix_t prefix_table[] = {
 static void
 _setup_query_field_default (const prefix_t *prefix, notmuch_database_t *notmuch)
 {
+if (prefix->prefix)
+   notmuch->query_parser->add_prefix("",prefix->prefix);
 if (prefix->flags & NOTMUCH_FIELD_PROBABILISTIC)
notmuch->query_parser->add_prefix (prefix->name, prefix->prefix);
 else
@@ -326,6 +330,8 @@ _setup_query_field (const prefix_t *prefix, notmuch_database_t *notmuch)
*notmuch->query_parser, 
notmuch))->release ();
 
/* we treat all field-processor fields as boolean in order to get the raw input */
+   if (prefix->prefix)
+   notmuch->query_parser->add_prefix("",prefix->prefix);
notmuch->query_parser->add_boolean_prefix (prefix->name, fp);
 } else {
_setup_query_field_default (prefix, notmuch);
diff --git a/lib/message.cc b/lib/message.cc
index 6f2f6345..64349f83 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -1443,13 +1443,13 @@ _notmuch_message_gen_terms (notmuch_message_t *message,
message->termpos = term_gen->get_termpos () + 100;
 
_notmuch_message_invalidate_metadata (message, prefix_name);
+} else {
+   term_gen->set_termpos (message->termpos);
+   term_gen->index_text (text);
+   /* Create a term gap, as above. */
+   message->termpos = term_gen->get_termpos () + 100;
 }
 
-term_gen->set_termpos (message->termpos);
-term_gen->index_text (text);
-/* Create a term gap, as above. */
-message->termpos = term_gen->get_termpos () + 100;
-
 return NOTMUCH_PRIVATE_STATUS_SUCCESS;
 }
 
diff --git a/test/T730-body.sh b/test/T730-body.sh
new file mode 100755
index ..548b30a4
--- /dev/null
+++ b/test/T730-body.sh
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+test_description='search body'
+. $(dirname "$0")/test-lib.sh || exit 1
+
+add_message "[body]=thebody-1" "[subject]=subject-1"
+add_message "[body]=nothing-to-see-here-1" "[subject]=thebody-1"
+
+test_begin_subtest 'search with body: prefix'
+notmuch search body:thebody | notmuch_search_sanitize > OUTPUT
+cat <<EOF >EXPECTED
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; subject-1 (inbox unread)
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest 'search without body: prefix'
+notmuch search thebody | notmuch_search_sanitize > OUTPUT
+cat <<EOF >EXPECTED
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; subject-1 (inbox unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; thebody-1 (inbox unread)
+EOF
+test
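
The query-time expansion described in the commit message above can be modelled without Xapian at all. The following is only a rough sketch under stated assumptions: `PrefixMap`, `expand` and `make_example_map` are hypothetical names, and `XSUBJECT` is used purely as an illustrative term prefix for subject text. It ignores stemming and Xapian's actual term encoding; it only shows how registering the empty field name with several term prefixes lets one unprefixed word match both body and header terms, while `body:` matches only the unprefixed (body) terms.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a query parser's prefix table.
// The empty field name "" stands for unprefixed query words.
class PrefixMap {
public:
    // The same field may be registered with several term prefixes,
    // which is the facility the commit message relies on.
    void add_prefix(const std::string &field, const std::string &term_prefix) {
        prefixes_[field].push_back(term_prefix);
    }

    // Expand one query word into the index terms that would be OR-ed
    // together. An unknown field yields an empty list.
    std::vector<std::string> expand(const std::string &field,
                                    const std::string &word) const {
        std::vector<std::string> terms;
        auto it = prefixes_.find(field);
        if (it == prefixes_.end())
            return terms;
        for (const std::string &prefix : it->second)
            terms.push_back(prefix + word);
        return terms;
    }

private:
    std::map<std::string, std::vector<std::string>> prefixes_;
};

// Assumed layout: body text is indexed with an empty term prefix,
// subject text with the illustrative prefix "XSUBJECT".
PrefixMap make_example_map() {
    PrefixMap m;
    m.add_prefix("body", "");
    m.add_prefix("", "");          // unprefixed words also match body terms
    m.add_prefix("subject", "XSUBJECT");
    m.add_prefix("", "XSUBJECT");  // ... and subject terms
    return m;
}
```

In this model, expanding the unprefixed word `thebody` gives both `thebody` and `XSUBJECTthebody`, whereas `body:thebody` gives only `thebody`, which corresponds to the two subtests in the patch.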

Re: WIP2: index user headers

2019-03-03 Thread David Bremner
David Bremner writes:

> I had another thought about user prefixes. I wonder if they should all
> be forcibly prefixed with something that prevents collisions, to prevent
> later pain if we add an "official" prefix with the same name. A quick
> test suggests it would work to use something like _
>
> so
>
> notmuch search --output=files _list:notmuch
>
> works. It's a bit ugly, I'll have to play with other options; the main
> question is whether we think prefixing is needed / worth-it.

I played with the query parser a bit, and the only idea I found so far
is to reserve prefixes starting with lower case ASCII for notmuch, and
allow users to use anything else.

d
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch
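
The reservation rule floated in the message above (names starting with a lower-case ASCII letter belong to notmuch, anything else is available to users) could be checked along these lines. This is a minimal sketch, not code from the patch series: `is_user_prefix_allowed` is a hypothetical name, and rejecting the empty string is an added assumption.

```cpp
#include <string>

// Hypothetical helper: returns true if `name` may be used as a
// user-defined prefix under the proposed rule.
bool is_user_prefix_allowed(const std::string &name) {
    // Assumption: an empty name is never a valid prefix.
    if (name.empty())
        return false;
    const char first = name[0];
    // Compare against the ASCII range directly instead of calling
    // islower(), whose result depends on the current locale.
    const bool reserved_for_notmuch = (first >= 'a' && first <= 'z');
    return !reserved_for_notmuch;
}
```

Under this rule `list` would be refused while `_list` and `List` would be accepted, so a later official `list:` prefix could not collide with a user-defined one.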


[PATCH] doc: sequentialize calls to sphinx-build

2019-03-03 Thread David Bremner
In certain conditions the parallel calls to sphinx-build could
collide, yielding a crash like

Exception occurred:
  File "/usr/lib/python3/dist-packages/sphinx/environment.py", line 1261, in get_doctree
doctree = pickle.load(f)
EOFError: Ran out of input
---

I first observed this on Debian/ppc64le. There it occurred about 1 in
50 builds. Thanks to Olly Betts for pointing me in the right
direction.

 doc/Makefile.local | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/doc/Makefile.local b/doc/Makefile.local
index 16459e35..eed243a0 100644
--- a/doc/Makefile.local
+++ b/doc/Makefile.local
@@ -37,6 +37,13 @@ INFO_INFO_FILES := $(INFO_TEXI_FILES:.texi=.info)
 %.gz: %
rm -f $@ && gzip --stdout $^ > $@
 
+# Sequentialize the calls to sphinx-build to avoid races with
+# reading/writing state.
+
+sphinx-html: | $(DOCBUILDDIR)/.roff.stamp
+sphinx-texinfo: | sphinx-html
+sphinx-info: | sphinx-texinfo
+
 sphinx-html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(DOCBUILDDIR)/html
 
-- 
2.20.1



Re: [PATCH] Fix notmuch-describe-key

2019-03-03 Thread William Casarin
Yang Sheng writes:

> Fix notmuch-describe-key crashing for the following two cases
> 1. format-kbd-macro cannot deal with keys like [(32 . 126)], switch to
> use key-description instead.
> 2. if a function in the current keymap is not bound, it will crash
> the whole process. We check if it is bound and silently skip it to
> avoid crashing.
> ---
>  emacs/notmuch-lib.el | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
> index 8cf7261e..546ab6fd 100644
> --- a/emacs/notmuch-lib.el
> +++ b/emacs/notmuch-lib.el
> @@ -298,7 +298,7 @@ This is basically just `format-kbd-macro' but we also convert ESC to M-."
>"Prepend cons cells describing prefix-arg ACTUAL-KEY and ACTUAL-KEY to TAIL
>  
>  It does not prepend if ACTUAL-KEY is already listed in TAIL."
> -  (let ((key-string (concat prefix (format-kbd-macro actual-key
> +  (let ((key-string (concat prefix (key-description actual-key
>  ;; We don't include documentation if the key-binding is
>  ;; over-ridden. Note, over-riding a binding automatically hides the
>  ;; prefixed version too.
> @@ -313,7 +313,7 @@ It does not prepend if ACTUAL-KEY is already listed in TAIL."
>;; Documentation for command
>(push (cons key-string
> (or (and (symbolp binding) (get binding 'notmuch-doc))
> -   (notmuch-documentation-first-line binding)))
> +   (and (functionp binding) 
> (notmuch-documentation-first-line binding
>   tail)))
>  tail)
>  

Thanks!

Some context: this fixes an issue in spacemacs where the help key is
broken:

  https://github.com/syl20bnr/spacemacs/issues/10123


-- 
https://jb55.com


Re: WIP2: index user headers

2019-03-03 Thread David Bremner
David Bremner writes:

> This obsoletes [1]
> This is getting closer to mergeable, but it still needs at least to
> sanity check the names of user defined prefixes (see point (a) below).
>
> The main differences from [1] are
>
> (a) xapian prefixes are no longer defined via upper casing, as this is
> locale dependent. They do rely on a ":" separator, hence the need
> for some sanitization.
>
> (b) The caching of user header/prefix information is now done via
> string maps, and used more effectively during indexing.

I had another thought about user prefixes. I wonder if they should all
be forcibly prefixed with something that prevents collisions, to prevent
later pain if we add an "official" prefix with the same name. A quick
test suggests it would work to use something like _

so

notmuch search --output=files _list:notmuch

works. It's a bit ugly, I'll have to play with other options; the main
question is whether we think prefixing is needed / worth-it.
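
Point (a) in the quoted text says the Xapian prefixes for user headers rely on a ":" separator, which is why the names need sanitizing. A rough illustration of that constraint follows. Everything here is an assumption made for the example: the `XU` marker, the function names, and the choice to return an empty string for a rejected name are not taken from the patch series.

```cpp
#include <string>

// Hypothetical marker placed in front of every user-defined prefix.
static const std::string USER_PREFIX_MARKER = "XU";

// A name containing the separator would make the term prefix ambiguous
// (with this layout, "a:b" could not be told apart from the name "a"
// followed by a term starting with "b:"), so it is rejected.
bool is_valid_user_prefix_name(const std::string &name) {
    return !name.empty() && name.find(':') == std::string::npos;
}

// Build the term prefix for a user-defined name, or return an empty
// string if the name fails the sanity check.
std::string make_user_term_prefix(const std::string &name) {
    if (!is_valid_user_prefix_name(name))
        return std::string();
    // No case conversion is applied, so the result is locale independent.
    return USER_PREFIX_MARKER + name + ":";
}
```

With these assumptions, a name such as `List` would map to the term prefix `XUList:`, while `a:b` would be refused.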


Re: Reply to content of "List-Post" header?

2019-03-03 Thread Ralph Seichter
* Gregor Zattler:

> my procmail scripts recognize mailing list headers and in the end
> there is a list of all mailing lists I ever got mails from [...]

Interesting method. I don't use procmail, but I wrote a small shell
script that scans my existing Maildir storage for List-Post headers and
extracts the addresses.

This works, in a fashion, but I wonder if/how it would be possible to
make Notmuch understand List-Post natively? Or is this something that
needs to be solved in Emacs' message composition?

-Ralph