[WIP] lib: regexp matching in 'subject' and 'from'

2017-01-20 Thread David Bremner
the idea is that you can run

% notmuch search subject:
% notmuch search from:'

or

% notmuch search subject:"your usual phrase search"
% notmuch search from:"usual phrase search"

The heuristic to decide how to interpret the query is based on a
regex, roughly [a-z -]+
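
As a rough illustration of that heuristic (not the code in the patch), the sketch below classifies a search value in plain shell. The helper name `looks_like_regex`, the anchoring of the pattern, and the lowercase-only character class are assumptions made for the example; the patch itself only says "roughly [a-z -]+".

```shell
#!/bin/sh
# Hypothetical helper, not taken from the patch: succeeds (exit status 0)
# when a search value should be handed to the regex matcher.
looks_like_regex() {
    # Assumption: a value made up only of lowercase letters, spaces and
    # hyphens is an ordinary phrase; anything else is treated as a regex.
    if printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z -]+$'; then
        return 1
    fi
    return 0
}

# Classify a few sample values.
for value in 'usual phrase search' '^\[SPAM\]' 'bremner@'; do
    if looks_like_regex "$value"; then
        printf 'regex:  %s\n' "$value"
    else
        printf 'phrase: %s\n' "$value"
    fi
done
```

A guess like this is convenient for typing, but any address containing "@" or "." falls on the regex side, which may be one way to picture the kind of surprise mentioned for notmuch-address.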

This should also work with bindings, since it extends the query parser.

This is trivial to extend for other value slots, but currently the only
value slots are date, message_id, from, subject, and last_mod. Date is
already searchable, and message_id is not obviously useful to regex
match.

This was originally written by Austin Clements, and ported to Xapian
field processors (from Austin's custom query parser) by yours truly.
---

It turns out to be not as hard as I thought to have the same field
interpreted as a regex search and as a regular xapian phrase search.
I haven't fixed the tests and docs yet because I'm not sure about the
best UI to trigger the regex search. Currently it just guesses based
on the string, but this has some surprising effects for
notmuch-address (hence the test breakage). Maybe from:/regex/ although
the quoting means this would look like from:"/regex/"
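
To make the explicit-delimiter idea concrete, here is a small shell sketch of how a value written as /regex/ could be told apart from a phrase. The helper name `regex_body` and the rule "starts and ends with a slash, with at least one character in between" are assumptions for illustration only; nothing here is implemented in the patch.

```shell
#!/bin/sh
# Hypothetical helper: prints the regex body when the value is written
# as /regex/ and returns 1 otherwise, so the caller can fall back to an
# ordinary phrase search.
regex_body() {
    case "$1" in
        /?*/)
            body=${1#/}      # drop the leading slash
            body=${body%/}   # drop the trailing slash
            printf '%s\n' "$body"
            ;;
        *)
            return 1
            ;;
    esac
}

# Show how a delimited value and a plain phrase are handled.
for value in '/^bremner/' 'usual phrase search'; do
    if re=$(regex_body "$value"); then
        printf 'regex search for:  %s\n' "$re"
    else
        printf 'phrase search for: %s\n' "$value"
    fi
done
```

With an explicit delimiter the decision no longer depends on which characters the value happens to contain, at the cost of the extra quoting noted above.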

 doc/man7/notmuch-search-terms.rst |  17 -
 lib/Makefile.local                |   1 +
 lib/database-private.h            |   2 +
 lib/database.cc                   |  29 +++-
 lib/regexp-fields.cc              | 142 ++
 lib/regexp-fields.h               |  81 ++
 test/T630-regexp-query.sh         |  82 ++
 7 files changed, 350 insertions(+), 4 deletions(-)
 create mode 100644 lib/regexp-fields.cc
 create mode 100644 lib/regexp-fields.h
 create mode 100755 test/T630-regexp-query.sh

diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index de93d733..8800039d 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -60,6 +60,8 @@ indicate user-supplied values):
 
 -  property:=
 
+- re_{subject,from}:
+
 The **from:** prefix is used to match the name or address of the sender
 of an email message.
 
@@ -146,6 +148,12 @@ The **property:** prefix searches for messages with a particular
 (and extensions) to add metadata to messages. A given key can be
 present on a given message with several different values.
 
+The **re_from:** and **re_subject** prefix can be used to restrict the
+results to those whose from/subject value matches the given regular
+expression (see **regex(7)**). Regular expression searches are only
+available if notmuch is built with **Xapian Field Processors** (see
+below).
+
 Operators
 -
 
@@ -220,13 +228,19 @@ Boolean and Probabilistic Prefixes
 --
 
 Xapian (and hence notmuch) prefixes are either **boolean**, supporting
-exact matches like "tag:inbox"  or **probabilistic**, supporting a more flexible **term** based searching. The prefixes currently supported by notmuch are as follows.
+exact matches like "tag:inbox" or **probabilistic**, supporting a more
+flexible **term** based searching. Certain **special** prefixes are
+processed by notmuch in a way not strictly fitting either of Xapian's
+built in styles. The prefixes currently supported by notmuch are as
+follows.
 
 
 Boolean
**tag:**, **id:**, **thread:**, **folder:**, **path:**, **property:**
 Probabilistic
**from:**, **to:**, **subject:**, **attachment:**, **mimetype:**
+Special
+   **query:**, **re:**
 
 Terms and phrases
 -
@@ -396,6 +410,7 @@ Currently the following features require field processor support:
 
 - non-range date queries, e.g. "date:today"
 - named queries e.g. "query:my_special_query"
+- regular expression searches, e.g. "re:subject:^\\[SPAM\\]"
 
 SEE ALSO
 
diff --git a/lib/Makefile.local b/lib/Makefile.local
index b77e5780..ff812b5f 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -52,6 +52,7 @@ libnotmuch_cxx_srcs = \
$(dir)/query.cc \
$(dir)/query-fp.cc  \
$(dir)/config.cc\
+   $(dir)/regexp-fields.cc \
$(dir)/thread.cc
 
 libnotmuch_modules := $(libnotmuch_c_srcs:.c=.o) $(libnotmuch_cxx_srcs:.cc=.o)
diff --git a/lib/database-private.h b/lib/database-private.h
index ccc1e9a1..9f5659a9 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -190,6 +190,8 @@ struct _notmuch_database {
 #if HAVE_XAPIAN_FIELD_PROCESSOR
 Xapian::FieldProcessor *date_field_processor;
 Xapian::FieldProcessor *query_field_processor;
+Xapian::FieldProcessor *from_field_processor;
+Xapian::FieldProcessor *subject_field_processor;
 #endif
 Xapian::ValueRangeProcessor *last_mod_range_processor;
 };
diff --git a/lib/database.cc b/lib/database.cc
index 2d19f20c..8a9ad251 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -21,6 +21,7 @@
 #include "database-private.h"
 #include "parse-time-vrp.h"
 #include "query-fp.h"
+#include "regexp-fields.h"
 #include 

Re: i do not have INBOX

2017-01-20 Thread Brian Sniffen
David Bremner  writes:

> David Belohrad  writes:
>
>>
>> my directory does not contain INBOX, as inside the Maildir folder is
>> directly /cur, /new, /tmp. How do I search in this particular one?
>
> Quoting notmuch-search-terms(7)
>
>    The exact syntax for maildir folders depends on your mail
>    configuration. For maildir++, folder:"" matches the inbox folder
>    (which is the root in maildir++), other folder names always start
>    with ".", and nested folders are separated by "."s, such as
>    folder:.classes.topology. For "file system" maildir, the inbox is
>    typically folder:INBOX and nested folders are separated by slashes,
>    such as folder:classes/topology.
>
>> The question is related to usage of 'afew' to move all mails which
>> have 'deleted' tag into Trash folder (which I have under .Trash)
>
> Hopefully the above is enough to help you formulate the right query.
> You probably also want the options --format=text0 and --output=files for
> notmuch search.

The next paragraph of that man page has an important warning for anyone
planning on re-arranging files based on the results of notmuch:

   Both path: and folder: will find a message if any copy of that
   message  is  in  the specific directory/folder.

It's important to filter the names (e.g., `notmuch ...|grep -Fzv
/.|xargs -0 -Ifoo mv foo $MAILDIR/.Trash/cur/`) before relying on them.
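
A slightly fuller sketch of that filtering step, in case it helps. It assumes GNU grep (for -z) and a maildir++ layout in which every subfolder name starts with "."; the helper name `drop_dot_folder_paths` is made up for the example, and the full pipeline is shown only as a comment because the right query depends on your setup.

```shell
#!/bin/sh
# Hypothetical helper: reads NUL-separated paths on stdin and keeps only
# those that do not contain "/.", i.e. drops copies that already live in
# a maildir++ subfolder such as .Trash.
# Caveat: this also drops everything if the mail root itself sits under
# a hidden directory (e.g. ~/.mail), so adjust the pattern in that case.
drop_dot_folder_paths() {
    grep -Fzv '/.'
}

# Demonstration on two made-up paths, printed one per line.
printf '%s\0' /mail/cur/msg1 /mail/.Trash/cur/msg2 |
    drop_dot_folder_paths | tr '\0' '\n'

# Possible use with notmuch (not run here; the query is an assumption):
#   notmuch search --output=files --format=text0 tag:deleted |
#       drop_dot_folder_paths |
#       xargs -0 -I{} mv {} "$MAILDIR/.Trash/cur/"
```

The -0 on xargs matters because --format=text0 and grep -z both use NUL as the separator, which keeps file names with unusual characters intact.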

-Brian
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


i do not have INBOX

2017-01-20 Thread David Belohrad
Dear all,

notmuch search folder:


my directory does not contain INBOX, as inside the Maildir folder is directly 
/cur, /new, /tmp. How do I search in this particular one? The question is 
related to usage of 'afew' to move all mails which have 'deleted' tag into 
Trash folder (which I have under .Trash)

many thanks
.d.