Re: Bug/Issue: References header doesn't wrap in emacs package

2016-06-06 Thread Sanjoy Mahajan
On 2015-10-02 21:21, David Bremner wrote:

>>> The problem is a References header that is too long/not wrapped.
>>>
>>
>> Hi Allan;
>>
>> Thanks for the report.  I can see how notmuch-reply would generate a
>> long references header in that situation. We rely on message-mode (part
>> of Gnus) to actually send the message, and it isn't clear to me yet if
>> message-mode (or some function it invokes) normally folds long headers.
>
> I have reported this as an emacs bug.
>
>   http://debbugs.gnu.org/cgi/bugreport.cgi?bug=21608

I'm not sure whether fixing it in emacs is right.  The command 'notmuch
reply' is itself (with the sexp or json formats) generating the too-long
References: header.  Shouldn't it generate an RFC-compliant message?

Or should the json/sexp formats remain agnostic about line length,
because wrapping doesn't make sense with key/value pairs?  In that case,
I agree that message-mode should fix any long lines.

By the way, the latest Exim release (4.87), at least as configured in
Debian, does limit lines to 998 characters, and that limit has been
causing me a few hiccups -- usually when I reply to a message sent by
gmail with a zillion references (perhaps it lists every message in the
thread).
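
For reference, RFC 5322 caps a line at 998 characters (and recommends 78),
and lets a header be folded by putting the line break in front of
whitespace. Below is a minimal sketch of such folding in C. The name
fold_header is hypothetical (it is not part of notmuch or message-mode),
and it writes a bare newline where the wire format would use CRLF.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "name: value" into `out`, folding at spaces
 * in `value` so that no output line is longer than `width` characters.
 * Continuation lines begin with one space.  A single token longer than
 * `width` is emitted unbroken on its own line.
 * Returns 0 on success, -1 if `out` is too small. */
int
fold_header (const char *name, const char *value, size_t width,
             char *out, size_t outsz)
{
    int n = snprintf (out, outsz, "%s:", name);
    if (n < 0 || (size_t) n >= outsz)
        return -1;
    size_t used = (size_t) n;   /* bytes written so far */
    size_t col = used;          /* length of the current output line */

    for (const char *p = value; *p != '\0';) {
        while (*p == ' ')       /* skip the spaces between tokens */
            p++;
        size_t len = strcspn (p, " ");
        if (len == 0)
            break;
        /* Fold if " token" would push the current line past `width`. */
        int fold = col + 1 + len > width;
        const char *sep = fold ? "\n " : " ";
        size_t seplen = strlen (sep);
        if (used + seplen + len + 1 > outsz)
            return -1;
        memcpy (out + used, sep, seplen);
        memcpy (out + used + seplen, p, len);
        used += seplen + len;
        out[used] = '\0';
        col = (fold ? 0 : col) + 1 + len;
        p += len;
    }
    return 0;
}
```

Called as fold_header ("References", ids, 78, buf, sizeof buf), this breaks
only between message-ids, so each id stays intact on one line.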

-Sanjoy
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] WIP: regexp matching in subjects

2016-06-06 Thread David Bremner
the idea is that you can run

% notmuch search 'subject:rx:'

or

% notmuch search subject:"your usual phrase search"

This should also work with bindings.
---

Here is Austin's "hack", crammed into the field processor framework.
I seem to have broken one of the existing subject search tests with my
recursive query parsing. I didn't have time to figure out why, yet.

 lib/Makefile.local |  2 ++
 lib/database-private.h |  1 +
 lib/database.cc|  5 +++
 lib/regexp-ps.cc   | 92 ++
 lib/regexp-ps.h| 37 
 lib/subject-fp.cc  | 41 ++
 lib/subject-fp.h   | 43 +++
 7 files changed, 221 insertions(+)
 create mode 100644 lib/regexp-ps.cc
 create mode 100644 lib/regexp-ps.h
 create mode 100644 lib/subject-fp.cc
 create mode 100644 lib/subject-fp.h

diff --git a/lib/Makefile.local b/lib/Makefile.local
index beb9635..0e7311f 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -51,6 +51,8 @@ libnotmuch_cxx_srcs = \
$(dir)/query.cc \
$(dir)/query-fp.cc  \
$(dir)/config.cc\
+   $(dir)/regexp-ps.cc \
+   $(dir)/subject-fp.cc\
$(dir)/thread.cc
 
 libnotmuch_modules := $(libnotmuch_c_srcs:.c=.o) $(libnotmuch_cxx_srcs:.cc=.o)
diff --git a/lib/database-private.h b/lib/database-private.h
index ca71a92..5de0b81 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -186,6 +186,7 @@ struct _notmuch_database {
 #if HAVE_XAPIAN_FIELD_PROCESSOR
 Xapian::FieldProcessor *date_field_processor;
 Xapian::FieldProcessor *query_field_processor;
+Xapian::FieldProcessor *subject_field_processor;
 #endif
 Xapian::ValueRangeProcessor *last_mod_range_processor;
 };
diff --git a/lib/database.cc b/lib/database.cc
index 86bf261..adfbb81 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -21,6 +21,7 @@
 #include "database-private.h"
 #include "parse-time-vrp.h"
 #include "query-fp.h"
+#include "subject-fp.h"
 #include "string-util.h"
 
 #include 
@@ -1008,6 +1009,8 @@ notmuch_database_open_verbose (const char *path,
    notmuch->query_parser->add_boolean_prefix("date", notmuch->date_field_processor);
    notmuch->query_field_processor = new QueryFieldProcessor (*notmuch->query_parser, notmuch);
    notmuch->query_parser->add_boolean_prefix("query", notmuch->query_field_processor);
+   notmuch->subject_field_processor = new SubjectFieldProcessor (*notmuch->query_parser, notmuch);
+   notmuch->query_parser->add_boolean_prefix("subject", notmuch->subject_field_processor);
 #endif
    notmuch->last_mod_range_processor = new Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_LAST_MOD, "lastmod:");
 
@@ -1027,6 +1030,8 @@ notmuch_database_open_verbose (const char *path,
 
for (i = 0; i < ARRAY_SIZE (PROBABILISTIC_PREFIX); i++) {
prefix_t *prefix = &PROBABILISTIC_PREFIX[i];
+   if (strcmp (prefix->name, "subject") == 0)
+   continue;
notmuch->query_parser->add_prefix (prefix->name, prefix->prefix);
}
 } catch (const Xapian::Error &error) {
diff --git a/lib/regexp-ps.cc b/lib/regexp-ps.cc
new file mode 100644
index 000..540c7d6
--- /dev/null
+++ b/lib/regexp-ps.cc
@@ -0,0 +1,92 @@
+/* query-fp.cc - "query:" field processor glue
+ *
+ * This file is part of notmuch.
+ *
+ * Copyright © 2016 David Bremner
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see https://www.gnu.org/licenses/ .
+ *
+ * Author: Austin Clements 
+ *David Bremner 
+ */
+
+#include "regexp-ps.h"
+
+RegexpPostingSource::RegexpPostingSource (Xapian::valueno slot, const std::string &regexp)
+: slot_ (slot)
+{
+int r = regcomp (&regexp_, regexp.c_str (), REG_EXTENDED | REG_NOSUB);
+
+if (r != 0)
+   /* XXX Report a query syntax error using regerror */
+   throw "regcomp failed";
+}
+
+RegexpPostingSource::~RegexpPostingSource ()
+{
+regfree (&regexp_);
+}
+
+void
+RegexpPostingSource::init (const Xapian::Database &db)
+{
+db_ = db;
+it_ = db_.valuestream_begin (slot_);
+end_ = db.valuestream_end (slot_);
+started_ = false;
+}
+
+Xapian::doccount
+RegexpPostingSource::get_termfreq_min () const
+{
+return 0;
+}
+
+Xapian::doccount
+RegexpPostingSource::get_termfreq_est () const
+{
+return get_termfreq_max () / 2;
+}
+
+Xapian:

Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread Austin Clements
Quoth Gaute Hope on Jun 06 at  8:08 pm:
> Austin Clements writes on June 6, 2016 21:20:
> >
> >The experiment was specifically for regexp matching subject, but it should
> >work for any header we store a literal copy of in the database.
> 
> Does it work for terms in the body of the message?

No. It's not impossible that it could be made to work, but it might be
slow and unintuitive. It would have to iterate over all of the terms
in the database and see which ones match the regexp. These are
available, but I don't know how much time it takes to iterate over all
of them. It might be okay. It might not.

It could also expand to a very large query if the regexp matches many
terms, akin to how searching for "a*" can be quite expensive.

And it might not match what you expect. It could only match individual
terms, so a regexp containing any punctuation (including but not
limited to a space) simply wouldn't match anything.
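
To make the last point concrete, here is a small sketch in C using POSIX
regcomp/regexec. It assumes the body has already been split into individual
terms; the function name and the idea of passing terms as a plain array are
illustrative only, not how Xapian exposes its term list.

```c
#include <regex.h>
#include <stddef.h>

/* Count how many entries of `terms` match the POSIX extended regexp
 * `pattern`.  Each term is tested on its own, so a pattern can never
 * match across two terms.  Returns -1 if the pattern does not compile. */
int
count_matching_terms (const char *pattern, const char *const *terms, size_t n)
{
    regex_t re;
    if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int hits = 0;
    for (size_t i = 0; i < n; i++)
        if (regexec (&re, terms[i], 0, NULL, 0) == 0)
            hits++;

    regfree (&re);
    return hits;
}
```

With the terms "reanalysis", "of", "climate" and "data", the pattern
"analysis$" matches one term, while "reanalysis of" matches none, because no
single term contains a space.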


Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread Austin Clements
On Mon, Jun 6, 2016 at 1:29 PM, David Bremner wrote:

> Sebastian Fischmeister writes:
>
> >
> > I ran into this problem before as well. Storage is cheap. Notmuch could
> > index all emails with reversed text to get around some of this
> > problem. It doesn't solve the problem of *analysis*, but it's still an
> > improvement.
>
> It would probably be more useful to have brute force regexp searches on
> headers.  Austin did some experiments that sounded promising, where you
> basically postprocess the result of a xapian query with a regexp. OTOH,
> I don't know what kept him from proposing this for mainline. If it was
> just parser issues, those are probably more or less solved now, at least
> for people using xapian 1.3+
>

The experiment was specifically for regexp matching subject, but it should
work for any header we store a literal copy of in the database. The code is
here, though in its current form it builds on my custom query parser:
https://github.com/aclements/notmuch/commit/ce41b29aba4d9b84e2f1eb6ed8df67065196c960.
Based on my understanding of Xapian 1.3+ field processors, these days it
should be quite easy to hook the PostingSource in that commit into the
Xapian QueryParser.
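
As a rough illustration of the "postprocess with a regexp" idea (not the code
in that commit), here is a sketch in C that filters a candidate list by the
literal subject stored with each message. struct message and
filter_by_subject are made-up names for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical record: a document id plus the literal Subject value. */
struct message {
    unsigned docid;
    const char *subject;
};

/* Keep only the messages whose subject matches `pattern`, compacting
 * `msgs` in place.  Returns the number kept, or -1 if the pattern
 * does not compile. */
int
filter_by_subject (struct message *msgs, int n, const char *pattern)
{
    regex_t re;
    if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int kept = 0;
    for (int i = 0; i < n; i++)
        if (regexec (&re, msgs[i].subject, 0, NULL, 0) == 0)
            msgs[kept++] = msgs[i];

    regfree (&re);
    return kept;
}
```

Because the match runs against the whole stored header rather than indexed
terms, patterns containing spaces and punctuation behave as expected.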


Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread Gaute Hope

Austin Clements writes on June 6, 2016 21:20:
> The experiment was specifically for regexp matching subject, but it should
> work for any header we store a literal copy of in the database.

Does it work for terms in the body of the message?



Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread David Bremner
Sebastian Fischmeister writes:

>
> I ran into this problem before as well. Storage is cheap. Notmuch could
> index all emails with reversed text to get around some of this
> problem. It doesn't solve the problem of *analysis*, but it's still an
> improvement.

It would probably be more useful to have brute force regexp searches on
headers.  Austin did some experiments that sounded promising, where you
basically postprocess the result of a xapian query with a regexp. OTOH,
I don't know what kept him from proposing this for mainline. If it was
just parser issues, those are probably more or less solved now, at least
for people using xapian 1.3+

d


Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread Sebastian Fischmeister
>> It is not possible to use wildcards at the beginning of a term.
>
> after the current explanation to emphasize this limitation (possibly
> blaming Xapian to avoid futile requests).
>
> I think it is something many would expect (and want). The current
> description feels more like an example, and it is easy to make the
> assumption that it works for prefixing the terms as well - although,
> technically, nothing is promised in the original docs.

I ran into this problem before as well. Storage is cheap. Notmuch could
index all emails with reversed text to get around some of this
problem. It doesn't solve the problem of *analysis*, but it's still an
improvement.
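
A sketch of what I mean, in C: if each term is also stored reversed, a
leading-wildcard search such as '*analysis' becomes a trailing-wildcard
(prefix) search for 'sisylana*' on the reversed terms. The function names
are made up, and the reversal is byte-wise, so it assumes ASCII terms.

```c
#include <string.h>

#define MAX_TERM 256

/* Reverse `s` into `out`, which must hold strlen (s) + 1 bytes. */
void
reverse_term (const char *s, char *out)
{
    size_t n = strlen (s);
    for (size_t i = 0; i < n; i++)
        out[i] = s[n - 1 - i];
    out[n] = '\0';
}

/* Return 1 if `term` ends with `suffix`, decided the way a reversed
 * index would: as a prefix match between the two reversed strings.
 * Inputs of MAX_TERM bytes or more are rejected (returns 0). */
int
ends_with_via_reversed (const char *term, const char *suffix)
{
    char rterm[MAX_TERM], rsuffix[MAX_TERM];
    if (strlen (term) >= MAX_TERM || strlen (suffix) >= MAX_TERM)
        return 0;
    reverse_term (term, rterm);
    reverse_term (suffix, rsuffix);
    return strncmp (rterm, rsuffix, strlen (rsuffix)) == 0;
}
```

ends_with_via_reversed ("reanalysis", "analysis") returns 1. A pattern with
wildcards on both sides has no fixed prefix in either direction, which is
why this does not help with *analysis*.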

  Sebastian


[PATCH] lib: fix definition of LIBNOTMUCH_CHECK_VERSION

2016-06-06 Thread David Bremner
Fix bug reported in id:20160606124522.g2y2eazhhrwjs...@flatcap.org

Although the C99 standard 6.10 is a little non-obvious on this point,
the docs for e.g. gcc are unambiguous. And indeed in practice with the
extra space, this code fails

int main(int argc, char **argv){
  printf("%d\n",foo(1));
}
---
 lib/notmuch.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/notmuch.h b/lib/notmuch.h
index 29713ae..d4a97cb 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -93,7 +93,7 @@ NOTMUCH_BEGIN_DECLS
  * #endif
  * @endcode
  */
-#define LIBNOTMUCH_CHECK_VERSION (major, minor, micro) \
+#define LIBNOTMUCH_CHECK_VERSION(major, minor, micro)  \
 (LIBNOTMUCH_MAJOR_VERSION > (major) || \
  (LIBNOTMUCH_MAJOR_VERSION == (major) && LIBNOTMUCH_MINOR_VERSION > (minor)) || \
  (LIBNOTMUCH_MAJOR_VERSION == (major) && LIBNOTMUCH_MINOR_VERSION == (minor) && \
-- 
2.1.4



Re: [PATCH] lib: fix definition of LIBNOTMUCH_CHECK_VERSION

2016-06-06 Thread David Bremner
David Bremner writes:

> Fix bug reported in id:20160606124522.g2y2eazhhrwjs...@flatcap.org
>
> Although the C99 standard 6.10 is a little non-obvious on this point,
> the docs for e.g. gcc are unambiguous. And indeed in practice with the
> extra space, this code fails
>
> int main(int argc, char **argv){
>   printf("%d\n",foo(1));
> }
> ---

Of course git removed the #define as a comment. sigh.

d


LIBNOTMUCH_CHECK_VERSION macro broken

2016-06-06 Thread Richard Russon
In notmuch.h the macro definition begins:

#define LIBNOTMUCH_CHECK_VERSION (major, minor, micro)

There shouldn't be a space before the (
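
To spell out why the space matters: the preprocessor only treats a macro as
function-like when the ( follows the name immediately. With a space, the
name becomes an object-like macro whose replacement text starts with the
parenthesised parameter list. A small sketch in C, using made-up DEMO_*
names rather than the real libnotmuch macros:

```c
#define DEMO_MAJOR 4
#define DEMO_MINOR 3

/* Function-like macro: no space between the name and '('. */
#define DEMO_CHECK_VERSION(major, minor) \
    (DEMO_MAJOR > (major) || \
     (DEMO_MAJOR == (major) && DEMO_MINOR >= (minor)))

/* Had this been written as
 *
 *   #define DEMO_CHECK_VERSION (major, minor) ...
 *
 * then DEMO_CHECK_VERSION(4, 2) would expand to
 * "(major, minor) ... (4, 2)" and the check below would not compile. */

#if DEMO_CHECK_VERSION(4, 2)
static const char *demo_api = "new";
#else
static const char *demo_api = "old";
#endif

/* Report which branch the preprocessor selected. */
const char *
demo_selected_api (void)
{
    return demo_api;
}
```

With the correct definition the #if selects the "new" branch, since 4.3 is
at least 4.2.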

This is from the current version in git.

Cheers,
Rich / FlatCap



Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread Gaute Hope

David Bremner writes on June 6, 2016 14:42:
> Gaute Hope writes:
>
>> Hi,
>>
>> I have an email with the word 'reanalysis' in the subject line and the
>> email body. However, when I try to search for '*analysis' or 'analysis'
>> I do not get any matches, should not '*analysis' at least match?
>
> We talked about this on IRC (the short answer is no), but is there some
> improvement you could suggest to the "Wildcards" section in
> notmuch-search-terms(7) ?

Yes, thanks, not very important, but maybe add the sentence:

> It is not possible to use wildcards at the beginning of a term.


after the current explanation to emphasize this limitation (possibly
blaming Xapian to avoid futile requests).

I think it is something many would expect (and want). The current
description feels more like an example, and it is easy to make the
assumption that it works for prefixing the terms as well - although,
technically, nothing is promised in the original docs.
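
For anyone wondering why only the trailing form is cheap, here is a sketch
in C under the assumption that the index keeps its terms in sorted order
(the array and function name are illustrative, not Xapian's API). All terms
sharing a prefix sit next to each other, so 'analy*' is one binary search
plus a short scan; '*analysis' has no fixed prefix to search for.

```c
#include <string.h>
#include <stddef.h>

/* Count the entries of a lexicographically sorted term list that start
 * with `prefix`.  A binary search finds the first term >= prefix; every
 * term carrying the prefix then follows contiguously. */
size_t
count_prefix_matches (const char *const *sorted, size_t n, const char *prefix)
{
    size_t plen = strlen (prefix);
    size_t lo = 0, hi = n;

    while (lo < hi) {           /* lower bound of `prefix` */
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp (sorted[mid], prefix) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }

    size_t count = 0;
    while (lo + count < n &&
           strncmp (sorted[lo + count], prefix, plen) == 0)
        count++;
    return count;
}
```

Given the sorted terms "analyse", "analysis", "analyst", "data" and
"reanalysis", the prefix "analy" finds three terms, while "analysis" finds
only the exact term and never reaches "reanalysis".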

-gaute




Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread David Bremner
Gaute Hope writes:

> Hi,
>
> I have an email with the word 'reanalysis' in the subject line and the
> email body. However, when I try to search for '*analysis' or 'analysis'
> I do not get any matches, should not '*analysis' at least match?
>

We talked about this on IRC (the short answer is no), but is there some
improvement you could suggest to the "Wildcards" section in
notmuch-search-terms(7) ?

d