[PATCH] contrib: pick: close message pane when quitting from show in the message pane

2012-12-25 Thread David Bremner
Mark Walters  writes:

> We add a hook to the show buffer in the message window to close the
> message window when that buffer quits.  It checks that the
> message-window is still displaying the show-message buffer and then
> closes it.

pushed.

d


[PATCH] emacs: tweak error buffer handling

2012-12-25 Thread Tomi Ollila
On Sat, Dec 22 2012, Mark Walters  wrote:

> view-mode-enter changed between emacs 23 and emacs 24: the current
> code makes the error buffer disappear in emacs 24 on quitting it (ie
> pressing q) but this just kills the buffer without closing the split
> window in emacs 23.
>
> This patch makes the error buffer window disappear in emacs 23
> too. Since the view-mode-enter function changed we have to test for
> version and do the correct thing in each case.
> ---
>
> This seems to work but I have only tested on 23.4 and 24.2

I run emacs 23.1.1 to get the documentation of view-mode-enter
there. So, this patch instructs to delete WINDOW when exiting
view mode...

Documentation of pop-to-buffer says:

"Select buffer BUFFER-OR-NAME in some window, preferably a different one."

What if pop-up-windows's value is nil -- the content of current window
is replaced with this view stuff -- and when exiting view mode, the
window will be deleted ? What happens with emacs 24 in this case ?

Tomi

> Best wishes
>
> Mark
>
>
>
>  emacs/notmuch-lib.el |8 +---
>  1 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
> index 77a591d..0407f8a 100644
> --- a/emacs/notmuch-lib.el
> +++ b/emacs/notmuch-lib.el
> @@ -324,15 +324,17 @@ the user dismisses it."
>  
>(let ((buf (get-buffer-create "*Notmuch errors*")))
>  (with-current-buffer buf
> -  (view-mode-enter nil #'kill-buffer)
> +  (pop-to-buffer buf)
> +  (view-mode-enter (when (< emacs-major-version 24)
> +(cons (selected-window) (cons nil t)))
> +#'kill-buffer)
>(let ((inhibit-read-only t))
>   (goto-char (point-max))
>   (unless (bobp)
> (insert "\n"))
>   (insert msg)
>   (unless (bolp)
> -   (insert "\n"
> -(pop-to-buffer buf)))
> +   (insert "\n"))
>  
>  (defun notmuch-check-async-exit-status (proc msg)
>"If PROC exited abnormally, pop up an error buffer and signal an error.
> -- 
> 1.7.9.1
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 5/5] man: Update notmuch-dump(1) and notmuch-restore(1)

2012-12-25 Thread Austin Clements
Describe the new batch-tag format.  For notmuch-restore, rather than
half-heartedly duplicating the description, we now cite notmuch-dump.
---
 man/man1/notmuch-dump.1|   11 +++
 man/man1/notmuch-restore.1 |6 ++
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/man/man1/notmuch-dump.1 b/man/man1/notmuch-dump.1
index 770b00f..7bd6def 100644
--- a/man/man1/notmuch-dump.1
+++ b/man/man1/notmuch-dump.1
@@ -64,13 +64,16 @@ and tags containing whitespace or non-\fBascii\fR(7) 
characters.
 Each line has the form

 .RS 4
-.RI "+<" "encoded-tag" "> " "" "+<" "encoded-tag" "> ... -- " "" " id:<" 
encoded-message-id >
+.RI "+<" "encoded-tag" "> " "" "+<" "encoded-tag" "> ... -- " "" " id:<" 
quoted-message-id >

-where encoded means that every byte not matching the regex
+Tags are hex-encoded by replacing every byte not matching the regex
 .B [A-Za-z0-9@=.,_+-]
-is replace by
+with
 .B %nn
-where nn is the two digit hex encoding.
+where nn is the two digit hex encoding.  The message ID is a valid Xapian
+query, quoted using Xapian boolean term quoting rules: if the ID contains
+whitespace or a close paren or starts with a double quote, it must be
+enclosed in double quotes and double quotes inside the ID must be doubled.
 The astute reader will notice this is a special case of the batch input
 format for \fBnotmuch-tag\fR(1); note that the single message-id query is
 mandatory for \fBnotmuch-restore\fR(1).
diff --git a/man/man1/notmuch-restore.1 b/man/man1/notmuch-restore.1
index 6bba628..78fef52 100644
--- a/man/man1/notmuch-restore.1
+++ b/man/man1/notmuch-restore.1
@@ -57,10 +57,8 @@ sup calls them).
 The
 .B batch-tag
 dump format is intended to more robust against malformed message-ids
-and tags containing whitespace or non-\fBascii\fR(7) characters.  This
-format hex-escapes all characters those outside of a small character
-set, intended to be suitable for e.g. pathnames in most UNIX-like
-systems.
+and tags containing whitespace or non-\fBascii\fR(7) characters.  See
+\fBnotmuch-dump\fR(1) for details on this format.

 .B "notmuch restore"
 updates the maildir flags according to tag changes if the
-- 
1.7.10.4



[PATCH v2 4/5] dump/restore: Use Xapian queries for batch-tag format

2012-12-25 Thread Austin Clements
This switches the new batch-tag format away from using a home-grown
hex-encoding scheme for message IDs in the dump to simply using Xapian
queries with Xapian quoting syntax.

This has a variety of advantages beyond presenting a cleaner and more
consistent interface.  Foremost is that it will dramatically simplify
the quoting for batch tagging, which shares the same input format.
While the hex-encoding is no better or worse for the simple ID queries
used by dump/restore, it becomes onerous for general-purpose queries
used in batch tagging.  It also better handles strange cases like
"id:foo and bar", since this is no longer syntactically valid.
---
 notmuch-dump.c|9 +
 notmuch-restore.c |   22 ++
 tag-util.c|6 --
 test/dump-restore |   14 --
 4 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index 29d79da..bf01a39 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -20,6 +20,7 @@

 #include "notmuch-client.h"
 #include "dump-restore-private.h"
+#include "string-util.h"

 int
 notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
@@ -141,13 +142,13 @@ notmuch_dump_command (unused (void *ctx), int argc, char 
*argv[])
fprintf (stderr, "Error: cannot dump message id containing line 
break: %s\n", message_id);
return 1;
}
-   if (hex_encode (notmuch, message_id,
-   , _size) != HEX_SUCCESS) {
-   fprintf (stderr, "Error: failed to hex-encode msg-id %s\n",
+   if (make_boolean_term (notmuch, "id", message_id,
+  , _size)) {
+   fprintf (stderr, "Error: failed to quote message id %s\n",
 message_id);
return 1;
}
-   fprintf (output, " -- id:%s\n", buffer);
+   fprintf (output, " -- %s\n", buffer);
}

notmuch_message_destroy (message);
diff --git a/notmuch-restore.c b/notmuch-restore.c
index 9ed9b51..77a4c27 100644
--- a/notmuch-restore.c
+++ b/notmuch-restore.c
@@ -207,7 +207,7 @@ notmuch_restore_command (unused (void *ctx), int argc, char 
*argv[])
INTERNAL_ERROR ("compile time constant regex failed.");

 do {
-   char *query_string;
+   char *query_string, *prefix, *term;

if (line_ctx != NULL)
talloc_free (line_ctx);
@@ -220,19 +220,17 @@ notmuch_restore_command (unused (void *ctx), int argc, 
char *argv[])
  _string, tag_ops);

if (ret == 0) {
-   if (strncmp ("id:", query_string, 3) != 0) {
-   fprintf (stderr, "Warning: unsupported query: %s\n", 
query_string);
+   ret = parse_boolean_term (line_ctx, query_string,
+ , );
+   if (ret) {
+   fprintf (stderr, "Warning: cannot parse query: %s\n",
+query_string);
+   continue;
+   } else if (strcmp ("id", prefix) != 0) {
+   fprintf (stderr, "Warning: not an id query: %s\n", 
query_string);
continue;
}
-   /* delete id: from front of string; tag_message
-* expects a raw message-id.
-*
-* XXX: Note that query string id:foo and bar will be
-* interpreted as a message id "foo and bar". This
-* should eventually be fixed to give a better error
-* message.
-*/
-   query_string = query_string + 3;
+   query_string = term;
}
}

diff --git a/tag-util.c b/tag-util.c
index 705b7ba..e4e5dda 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -124,12 +124,6 @@ parse_tag_line (void *ctx, char *line,
 }

 /* tok now points to the query string */
-if (hex_decode_inplace (tok) != HEX_SUCCESS) {
-   ret = line_error (TAG_PARSE_INVALID, line_for_error,
- "hex decoding of query %s failed", tok);
-   goto DONE;
-}
-
 *query_string = tok;

   DONE:
diff --git a/test/dump-restore b/test/dump-restore
index 6a989b6..f9ae5b3 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -195,23 +195,25 @@ a

 # the previous line was blank; also no yelling please
 +%zz -- id:whatever
-+e +f id:%yy
++e +f id:"
++e +f tag:abc
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.GA7669 at griffis1.net
-# highlight the sketchy id parsing; this should be last
-+g -- id:foo and bar
+# valid id, but warning about missing message
++e id:missing_message_id
 EOF

 cat < EXPECTED
-Warning: unsupported query: a
+Warning: cannot parse query: a
 Warning: no query string [+0]
 Warning: no query string [+a +b]
 Warning: missing query string [+a +b ]
 Warning: no 

[PATCH v2 3/5] dump: Disallow \n in message IDs

2012-12-25 Thread Austin Clements
When we switch to using regular Xapian queries in the dump format, \n
will cause problems, so we disallow it.  Specially, while Xapian can
quote and parse queries containing \n without difficultly, quoted
queries containing \n still span multiple lines, which breaks the
line-orientedness of the dump format.  Strictly speaking, we could
still round-trip these, but it would significantly complicate restore
as well as scripts that deal with tag dumps.  This complexity would
come at absolutely no benefit: because of the RFC 2822 unfolding
rules, no amount of standards negligence can produce a message with a
message ID containing a line break (not even Outlook can do it!).

Hence, we simply disallow it.
---
 notmuch-dump.c   |9 +
 test/random-corpus.c |4 +++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index d2dad40..29d79da 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -132,6 +132,15 @@ notmuch_dump_command (unused (void *ctx), int argc, char 
*argv[])
if (output_format == DUMP_FORMAT_SUP) {
fputs (")\n", output);
} else {
+   if (strchr (message_id, '\n')) {
+   /* This will produce a line break in the output, which
+* would be difficult to handle in tools.  However,
+* it's also impossible to produce an email containing
+* a line break in a message ID because of unfolding,
+* so we can safely disallow it. */
+   fprintf (stderr, "Error: cannot dump message id containing line 
break: %s\n", message_id);
+   return 1;
+   }
if (hex_encode (notmuch, message_id,
, _size) != HEX_SUCCESS) {
fprintf (stderr, "Error: failed to hex-encode msg-id %s\n",
diff --git a/test/random-corpus.c b/test/random-corpus.c
index f354d4b..8b7748e 100644
--- a/test/random-corpus.c
+++ b/test/random-corpus.c
@@ -96,7 +96,9 @@ random_utf8_string (void *ctx, size_t char_count)
buf = talloc_realloc (ctx, buf, gchar, buf_size);
}

-   randomchar = random_unichar ();
+   do {
+   randomchar = random_unichar ();
+   } while (randomchar == '\n');

written = g_unichar_to_utf8 (randomchar, buf + offset);

-- 
1.7.10.4



[PATCH v2 2/5] util: Function to parse boolean term queries

2012-12-25 Thread Austin Clements
This parses the subset of Xapian's boolean term quoting rules that are
used by make_boolean_term.  This is provided as a generic string
utility, but will be used shortly in notmuch restore to parse and
optimize for ID queries.
---
 util/string-util.c |   51 +++
 util/string-util.h |   11 +++
 2 files changed, 62 insertions(+)

diff --git a/util/string-util.c b/util/string-util.c
index e4bea21..db01b4b 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -96,3 +96,54 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,

 return 0;
 }
+
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out)
+{
+*prefix_out = *term_out = NULL;
+
+/* Parse prefix */
+const char *pos = strchr (str, ':');
+if (! pos)
+   goto FAIL;
+*prefix_out = talloc_strndup (ctx, str, pos - str);
+++pos;
+
+/* Implement de-quoting compatible with make_boolean_term. */
+if (*pos == '"') {
+   char *out = talloc_strdup (ctx, pos + 1);
+   int closed = 0;
+   /* Find the closing quote and un-double doubled internal
+* quotes. */
+   for (pos = *term_out = out; *pos; ) {
+   if (*pos == '"') {
+   ++pos;
+   if (*pos != '"') {
+   /* Found the closing quote. */
+   closed = 1;
+   break;
+   }
+   }
+   *out++ = *pos++;
+   }
+   /* Did the term terminate without a closing quote or is there
+* trailing text after the closing quote? */
+   if (!closed || *pos)
+   goto FAIL;
+   *out = '\0';
+} else {
+   *term_out = talloc_strdup (ctx, pos);
+   /* Check for text after the boolean term. */
+   while (*pos > ' ' && *pos != ')')
+   ++pos;
+   if (*pos)
+   goto FAIL;
+}
+return 0;
+
+ FAIL:
+talloc_free (*prefix_out);
+talloc_free (*term_out);
+return 1;
+}
diff --git a/util/string-util.h b/util/string-util.h
index 7475e2c..aff2d65 100644
--- a/util/string-util.h
+++ b/util/string-util.h
@@ -28,4 +28,15 @@ char *strtok_len (char *s, const char *delim, size_t *len);
 int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
   char **buf, size_t *len);

+/* Parse a boolean term query produced by make_boolean_term, returning
+ * the prefix in *prefix_out and the term in *term_out.  *prefix_out
+ * and *term_out will be talloc'd with context ctx.
+ *
+ * Return: 0 on success, non-zero on parse error (including trailing
+ * data in str).
+ */
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out);
+
 #endif
-- 
1.7.10.4



[PATCH v2 1/5] util: Factor out boolean term quoting routine

2012-12-25 Thread Austin Clements
From: Austin Clements 

This is now a generic boolean term quoting function.  It performs
minimal quoting to produce user-friendly queries.

This could live in tag-util as well, but it is really nothing specific
to tags (although the conventions are specific to Xapian).

The API is changed from "caller-allocates" to "readline-like".  The
scan for max tag length is pushed down into the quoting routine.
Furthermore, this now combines the term prefix with the quoted term;
arguably this is just as easy to do in the caller, but this will
nicely parallel the boolean term parsing function to be introduced
shortly.

This is an amalgamation of code written by David Bremner and myself.
---
 notmuch-tag.c  |   48 ---
 util/string-util.c |   64 
 util/string-util.h |9 
 3 files changed, 87 insertions(+), 34 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 88d559b..fc9d43a 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */

 #include "notmuch-client.h"
+#include "string-util.h"

 static volatile sig_atomic_t interrupted;

@@ -35,25 +36,6 @@ handle_sigint (unused (int sig))
 interrupted = 1;
 }

-static char *
-_escape_tag (char *buf, const char *tag)
-{
-const char *in = tag;
-char *out = buf;
-
-/* Boolean terms surrounded by double quotes can contain any
- * character.  Double quotes are quoted by doubling them. */
-*out++ = '"';
-while (*in) {
-   if (*in == '"')
-   *out++ = '"';
-   *out++ = *in++;
-}
-*out++ = '"';
-*out = 0;
-return buf;
-}
-
 typedef struct {
 const char *tag;
 notmuch_bool_t remove;
@@ -71,25 +53,16 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
  * parenthesize and the exclusion part of the query must not use
  * the '-' operator (though the NOT operator is fine). */

-char *escaped, *query_string;
+char *escaped = NULL;
+size_t escaped_len = 0;
+char *query_string;
 const char *join = "";
-int i;
-unsigned int max_tag_len = 0;
+size_t i;

 /* Don't optimize if there are no tag changes. */
 if (tag_ops[0].tag == NULL)
return talloc_strdup (ctx, orig_query_string);

-/* Allocate a buffer for escaping tags.  This is large enough to
- * hold a fully escaped tag with every character doubled plus
- * enclosing quotes and a NUL. */
-for (i = 0; tag_ops[i].tag; i++)
-   if (strlen (tag_ops[i].tag) > max_tag_len)
-   max_tag_len = strlen (tag_ops[i].tag);
-escaped = talloc_array (ctx, char, max_tag_len * 2 + 3);
-if (! escaped)
-   return NULL;
-
 /* Build the new query string */
 if (strcmp (orig_query_string, "*") == 0)
query_string = talloc_strdup (ctx, "(");
@@ -97,10 +70,17 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
query_string = talloc_asprintf (ctx, "( %s ) and (", orig_query_string);

 for (i = 0; tag_ops[i].tag && query_string; i++) {
+   /* XXX in case of OOM, query_string will be deallocated when
+* ctx is, which might be at shutdown */
+   if (make_boolean_term (ctx,
+  "tag", tag_ops[i].tag,
+  , _len))
+   return NULL;
+
query_string = talloc_asprintf_append_buffer (
-   query_string, "%s%stag:%s", join,
+   query_string, "%s%s%s", join,
tag_ops[i].remove ? "" : "not ",
-   _escape_tag (escaped, tag_ops[i].tag));
+   escaped);
join = " or ";
 }

diff --git a/util/string-util.c b/util/string-util.c
index 44f8cd3..e4bea21 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -20,6 +20,7 @@


 #include "string-util.h"
+#include "talloc.h"

 char *
 strtok_len (char *s, const char *delim, size_t *len)
@@ -32,3 +33,66 @@ strtok_len (char *s, const char *delim, size_t *len)

 return *len ? s : NULL;
 }
+
+int
+make_boolean_term (void *ctx, const char *prefix, const char *term,
+  char **buf, size_t *len)
+{
+const char *in;
+char *out;
+size_t needed = 3;
+int need_quoting = 0;
+
+/* Do we need quoting?  To be paranoid, we quote anything
+ * containing a quote, even though it only matters at the
+ * beginning, and anything containing non-ASCII text. */
+for (in = term; *in && !need_quoting; in++)
+   if (*in <= ' ' || *in == ')' || *in == '"' || (unsigned char)*in > 127)
+   need_quoting = 1;
+
+if (need_quoting)
+   for (in = term; *in; in++)
+   needed += (*in == '"') ? 2 : 1;
+else
+   needed = strlen (term) + 1;
+
+/* Reserve space for the prefix */
+if (prefix)
+   needed += strlen (prefix) + 1;
+
+if ((*buf == NULL) || (needed > *len)) {
+   *len = 2 * needed;
+   *buf = talloc_realloc (ctx, *buf, char, *len);
+}
+
+if (! *buf)
+   

[PATCH v2 0/5] Use Xapian query syntax for batch-tag dump/restore

2012-12-25 Thread Austin Clements
This obsoletes

  id:1356415076-5692-1-git-send-email-amdragon at mit.edu

In addition to incorporating all of David's suggestions, this reworks
the boolean term parsing so it only handles the subset of quoting
syntax used by make_boolean_term (which also happens to be all that we
described in the man page for the format).  The diff from v1 is below.

diff --git a/man/man1/notmuch-restore.1 b/man/man1/notmuch-restore.1
index 6bba628..78fef52 100644
--- a/man/man1/notmuch-restore.1
+++ b/man/man1/notmuch-restore.1
@@ -57,10 +57,8 @@ sup calls them).
 The
 .B batch-tag
 dump format is intended to more robust against malformed message-ids
-and tags containing whitespace or non-\fBascii\fR(7) characters.  This
-format hex-escapes all characters those outside of a small character
-set, intended to be suitable for e.g. pathnames in most UNIX-like
-systems.
+and tags containing whitespace or non-\fBascii\fR(7) characters.  See
+\fBnotmuch-dump\fR(1) for details on this format.

 .B "notmuch restore"
 updates the maildir flags according to tag changes if the
diff --git a/test/dump-restore b/test/dump-restore
index aecc393..f9ae5b3 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -200,6 +200,8 @@ a
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.GA7669 at griffis1.net
+# valid id, but warning about missing message
++e id:missing_message_id
 EOF

 cat < EXPECTED
@@ -211,6 +213,7 @@ Warning: no query string after -- [+c +d --]
 Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]
 Warning: cannot parse query: id:"
 Warning: not an id query: tag:abc
+Warning: cannot apply tags to missing message: missing_message_id
 EOF

 test_expect_equal_file EXPECTED OUTPUT
diff --git a/test/random-corpus.c b/test/random-corpus.c
index d0e3e8f..8b7748e 100644
--- a/test/random-corpus.c
+++ b/test/random-corpus.c
@@ -96,9 +96,9 @@ random_utf8_string (void *ctx, size_t char_count)
buf = talloc_realloc (ctx, buf, gchar, buf_size);
}

-   randomchar = random_unichar ();
-   if (randomchar == '\n')
-   randomchar = 'x';
+   do {
+   randomchar = random_unichar ();
+   } while (randomchar == '\n');

written = g_unichar_to_utf8 (randomchar, buf + offset);

diff --git a/util/string-util.c b/util/string-util.c
index eaa6c99..db01b4b 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -43,9 +43,11 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,
 size_t needed = 3;
 int need_quoting = 0;

-/* Do we need quoting? */
+/* Do we need quoting?  To be paranoid, we quote anything
+ * containing a quote, even though it only matters at the
+ * beginning, and anything containing non-ASCII text. */
 for (in = term; *in && !need_quoting; in++)
-   if (*in <= ' ' || *in == ')' || *in == '"')
+   if (*in <= ' ' || *in == ')' || *in == '"' || (unsigned char)*in > 127)
need_quoting = 1;

 if (need_quoting)
@@ -95,21 +97,6 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,
 return 0;
 }

-static int
-consume_double_quote (const char **str)
-{
-if (**str == '"') {
-   ++*str;
-   return 1;
-} else if (strncmp(*str, "\xe2\x80\x9c", 3) == 0 || /* UTF8 0x201c */
-  strncmp(*str, "\xe2\x80\x9d", 3) == 0) { /* UTF8 0x201d */
-   *str += 3;
-   return 3;
-} else {
-   return 0;
-}
-}
-
 int
 parse_boolean_term (void *ctx, const char *str,
char **prefix_out, char **term_out)
@@ -123,28 +110,31 @@ parse_boolean_term (void *ctx, const char *str,
 *prefix_out = talloc_strndup (ctx, str, pos - str);
 ++pos;

-/* Implement Xapian's boolean term de-quoting.  This is a nearly
- * direct translation of QueryParser::Internal::parse_query. */
-pos = *term_out = talloc_strdup (ctx, pos);
-if (consume_double_quote ()) {
-   char *out = talloc_strdup (ctx, pos);
-   pos = *term_out = out;
-   while (1) {
-   if (! *pos) {
-   /* Premature end of string */
-   goto FAIL;
-   } else if (*pos == '"') {
-   if (*++pos != '"')
+/* Implement de-quoting compatible with make_boolean_term. */
+if (*pos == '"') {
+   char *out = talloc_strdup (ctx, pos + 1);
+   int closed = 0;
+   /* Find the closing quote and un-double doubled internal
+* quotes. */
+   for (pos = *term_out = out; *pos; ) {
+   if (*pos == '"') {
+   ++pos;
+   if (*pos != '"') {
+   /* Found the closing quote. */
+   closed = 1;
break;
-   } else if (consume_double_quote ()) {
-   break;
+   }
}
*out++ = *pos++;
}
-   if (*pos)
+   /* Did the term terminate without a closing quote or is there
+* trailing 

[PATCH 2/5] util: Function to parse boolean term queries

2012-12-25 Thread Austin Clements
Quoth David Bremner on Dec 25 at 10:18 am:
> Austin Clements  writes:
> 
> > +if (consume_double_quote ()) {
> > +   char *out = talloc_strdup (ctx, pos);
> > +   pos = *term_out = out;
> > +   while (1) {
> 
> Overall the control flow here is a bit tricky to follow. I'm not sure if
> a real loop condition would help or make it worse.
> 
> > +   if (! *pos) {
> > +   /* Premature end of string */
> > +   goto FAIL;
> > +   } else if (*pos == '"') {
> > +   if (*++pos != '"')
> > +   break;
> > +   } else if (consume_double_quote ()) {
> > +   break;
> > +   }
> 
> I'm confused by the asymmetry here. Quoted strings can start with
> unicode quotes, but internally can only have ascii '"'? Is this
> documented somewhere?

Only in the source, to my knowledge.  Here's how Xapian parses these
things (where 'it' is a UTF8 string iterator):

if (it != end && is_double_quote(*it)) {
// Quoted boolean term (can contain any character).
++it;
while (it != end) {
if (*it == '"') {
// Interpret "" as an escaped ".
if (++it == end || *it != '"')
break;
} else if (is_double_quote(*it)) {
++it;
break;
}
Unicode::append_utf8(name, *it++);
}
} else {
// Can't boolean filter prefix a subexpression, so
// just use anything following the prefix until the
// next space or ')' as part of the boolean filter
// term.
while (it != end && *it > ' ' && *it != ')')
Unicode::append_utf8(name, *it++);
}

> > +} else {
> > +   while (*pos > ' ' && *pos != ')')
> > +   ++pos;
> > +   if (*pos)
> > +   goto FAIL;
> > +}
> 
> So if there is no quote, we skip the part after the ':'? I guess I
> probably missed something because that doesn't sound like the intended
> behaviour.

This isn't skipping it; it's checking its well-formedness.  In this
case, *term_out already points to a correct string that can be used
literally; we just have to check that there's no trailing garbage
after the boolean query.

This is certainly worth commenting.

For the record, I also tried passing the query straight to the
library, without parsing it in the CLI (and simply checking that the
query returned exactly one result), and it was noticeably slower (the
restore performance test took between 3.82 and 5.25 seconds for the
code in this series and ~7.2 seconds using a general query.)


[PATCH] NEWS for emacs part visibility change

2012-12-25 Thread David Bremner
Mark Walters  writes:

> Wording suggested by Austin.
> ---

pushed.

d


[PATCH 11/11] test/tagging: add test for naked punctuation in tags; compare with quoting spaces.

2012-12-25 Thread da...@tethera.net
From: David Bremner 

This test also serves as documentation of the quoting
requirements. The comment lines are so that it exactly matches the man
page. Nothing more embarrassing than having an example in the man page
fail.
---
 test/tagging |   25 +
 1 file changed, 25 insertions(+)

diff --git a/test/tagging b/test/tagging
index 1717e72..1f5632c 100755
--- a/test/tagging
+++ b/test/tagging
@@ -198,6 +198,31 @@ notmuch dump --format=batch-tag | sort > OUTPUT
 notmuch restore --format=batch-tag < BACKUP
 test_expect_equal_file EXPECTED OUTPUT

+test_begin_subtest "--batch: only space and % needs to be encoded."
+notmuch dump --format=batch-tag > BACKUP
+
+notmuch tag --batch < EXPECTED
++%23possible%5c +%26are +%28tags%29 +crazy%7b +inbox +match%2acrazy 
+space%20in%20tags +tag4 +tag5 +unread +winner -- id:msg-002 at 
notmuch-test-suite
++foo%3a%3abar%25 +found%3a%3ait +inbox +tag5 +unread +winner -- id:msg-001 at 
notmuch-test-suite
+EOF
+
+notmuch dump --format=batch-tag | sort > OUTPUT
+notmuch restore --format=batch-tag < BACKUP
+test_expect_equal_file EXPECTED OUTPUT
+
 test_begin_subtest '--batch: unicode message-ids'

 ${TEST_DIRECTORY}/random-corpus --config-path=${NOTMUCH_CONFIG} \
-- 
1.7.10.4



[PATCH 10/11] man: document notmuch tag --batch, --input options

2012-12-25 Thread da...@tethera.net
From: Jani Nikula 

---
 man/man1/notmuch-tag.1 |   92 
 1 file changed, 92 insertions(+)

diff --git a/man/man1/notmuch-tag.1 b/man/man1/notmuch-tag.1
index 9444aa4..3aa2fa5 100644
--- a/man/man1/notmuch-tag.1
+++ b/man/man1/notmuch-tag.1
@@ -6,6 +6,11 @@ notmuch-tag \- add/remove tags for all messages matching the 
search terms
 .B notmuch tag
 .RI "+<" tag ">|\-<" tag "> [...] [\-\-] <" search-term "> [...]"

+.B notmuch tag
+.RI "--batch"
+.RI "[ --input=<" filename "> ]"
+
+
 .SH DESCRIPTION

 Add/remove tags for all messages matching the search terms.
@@ -30,6 +35,93 @@ updates the maildir flags according to tag changes if the
 configuration option is enabled. See \fBnotmuch-config\fR(1) for
 details.

+Supported options for
+.B tag
+include
+.RS 4
+.TP 4
+.BR \-\-batch
+
+Read batch tagging operations from a file (stdin by default). This is more
+efficient than repeated
+.B notmuch tag
+invocations. See
+.B TAG FILE FORMAT
+below for the input format. This option is not compatible with
+specifying tagging on the command line.
+.RE
+
+.RS 4
+.TP 4
+.BR "\-\-input=" 
+
+Read input from given file, instead of from stdin. Implies
+.BR --batch .
+
+.SH TAG FILE FORMAT
+
+The input must consist of lines of the format:
+
+.RI "+<" tag ">|\-<" tag "> [...] [\-\-] <" query ">"
+
+Each line is interpreted similarly to
+.B notmuch tag
+command line arguments. The delimiter is one or more spaces ' '. Any
+characters in
+.RI < tag >
+.B may
+be hex-encoded with %NN where NN is the hexadecimal value of the
+character. To hex-encode a character with a multi-byte UTF-8 encoding,
+hex-encode each byte.
+Any spaces in 
+.B must
+be hex-encoded as %20. Any characters that are not
+part of
+.RI  < tag >
+.B must not
+be hex-encoded.
+
+In the future tag:"tag with spaces" style quoting may be supported for
+.RI < tag >
+as well;
+for this reason all double quote characters in
+.RI < tag >
+.B should
+be hex-encoded.
+
+The
+.RI < query >
+should be quoted using Xapian boolean term quoting rules: if a term
+contains whitespace or a close paren or starts with a double quote, it
+must be enclosed in double quotes (not including any prefix) and
+double quotes inside the term must be doubled (see below for
+examples).
+
+Leading and trailing space ' ' is ignored. Empty lines and lines
+beginning with '#' are ignored.
+
+.SS EXAMPLE
+
+The following shows a valid input to batch tagging. Note that only the
+isolated '*' acts as a wildcard. Also note the two different quotings
+of the tag
+.B space in tags
+.
+.RS
+.nf
++winner *
++foo::bar%25 -- (One and Two) or (One and tag:winner)
++found::it -- tag:foo::bar%
+# ignore this line and the next
+
++space%20in%20tags -- Two
+# add tag '(tags)', among other stunts.
++crazy{ +(tags) + +#possible\ -- tag:"space in tags"
++match*crazy -- tag:crazy{
++some_tag -- id:"this is ""nauty)"""
+.fi
+.RE
+
 .SH SEE ALSO

 \fBnotmuch\fR(1), \fBnotmuch-config\fR(1), \fBnotmuch-count\fR(1),
-- 
1.7.10.4



[PATCH 09/11] notmuch-tag.1: tidy synopsis formatting, reference

2012-12-25 Thread da...@tethera.net
From: David Bremner 

Consistently use [...]; one less space. Use singular 
---
 man/man1/notmuch-tag.1 |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/man/man1/notmuch-tag.1 b/man/man1/notmuch-tag.1
index 0f86582..9444aa4 100644
--- a/man/man1/notmuch-tag.1
+++ b/man/man1/notmuch-tag.1
@@ -4,20 +4,21 @@ notmuch-tag \- add/remove tags for all messages matching the 
search terms

 .SH SYNOPSIS
 .B notmuch tag
-.RI  "+<" tag> "|\-<" tag "> [...] [\-\-] <" search-term ">..."
+.RI "+<" tag ">|\-<" tag "> [...] [\-\-] <" search-term "> [...]"

 .SH DESCRIPTION

 Add/remove tags for all messages matching the search terms.

 See \fBnotmuch-search-terms\fR(7)
-for details of the supported syntax for .
+for details of the supported syntax for
+.RI < search-term >.

 Tags prefixed by '+' are added while those prefixed by '\-' are
 removed. For each message, tag removal is performed before tag
 addition.

-The beginning of  is recognized by the first
+The beginning of the search terms is recognized by the first
 argument that begins with neither '+' nor '\-'. Support for
 an initial search term beginning with '+' or '\-' is provided
 by allowing the user to specify a "\-\-" argument to separate
-- 
1.7.10.4



[PATCH 08/11] test/tagging: add test for exotic message-ids and batch tagging

2012-12-25 Thread da...@tethera.net
From: David Bremner 

The (now fixed) bug that this test revealed is that unquoted
message-ids with whitespace or other control characters in them are
split into several tokens by the Xapian query parser.
---
 test/tagging |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/test/tagging b/test/tagging
index 417112b..1717e72 100755
--- a/test/tagging
+++ b/test/tagging
@@ -198,6 +198,24 @@ notmuch dump --format=batch-tag | sort > OUTPUT
 notmuch restore --format=batch-tag < BACKUP
 test_expect_equal_file EXPECTED OUTPUT

+test_begin_subtest '--batch: unicode message-ids'
+
+${TEST_DIRECTORY}/random-corpus --config-path=${NOTMUCH_CONFIG} \
+ --num-messages=100
+
+notmuch dump --format=batch-tag | sed 's/^.* -- /+common_tag -- /' | \
+sort > EXPECTED
+
+notmuch dump --format=batch-tag | sed 's/^.* -- /  -- /' | \
+notmuch restore --format=batch-tag
+
+notmuch tag --batch < EXPECTED
+
+notmuch dump --format=batch-tag| \
+sort > OUTPUT
+
+test_expect_equal_file EXPECTED OUTPUT
+
 test_expect_code 1 "Empty tag names" 'notmuch tag + One'

 test_expect_code 1 "Tag name beginning with -" 'notmuch tag +- One'
-- 
1.7.10.4



[PATCH 07/11] test/tagging: add tests for exotic tags

2012-12-25 Thread da...@tethera.net
From: David Bremner 

We test quotes seperately because they matter to the query escaper.
---
 test/tagging |   66 ++
 1 file changed, 66 insertions(+)

diff --git a/test/tagging b/test/tagging
index 405ad7c..417112b 100755
--- a/test/tagging
+++ b/test/tagging
@@ -132,6 +132,72 @@ EOF
 notmuch restore --format=batch-tag < BACKUP
 test_expect_equal_file EXPECTED OUTPUT

+test_begin_subtest '--batch: tags with quotes'
+notmuch dump --format=batch-tag > BACKUP
+
+notmuch tag --batch < EXPECTED
++%22%27%22%22%22%27 +inbox +tag5 +unread -- id:msg-001 at notmuch-test-suite
++%22%27%22%27%22%22%27%27 +inbox +tag4 +tag5 +unread -- id:msg-002 at 
notmuch-test-suite
+EOF
+
+notmuch dump --format=batch-tag | sort > OUTPUT
+notmuch restore --format=batch-tag < BACKUP
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest '--batch: tags with punctuation and space'
+notmuch dump --format=batch-tag > BACKUP
+
+notmuch tag --batch < EXPECTED
++%21@%23%20%24%25%5e%26%2a%29-_=+%5b%7b%5c%20%7c%3b%3a%27%20%22,.%3c%60%7e 
+inbox +tag4 +tag5 +unread -- id:msg-002 at notmuch-test-suite
++%21@%23%20%24%25%5e%26%2a%29-_=+%5b%7b%5c%20%7c%3b%3a%27%20%22,.%3c%60%7e 
+inbox +tag5 +unread -- id:msg-001 at notmuch-test-suite
+EOF
+
+notmuch dump --format=batch-tag | sort > OUTPUT
+notmuch restore --format=batch-tag < BACKUP
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest '--batch: unicode tags'
+notmuch dump --format=batch-tag > BACKUP
+
+notmuch tag --batch < EXPECTED
++%2a@%7d%cf%b5%f4%85%80%adO3%da%a7 
+=%e0%ac%95%c8%b3+%ef%aa%95%c8%a64w%c7%9d%c9%a2%cf%b3%d6%82%24B%c4%a9%c5%a1UX%ee%99%b0%27E7%ca%a4%d0%8b%5d
 
+A%e1%a0%bc%de%8b%d5%b2V%d9%9b%f3%b5%a2%a3M%d8%a1u@%f0%a0%ac%948%7e%f0%ab%86%af%27
 +L%df%85%ef%a1%a5m@%d3%96%c2%ab%d4%9f%ca%b8%f3%b3%a2%bf%c7%b1_u%d7%b4%c7%b1 
+P%c4%98%2f +R +inbox +tag4 +tag5 +unread 
+%7e%d1%8b%25%ec%a0%ae%d1%a0M%3b%e3%b6%b7%e9%a4%87%3c%db%9a%cc%a8%e1%96%9d 
+%c4%bf7%c7%ab9H%c4%99k%ea%91%bd%c3%8ck%e2%b3%8dk%c5%952V%e4%99%b2%d9%b3%e4%8b%bda%5b%24%c7%9b
 +%da%88=f%cc%b9I%ce%af%7b%c9%97%e3%b9%8bH%cb%92X%d2%8c6 
+%dc%9crh%d2%86B%e5%97%a2%22t%ed%99%82d -- id:msg-002 at notmuch-test-suite
++%2a@%7d%cf%b5%f4%85%80%adO3%da%a7 
+=%e0%ac%95%c8%b3+%ef%aa%95%c8%a64w%c7%9d%c9%a2%cf%b3%d6%82%24B%c4%a9%c5%a1UX%ee%99%b0%27E7%ca%a4%d0%8b%5d
 
+A%e1%a0%bc%de%8b%d5%b2V%d9%9b%f3%b5%a2%a3M%d8%a1u@%f0%a0%ac%948%7e%f0%ab%86%af%27
 +L%df%85%ef%a1%a5m@%d3%96%c2%ab%d4%9f%ca%b8%f3%b3%a2%bf%c7%b1_u%d7%b4%c7%b1 
+P%c4%98%2f +R +inbox +tag5 +unread 
+%7e%d1%8b%25%ec%a0%ae%d1%a0M%3b%e3%b6%b7%e9%a4%87%3c%db%9a%cc%a8%e1%96%9d 
+%c4%bf7%c7%ab9H%c4%99k%ea%91%bd%c3%8ck%e2%b3%8dk%c5%952V%e4%99%b2%d9%b3%e4%8b%bda%5b%24%c7%9b
 +%da%88=f%cc%b9I%ce%af%7b%c9%97%e3%b9%8bH%cb%92X%d2%8c6 
+%dc%9crh%d2%86B%e5%97%a2%22t%ed%99%82d -- id:msg-001 at notmuch-test-suite
+EOF
+
+notmuch dump --format=batch-tag | sort > OUTPUT
+notmuch restore --format=batch-tag < BACKUP
+test_expect_equal_file EXPECTED OUTPUT
+
 test_expect_code 1 "Empty tag names" 'notmuch tag + One'

 test_expect_code 1 "Tag name beginning with -" 'notmuch tag +- One'
-- 
1.7.10.4



[PATCH 06/11] test/tagging: add basic tests for batch tagging functionality

2012-12-25 Thread da...@tethera.net
From: David Bremner 

This tests argument parsing, blank lines and comments, and basic hex
decoding functionality.
---
 test/tagging |   51 +++
 1 file changed, 51 insertions(+)

diff --git a/test/tagging b/test/tagging
index cd16585..405ad7c 100755
--- a/test/tagging
+++ b/test/tagging
@@ -46,6 +46,57 @@ test_expect_equal "$output" "\
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (:\"  inbox tag1 unread)
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag1 tag4 unread)"

+test_begin_subtest "--batch"
+notmuch tag --batch < batch.in  < batch.expected < backup.tags
+notmuch tag --input=batch.in
+notmuch search \* | notmuch_search_sanitize > OUTPUT
+notmuch restore --format=batch-tag < backup.tags
+test_expect_equal_file batch.expected OUTPUT
+
+test_begin_subtest "--batch --input"
+notmuch dump --format=batch-tag > backup.tags
+notmuch tag --batch --input=batch.in
+notmuch search \* | notmuch_search_sanitize > OUTPUT
+notmuch restore --format=batch-tag < backup.tags
+test_expect_equal_file batch.expected OUTPUT
+
+test_begin_subtest "--batch, blank lines and comments"
+notmuch dump | sort > EXPECTED
+notmuch tag --batch < OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
 test_begin_subtest '--batch: checking error messages'
 notmuch dump --format=batch-tag > BACKUP
 notmuch tag --batch 

[PATCH 05/11] test/tagging: add test for error messages of tag --batch

2012-12-25 Thread da...@tethera.net
From: David Bremner 

This is based on the similar test for notmuch restore, but the parser
in batch tagging mode is less tolerant of a few cases, in particular
those tested by illegal_tag.
---
 test/tagging |   35 +++
 1 file changed, 35 insertions(+)

diff --git a/test/tagging b/test/tagging
index 980ff92..cd16585 100755
--- a/test/tagging
+++ b/test/tagging
@@ -46,6 +46,41 @@ test_expect_equal "$output" "\
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (:\"  inbox tag1 unread)
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag1 tag4 unread)"

+test_begin_subtest '--batch: checking error messages'
+notmuch dump --format=batch-tag > BACKUP
+notmuch tag --batch 

[PATCH 04/11] cli: add support for batch tagging operations to "notmuch tag"

2012-12-25 Thread da...@tethera.net
From: Jani Nikula 

Add support for batch tagging operations through stdin to "notmuch
tag". This can be enabled with the new --batch command line option to
"notmuch tag". The input must consist of lines of the format:

+|- [...] [--]  [...]

Each line is interpreted similarly to "notmuch tag" command line
arguments. The delimiter is one or more spaces ' '. Any characters in
 MAP be hex encoded with %NN where NN is the
hexadecimal value of the character. Any ' ' and '%' characters in
 and  MUST be hex encoded (using %20 and %25,
respectively). Any characters that are not part of  or
MUST NOT be hex encoded.

 is passed verbatim to Xapian

Leading and trailing space ' ' is ignored. Empty lines and lines
beginning with '#' are ignored.

Signed-off-by: Jani Nikula 

Hacked-like-crazy-by: David Bremner 
---
 notmuch-tag.c |   94 -
 1 file changed, 86 insertions(+), 8 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 8129912..7fc614d 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -128,6 +128,46 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const 
char *query_string,
 return interrupted;
 }

+static int
+tag_file (void *ctx, notmuch_database_t *notmuch, tag_op_flag_t flags,
+ FILE *input)
+{
+char *line = NULL;
+char *query_string = NULL;
+size_t line_size = 0;
+ssize_t line_len;
+int ret = 0;
+tag_op_list_t *tag_ops;
+
+tag_ops = tag_op_list_create (ctx);
+if (tag_ops == NULL) {
+   fprintf (stderr, "Out of memory.\n");
+   return 1;
+}
+
+while ((line_len = getline (, _size, input)) != -1 &&
+  ! interrupted) {
+
+   ret = parse_tag_line (ctx, line, TAG_FLAG_NONE,
+ _string, tag_ops);
+
+   if (ret > 0)
+   continue;
+
+   if (ret < 0)
+   break;
+
+   ret = tag_query (ctx, notmuch, query_string, tag_ops, flags);
+   if (ret)
+   break;
+}
+
+if (line)
+   free (line);
+
+return ret;
+}
+
 int
 notmuch_tag_command (void *ctx, int argc, char *argv[])
 {
@@ -137,6 +177,10 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 notmuch_database_t *notmuch;
 struct sigaction action;
 tag_op_flag_t tag_flags = TAG_FLAG_NONE;
+notmuch_bool_t batch = FALSE;
+FILE *input = stdin;
+char *input_file_name = NULL;
+int opt_index;
 int ret = 0;

 /* Setup our handler for SIGINT */
@@ -146,15 +190,43 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 action.sa_flags = SA_RESTART;
 sigaction (SIGINT, , NULL);

-tag_ops = tag_op_list_create (ctx);
-if (tag_ops == NULL) {
-   fprintf (stderr, "Out of memory.\n");
+notmuch_opt_desc_t options[] = {
+   { NOTMUCH_OPT_BOOLEAN, , "batch", 0, 0 },
+   { NOTMUCH_OPT_STRING, _file_name, "input", 'i', 0 },
+   { 0, 0, 0, 0, 0 }
+};
+
+opt_index = parse_arguments (argc, argv, options, 1);
+if (opt_index < 0)
return 1;
+
+if (input_file_name) {
+   batch = TRUE;
+   input = fopen (input_file_name, "r");
+   if (input == NULL) {
+   fprintf (stderr, "Error opening %s for reading: %s\n",
+input_file_name, strerror (errno));
+   return 1;
+   }
 }

-if (parse_tag_command_line (ctx, argc - 1, argv + 1,
-   _string, tag_ops))
-   return 1;
+if (batch) {
+   if (opt_index != argc) {
+   fprintf (stderr, "Can't specify both cmdline and stdin!\n");
+   return 1;
+   }
+} else {
+
+   tag_ops = tag_op_list_create (ctx);
+   if (tag_ops == NULL) {
+   fprintf (stderr, "Out of memory.\n");
+   return 1;
+   }
+
+   if (parse_tag_command_line (ctx, argc - opt_index, argv + opt_index,
+   _string, tag_ops))
+   return 1;
+}

 config = notmuch_config_open (ctx, NULL, NULL);
 if (config == NULL)
@@ -167,9 +239,15 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 if (notmuch_config_get_maildir_synchronize_flags (config))
tag_flags |= TAG_FLAG_MAILDIR_SYNC;

-ret = tag_query (ctx, notmuch, query_string, tag_ops, tag_flags);
+if (batch)
+   ret = tag_file (ctx, notmuch, tag_flags, input);
+else
+   ret = tag_query (ctx, notmuch, query_string, tag_ops, tag_flags);

 notmuch_database_destroy (notmuch);

-return ret;
+if (input != stdin)
+   fclose (input);
+
+return ret || interrupted;
 }
-- 
1.7.10.4



[PATCH 03/11] notmuch-tag.c: convert to use tag-utils

2012-12-25 Thread da...@tethera.net
From: David Bremner 

Command line parsing is factored out into a function
parse_tag_command_line in tag-utils.c.

There is some duplicated code eliminated in tag_query, and a bunch of
translation from using the bare tag_op structs to using that tag-utils
API.
---
 notmuch-tag.c |   99 -
 tag-util.c|   51 +++--
 tag-util.h|   15 +
 3 files changed, 84 insertions(+), 81 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index fc9d43a..8129912 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */

 #include "notmuch-client.h"
+#include "tag-util.h"
 #include "string-util.h"

 static volatile sig_atomic_t interrupted;
@@ -36,14 +37,10 @@ handle_sigint (unused (int sig))
 interrupted = 1;
 }

-typedef struct {
-const char *tag;
-notmuch_bool_t remove;
-} tag_operation_t;

 static char *
 _optimize_tag_query (void *ctx, const char *orig_query_string,
-const tag_operation_t *tag_ops)
+const tag_op_list_t *list)
 {
 /* This is subtler than it looks.  Xapian ignores the '-' operator
  * at the beginning both queries and parenthesized groups and,
@@ -60,7 +57,7 @@ _optimize_tag_query (void *ctx, const char *orig_query_string,
 size_t i;

 /* Don't optimize if there are no tag changes. */
-if (tag_ops[0].tag == NULL)
+if (tag_op_list_size (list) == 0)
return talloc_strdup (ctx, orig_query_string);

 /* Build the new query string */
@@ -69,17 +66,17 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
 else
query_string = talloc_asprintf (ctx, "( %s ) and (", orig_query_string);

-for (i = 0; tag_ops[i].tag && query_string; i++) {
+for (i = 0; i < tag_op_list_size (list) && query_string; i++) {
/* XXX in case of OOM, query_string will be deallocated when
 * ctx is, which might be at shutdown */
if (make_boolean_term (ctx,
-  "tag", tag_ops[i].tag,
+  "tag", tag_op_list_tag (list, i),
   , _len))
return NULL;

query_string = talloc_asprintf_append_buffer (
query_string, "%s%s%s", join,
-   tag_ops[i].remove ? "" : "not ",
+   tag_op_list_isremove (list, i) ? "" : "not ",
escaped);
join = " or ";
 }
@@ -91,17 +88,15 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
 return query_string;
 }

-/* Tag messages matching 'query_string' according to 'tag_ops', which
- * must be an array of tagging operations terminated with an empty
- * element. */
+/* Tag messages matching 'query_string' according to 'tag_ops'
+ */
 static int
 tag_query (void *ctx, notmuch_database_t *notmuch, const char *query_string,
-  tag_operation_t *tag_ops, notmuch_bool_t synchronize_flags)
+  tag_op_list_t *tag_ops, tag_op_flag_t flags)
 {
 notmuch_query_t *query;
 notmuch_messages_t *messages;
 notmuch_message_t *message;
-int i;

 /* Optimize the query so it excludes messages that already have
  * the specified set of tags. */
@@ -124,21 +119,7 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const 
char *query_string,
 notmuch_messages_valid (messages) && ! interrupted;
 notmuch_messages_move_to_next (messages)) {
message = notmuch_messages_get (messages);
-
-   notmuch_message_freeze (message);
-
-   for (i = 0; tag_ops[i].tag; i++) {
-   if (tag_ops[i].remove)
-   notmuch_message_remove_tag (message, tag_ops[i].tag);
-   else
-   notmuch_message_add_tag (message, tag_ops[i].tag);
-   }
-
-   notmuch_message_thaw (message);
-
-   if (synchronize_flags)
-   notmuch_message_tags_to_maildir_flags (message);
-
+   tag_op_list_apply (message, tag_ops, flags | TAG_FLAG_PRE_OPTIMIZED);
notmuch_message_destroy (message);
 }

@@ -150,15 +131,13 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const 
char *query_string,
 int
 notmuch_tag_command (void *ctx, int argc, char *argv[])
 {
-tag_operation_t *tag_ops;
-int tag_ops_count = 0;
-char *query_string;
+tag_op_list_t *tag_ops = NULL;
+char *query_string = NULL;
 notmuch_config_t *config;
 notmuch_database_t *notmuch;
 struct sigaction action;
-notmuch_bool_t synchronize_flags;
-int i;
-int ret;
+tag_op_flag_t tag_flags = TAG_FLAG_NONE;
+int ret = 0;

 /* Setup our handler for SIGINT */
 memset (, 0, sizeof (struct sigaction));
@@ -167,54 +146,15 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 action.sa_flags = SA_RESTART;
 sigaction (SIGINT, , NULL);

-argc--; argv++; /* skip subcommand argument */
-
-/* Array of tagging operations (add or remove), terminated with an
- * empty element. */
-tag_ops = 

[PATCH 02/11] tag-util: factor out rules for illegal tags, use in parse_tag_line

2012-12-25 Thread da...@tethera.net
From: David Bremner 

This will allow us to be consistent between batch tagging and command
line tagging as far as what is an illegal tag.
---
 tag-util.c |   36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/tag-util.c b/tag-util.c
index ca12b3b..0a4fe78 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -31,6 +31,30 @@ line_error (tag_parse_status_t status,
 return status;
 }

+/*
+ * Test tags for some forbidden cases.
+ *
+ * return: NULL if OK,
+ *explanatory message otherwise.
+ */
+
+static const char *
+illegal_tag (const char *tag, notmuch_bool_t remove)
+{
+
+if (*tag == '\0' && ! remove)
+   return "empty tag forbidden";
+
+/* This disallows adding the non-removable tag "-" and
+ * enables notmuch tag to take long options more easily.
+ */
+
+if (*tag == '-' && ! remove)
+   return "tag starting with '-' forbidden";
+
+return NULL;
+}
+
 tag_parse_status_t
 parse_tag_line (void *ctx, char *line,
tag_op_flag_t flags,
@@ -95,11 +119,13 @@ parse_tag_line (void *ctx, char *line,
remove = (*tok == '-');
tag = tok + 1;

-   /* Maybe refuse empty tags. */
-   if (! (flags & TAG_FLAG_BE_GENEROUS) && *tag == '\0') {
-   ret = line_error (TAG_PARSE_INVALID, line_for_error,
- "empty tag");
-   goto DONE;
+   /* Maybe refuse illegal tags. */
+   if (! (flags & TAG_FLAG_BE_GENEROUS)) {
+   const char *msg = illegal_tag (tag, remove);
+   if (msg) {
+   ret = line_error (TAG_PARSE_INVALID, line_for_error, msg);
+   goto DONE;
+   }
}

/* Decode tag. */
-- 
1.7.10.4



[PATCH 01/11] parse_tag_line: use enum for return value.

2012-12-25 Thread da...@tethera.net
From: David Bremner 

This is essentially cosmetic, since success=0 is promised by
the comments in tag-utils.h.
---
 tag-util.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tag-util.c b/tag-util.c
index e4e5dda..ca12b3b 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -40,14 +40,14 @@ parse_tag_line (void *ctx, char *line,
 char *tok = line;
 size_t tok_len = 0;
 char *line_for_error;
-int ret = 0;
+tag_parse_status_t ret = TAG_PARSE_SUCCESS;

 chomp_newline (line);

 line_for_error = talloc_strdup (ctx, line);
 if (line_for_error == NULL) {
fprintf (stderr, "Error: out of memory\n");
-   return -1;
+   return TAG_PARSE_OUT_OF_MEMORY;
 }

 /* remove leading space */
-- 
1.7.10.4



Xapian-quoting based batch-tagging.

2012-12-25 Thread da...@tethera.net
This is an alternative version of 

 id:1356313183-9266-1-git-send-email-david at tethera.net

batch tagging patches rebased on top of 

 id:1356415076-5692-1-git-send-email-amdragon at mit.edu

This mainly consisted of removing 

 [Patch v9 04/17] notmuch-tag: factor out double quoting routine
 (superceded by one of Austin's patches)

 [Patch v9 05/17] util/string-util: add a new string tokenized function
 [Patch v9 06/17] unhex_and_quote: new function to quote hex-decoded queries
 [Patch v9 07/17] notmuch-restore: move query handling for batch
 (uneeded if query is passed verbatim to xapian)

 I also removed two tests, since they are about how we handle
 quoting:

 [Patch v9 13/17] test/tagging: add test for compound queries with batch 
tagging
 [Patch v9 17/17] test/tagging: add test for handling of parenthesized  
tag queries.

A few small fixes were needed to the tests, and a fair amount of
changes to the notmuch-tag man page.

Diffstat (against Austin's series) is as follows

 man/man1/notmuch-tag.1 |   99 -
 notmuch-tag.c  |  169 
 tag-util.c |   87 +--
 tag-util.h |   15 
 test/tagging   |  195 ++
 5 files changed, 480 insertions(+), 85 deletions(-)




[PATCH] contrib: pick: close message pane when quitting from show in the message pane

2012-12-25 Thread Tomi Ollila
On Tue, Dec 25 2012, Mark Walters  wrote:

> We add a hook to the show buffer in the message window to close the
> message window when that buffer quits.  It checks that the
> message-window is still displaying the show-message buffer and then
> closes it.
> ---
> This is just a slightly tidier version of the previous patch. It is
> more natural to reuse the nomuch-pick-message-window variable (as it
> serves the same pupose) rather than adding a new variable and
> cluttering up the name space (and it also keeps the clutter in the
> notmuch-pick namespace).
>
> I think this a good candidate for 0.15: it fixes an annoying bug in pick. 
>
> Best wishes
>
> Mark
>
>
>  contrib/notmuch-pick/notmuch-pick.el |   19 +++
>  1 files changed, 19 insertions(+), 0 deletions(-)
>
> diff --git a/contrib/notmuch-pick/notmuch-pick.el 
> b/contrib/notmuch-pick/notmuch-pick.el
> index 1a553d4..7d01f4c 100644
> --- a/contrib/notmuch-pick/notmuch-pick.el
> +++ b/contrib/notmuch-pick/notmuch-pick.el
> @@ -157,6 +157,10 @@
>  (make-variable-buffer-local 'notmuch-pick-query-context)
>  (defvar notmuch-pick-buffer-name nil)
>  (make-variable-buffer-local 'notmuch-pick-buffer-name)
> +;; This variable is the window used for the message pane. It is set
> +;; in both the parent pick buffer and the child show buffer. It is
> +;; used to try and close the message pane when quitting pick or the
> +;; child show buffer.
>  (defvar notmuch-pick-message-window nil)
>  (make-variable-buffer-local 'notmuch-pick-message-window)
>  (put 'notmuch-pick-message-window 'permanent-local t)
> @@ -324,6 +328,16 @@ Does NOT change the database."
>  (notmuch-prettify-subject (notmuch-search-find-subject)))
>(notmuch-pick-show-match-message-with-wait))
>  
> +(defun notmuch-pick-message-window-kill-hook ()
> +  (let ((buffer (current-buffer)))
> +(when (and (window-live-p notmuch-pick-message-window)
> +(eq (window-buffer notmuch-pick-message-window) buffer))
> +  ;; We do not want an error if this is the sole window in the
> +  ;; frame and I do not know how to test for that in emacs pre
> +  ;; 24. Hence we just ignore-errors.
> +  (ignore-errors
> + (delete-window notmuch-pick-message-window)
> +
>  (defun notmuch-pick-show-message ()
>"Show the current message (in split-pane)."
>(interactive)
> @@ -341,6 +355,11 @@ Does NOT change the database."
>   (let ((notmuch-show-indent-messages-width 0))
> (setq current-prefix-arg '(4))
> (setq buffer (notmuch-show id nil nil nil
> +  ;; We need the `let' as notmuch-pick-message-window is buffer local.
> +  (let ((window notmuch-pick-message-window))
> + (with-current-buffer buffer
> +   (setq notmuch-pick-message-window window)
> +   (add-hook 'kill-buffer-hook 'notmuch-pick-message-window-kill-hook)))
>(notmuch-pick-tag-update-display (list "-unread"))
>(setq notmuch-pick-message-buffer buffer


LGTM. 

Tomi

PS: (unrelated) Mark: check if notmuch-show-mark-read-tags should be used in
   line:
   (notmuch-pick-tag-update-display (list "-unread"))

> -- 
> 1.7.9.1


[PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-25 Thread David Bremner
Austin Clements  writes:

> ---
>  man/man1/notmuch-dump.1 |   11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
>

I guess the short description in notmuch-restore.1 needs updating as
well.



[PATCH] contrib: pick: close message pane when quitting from show in the message pane

2012-12-25 Thread Mark Walters
We add a hook to the show buffer in the message window to close the
message window when that buffer quits.  It checks that the
message-window is still displaying the show-message buffer and then
closes it.
---
This is just a slightly tidier version of the previous patch. It is
more natural to reuse the nomuch-pick-message-window variable (as it
serves the same pupose) rather than adding a new variable and
cluttering up the name space (and it also keeps the clutter in the
notmuch-pick namespace).

I think this a good candidate for 0.15: it fixes an annoying bug in pick. 

Best wishes

Mark


 contrib/notmuch-pick/notmuch-pick.el |   19 +++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/contrib/notmuch-pick/notmuch-pick.el 
b/contrib/notmuch-pick/notmuch-pick.el
index 1a553d4..7d01f4c 100644
--- a/contrib/notmuch-pick/notmuch-pick.el
+++ b/contrib/notmuch-pick/notmuch-pick.el
@@ -157,6 +157,10 @@
 (make-variable-buffer-local 'notmuch-pick-query-context)
 (defvar notmuch-pick-buffer-name nil)
 (make-variable-buffer-local 'notmuch-pick-buffer-name)
+;; This variable is the window used for the message pane. It is set
+;; in both the parent pick buffer and the child show buffer. It is
+;; used to try and close the message pane when quitting pick or the
+;; child show buffer.
 (defvar notmuch-pick-message-window nil)
 (make-variable-buffer-local 'notmuch-pick-message-window)
 (put 'notmuch-pick-message-window 'permanent-local t)
@@ -324,6 +328,16 @@ Does NOT change the database."
 (notmuch-prettify-subject (notmuch-search-find-subject)))
   (notmuch-pick-show-match-message-with-wait))

+(defun notmuch-pick-message-window-kill-hook ()
+  (let ((buffer (current-buffer)))
+(when (and (window-live-p notmuch-pick-message-window)
+  (eq (window-buffer notmuch-pick-message-window) buffer))
+  ;; We do not want an error if this is the sole window in the
+  ;; frame and I do not know how to test for that in emacs pre
+  ;; 24. Hence we just ignore-errors.
+  (ignore-errors
+   (delete-window notmuch-pick-message-window)
+
 (defun notmuch-pick-show-message ()
   "Show the current message (in split-pane)."
   (interactive)
@@ -341,6 +355,11 @@ Does NOT change the database."
(let ((notmuch-show-indent-messages-width 0))
  (setq current-prefix-arg '(4))
  (setq buffer (notmuch-show id nil nil nil
+  ;; We need the `let' as notmuch-pick-message-window is buffer local.
+  (let ((window notmuch-pick-message-window))
+   (with-current-buffer buffer
+ (setq notmuch-pick-message-window window)
+ (add-hook 'kill-buffer-hook 'notmuch-pick-message-window-kill-hook)))
   (notmuch-pick-tag-update-display (list "-unread"))
   (setq notmuch-pick-message-buffer buffer

-- 
1.7.9.1



[PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-25 Thread David Bremner
David Bremner  writes:
>
> There is a certain cognitive dissonance in using hex-quoting for tags
> and xapian-quoting for message-ids.  Did you think about the
> implications of using xapian quoting for tags as well?
>

At risk of talking to myself, I guess one loses the space delimited
lists of tags, which is probably pretty handy for scripting.

d


[PATCH] NEWS for emacs part visibility change

2012-12-25 Thread Mark Walters
Wording suggested by Austin.
---

I have taken Austin's (excellent) wording and agree with the
downplaying of notmuch-show-all-multipart/alternative-parts.

Best wishes

Mark


 NEWS |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/NEWS b/NEWS
index 0de685a..69f21a0 100644
--- a/NEWS
+++ b/NEWS
@@ -70,6 +70,17 @@ Removal of the deprecated `notmuch-folders` variable
   has now been removed. Any remaining users should migrate to
   `notmuch-saved-searches`.

+Visibility of MIME parts can be toggled
+
+  Each part of a multi-part MIME email can now be shown or hidden
+  using the button at the top of each part (by pressing RET on it or
+  by clicking).  For emails with multiple alternative formats (e.g.,
+  plain text and HTML), only the preferred format is shown initially,
+  but other formats can be shown using their part buttons.  To control
+  the behavior of this, see
+  `notmuch-multipart/alternative-discouraged` and
+  `notmuch-show-all-multipart/alternative-parts`.
+
 Emacs now buttonizes mid: links

   mid: links are a standardized way to link to messages by message ID
-- 
1.7.9.1



[PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-25 Thread David Bremner

Overall I agree this is conceptually cleaner. Transforming between
quoting formats is inherently delicate; I suspect there will always be
at least one special case missed.  I did find the dequoting code a bit
baffling, but this is more of an implementation issue.

There is a certain cognitive dissonance in using hex-quoting for tags
and xapian-quoting for message-ids.  Did you think about the
implications of using xapian quoting for tags as well?

d


[PATCH 2/5] util: Function to parse boolean term queries

2012-12-25 Thread David Bremner
David Bremner  writes:
>
> So if there is no quote, we skip the part after the ':'? I guess I
> probably missed something because that doesn't sound like the intended
> behaviour.

Indeed the following addition to the test shows it works fine in
context. So I guess I just don't follow this control flow very well.

diff --git a/test/dump-restore b/test/dump-restore
index aecc393..f9ae5b3 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -200,6 +200,8 @@ a
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.GA7669 at griffis1.net
+# valid id, but warning about missing message
++e id:missing_message_id
 EOF

 cat < EXPECTED
@@ -211,6 +213,7 @@ Warning: no query string after -- [+c +d --]
 Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]
 Warning: cannot parse query: id:"
 Warning: not an id query: tag:abc
+Warning: cannot apply tags to missing message: missing_message_id
 EOF

 test_expect_equal_file EXPECTED OUTPUT


[PATCH 3/5] dump: Disallow \n in message IDs

2012-12-25 Thread David Bremner
Austin Clements  writes:
>  
>   randomchar = random_unichar ();
> + if (randomchar == '\n')
> + randomchar = 'x';
>  

what about something like 

do {
   randomchar = random_unichar ();
}  while (randomchar == '\n');


[PATCH 2/5] util: Function to parse boolean term queries

2012-12-25 Thread David Bremner
Austin Clements  writes:

> +if (consume_double_quote ()) {
> + char *out = talloc_strdup (ctx, pos);
> + pos = *term_out = out;
> + while (1) {

Overall the control flow here is a bit tricky to follow. I'm not sure if
a real loop condition would help or make it worse.

> + if (! *pos) {
> + /* Premature end of string */
> + goto FAIL;
> + } else if (*pos == '"') {
> + if (*++pos != '"')
> + break;
> + } else if (consume_double_quote ()) {
> + break;
> + }

I'm confused by the asymmetry here. Quoted strings can start with
unicode quotes, but internally can only have ascii '"'? Is this
documented somewhere?

> +} else {
> + while (*pos > ' ' && *pos != ')')
> + ++pos;
> + if (*pos)
> + goto FAIL;
> +}

So if there is no quote, we skip the part after the ':'? I guess I
probably missed something because that doesn't sound like the intended
behaviour.


v2 of valgrind based memory tests

2012-12-25 Thread David Bremner
Austin Clements  writes:

> On Mon, 24 Dec 2012, david at tethera.net wrote:
>
> LGTM.  In the long run, I wonder if we want to separate the time and
> memory tests like this or if some tests would make sense in either mode.

pushed. At the moment, the split into time versus memory tests is a
simple way to separate out "heavy weight" tests, but this could be done
more explicitely.

d


[PATCH 1/5] util: Factor out boolean term quoting routine

2012-12-25 Thread David Bremner
Austin Clements  writes:
>
> This could live in tag-util as well, but it is really nothing specific
> to tags (although the conventions are specific to Xapian).
>
> Furthermore, this now combines the term prefix with the quoted term;
> arguably this is just as easy to do in the caller, but this will
> nicely parallel the boolean term parsing function to be introduced
> shortly.

At first glance, I found this a bit too notmuch-specific to go in
util. On second glance, I found my first reaction somewhat
bizarre. Perhaps at some point we should drop a README file in util
explaining what should go there.

Other than that, LGTM
d


[PATCH] NEWS for emacs part visibility change

2012-12-25 Thread Austin Clements
On Sun, 23 Dec 2012, Mark Walters  wrote:
> ---
> Here is some news for the recent part visibility change. I am not
> completely happy with the wording so please do read. In particular
> should it mention how the user's preferred part is chosen
> (notmuch-multipart/alternative-discouraged etc)?
>
> Best wishes
>
> Mark
>
>
>  NEWS |   10 ++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 3e48501..d290934 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -63,6 +63,16 @@ Removal of the deprecated `notmuch-folders` variable
>has now been removed. Any remaining users should migrate to
>`notmuch-saved-searches`.
>  
> +Visibility of mime parts can be toggled

MIME?

> +
> +  The visibility of mime parts that can be displayed in the emacs show
> +  buffer can be toggled by activating the part button (either by
> +  pressing RET on it or with a mouse click). In particular, the
> +  "alternative" parts of a multipart/alternative email can be viewed.
> +  Note: the default has changed and only the user preferred part of a
> +  multipart/alternative email is displayed. To restore the old
> +  behaviour set `notmuch-show-all-multipart/alternative-parts` to t.

  Each part of a multi-part MIME email can now be shown or hidden using
  the button at the top of each part (by pressing RET on it or by
  clicking).  For emails with multiple alternative formats (e.g., plain
  text and HTML), only the preferred format is shown initially, but
  other formats can be shown using their part buttons.  To control the
  behavior of this, see `notmuch-multipart/alternative-discouraged` and
  `notmuch-show-all-multipart/alternative-parts`.

?  I intentionally downplayed
notmuch-show-all-multipart/alternative-parts relative to your
description because I would love for people to forget about that
setting so we can eventually drop it.

> +
>  Emacs now buttonizes mid: links
>  
>mid: links are a standardized way to link to messages by message ID
> -- 
> 1.7.9.1
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


v2 of valgrind based memory tests

2012-12-25 Thread Austin Clements
On Mon, 24 Dec 2012, david at tethera.net wrote:
> These obsolete
>   
>   id:1355196820-29734-1-git-send-email-david at tethera.net
>
> I tried to follow the suggestions of 
>
>   id:20121216191121.GH6187 at mit.edu
>
> pretty closely.

LGTM.  In the long run, I wonder if we want to separate the time and
memory tests like this or if some tests would make sense in either mode.


[PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-25 Thread Austin Clements
---
 man/man1/notmuch-dump.1 |   11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/man/man1/notmuch-dump.1 b/man/man1/notmuch-dump.1
index 770b00f..7bd6def 100644
--- a/man/man1/notmuch-dump.1
+++ b/man/man1/notmuch-dump.1
@@ -64,13 +64,16 @@ and tags containing whitespace or non-\fBascii\fR(7) 
characters.
 Each line has the form

 .RS 4
-.RI "+<" "encoded-tag" "> " "" "+<" "encoded-tag" "> ... -- " "" " id:<" 
encoded-message-id >
+.RI "+<" "encoded-tag" "> " "" "+<" "encoded-tag" "> ... -- " "" " id:<" 
quoted-message-id >

-where encoded means that every byte not matching the regex
+Tags are hex-encoded by replacing every byte not matching the regex
 .B [A-Za-z0-9@=.,_+-]
-is replace by
+with
 .B %nn
-where nn is the two digit hex encoding.
+where nn is the two digit hex encoding.  The message ID is a valid Xapian
+query, quoted using Xapian boolean term quoting rules: if the ID contains
+whitespace or a close paren or starts with a double quote, it must be
+enclosed in double quotes and double quotes inside the ID must be doubled.
 The astute reader will notice this is a special case of the batch input
 format for \fBnotmuch-tag\fR(1); note that the single message-id query is
 mandatory for \fBnotmuch-restore\fR(1).
-- 
1.7.10.4



[PATCH 4/5] dump/restore: Use Xapian queries for batch-tag format

2012-12-25 Thread Austin Clements
This switches the new batch-tag format away from using a home-grown
hex-encoding scheme for message IDs in the dump to simply using Xapian
queries with Xapian quoting syntax.

This has a variety of advantages beyond presenting a cleaner and more
consistent interface.  Foremost is that it will dramatically simplify
the quoting for batch tagging, which shares the same input format.
While the hex-encoding is no better or worse for the simple ID queries
used by dump/restore, it becomes onerous for general-purpose queries
used in batch tagging.  It also better handles strange cases like
"id:foo and bar", since this is no longer syntactically valid.
---
 notmuch-dump.c|9 +
 notmuch-restore.c |   22 ++
 tag-util.c|6 --
 test/dump-restore |   11 +--
 4 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index 29d79da..bf01a39 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -20,6 +20,7 @@

 #include "notmuch-client.h"
 #include "dump-restore-private.h"
+#include "string-util.h"

 int
 notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
@@ -141,13 +142,13 @@ notmuch_dump_command (unused (void *ctx), int argc, char 
*argv[])
fprintf (stderr, "Error: cannot dump message id containing line 
break: %s\n", message_id);
return 1;
}
-   if (hex_encode (notmuch, message_id,
-   , _size) != HEX_SUCCESS) {
-   fprintf (stderr, "Error: failed to hex-encode msg-id %s\n",
+   if (make_boolean_term (notmuch, "id", message_id,
+  , _size)) {
+   fprintf (stderr, "Error: failed to quote message id %s\n",
 message_id);
return 1;
}
-   fprintf (output, " -- id:%s\n", buffer);
+   fprintf (output, " -- %s\n", buffer);
}

notmuch_message_destroy (message);
diff --git a/notmuch-restore.c b/notmuch-restore.c
index 9ed9b51..77a4c27 100644
--- a/notmuch-restore.c
+++ b/notmuch-restore.c
@@ -207,7 +207,7 @@ notmuch_restore_command (unused (void *ctx), int argc, char 
*argv[])
INTERNAL_ERROR ("compile time constant regex failed.");

 do {
-   char *query_string;
+   char *query_string, *prefix, *term;

if (line_ctx != NULL)
talloc_free (line_ctx);
@@ -220,19 +220,17 @@ notmuch_restore_command (unused (void *ctx), int argc, 
char *argv[])
  _string, tag_ops);

if (ret == 0) {
-   if (strncmp ("id:", query_string, 3) != 0) {
-   fprintf (stderr, "Warning: unsupported query: %s\n", 
query_string);
+   ret = parse_boolean_term (line_ctx, query_string,
+ , );
+   if (ret) {
+   fprintf (stderr, "Warning: cannot parse query: %s\n",
+query_string);
+   continue;
+   } else if (strcmp ("id", prefix) != 0) {
+   fprintf (stderr, "Warning: not an id query: %s\n", 
query_string);
continue;
}
-   /* delete id: from front of string; tag_message
-* expects a raw message-id.
-*
-* XXX: Note that query string id:foo and bar will be
-* interpreted as a message id "foo and bar". This
-* should eventually be fixed to give a better error
-* message.
-*/
-   query_string = query_string + 3;
+   query_string = term;
}
}

diff --git a/tag-util.c b/tag-util.c
index 705b7ba..e4e5dda 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -124,12 +124,6 @@ parse_tag_line (void *ctx, char *line,
 }

 /* tok now points to the query string */
-if (hex_decode_inplace (tok) != HEX_SUCCESS) {
-   ret = line_error (TAG_PARSE_INVALID, line_for_error,
- "hex decoding of query %s failed", tok);
-   goto DONE;
-}
-
 *query_string = tok;

   DONE:
diff --git a/test/dump-restore b/test/dump-restore
index 6a989b6..aecc393 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -195,23 +195,22 @@ a

 # the previous line was blank; also no yelling please
 +%zz -- id:whatever
-+e +f id:%yy
++e +f id:"
++e +f tag:abc
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.GA7669 at griffis1.net
-# highlight the sketchy id parsing; this should be last
-+g -- id:foo and bar
 EOF

 cat < EXPECTED
-Warning: unsupported query: a
+Warning: cannot parse query: a
 Warning: no query string [+0]
 Warning: no query string [+a +b]
 Warning: missing query string [+a +b ]
 Warning: no query string after -- [+c +d --]
 Warning: hex decoding of tag %zz failed [+%zz 

[PATCH 3/5] dump: Disallow \n in message IDs

2012-12-25 Thread Austin Clements
When we switch to using regular Xapian queries in the dump format, \n
will cause problems, so we disallow it.  Specially, while Xapian can
quote and parse queries containing \n without difficultly, quoted
queries containing \n still span multiple lines, which breaks the
line-orientedness of the dump format.  Strictly speaking, we could
still round-trip these, but it would significantly complicate restore
as well as scripts that deal with tag dumps.  This complexity would
come at absolutely no benefit: because of the RFC 2822 unfolding
rules, no amount of standards negligence can produce a message with a
message ID containing a line break (not even Outlook can do it!).

Hence, we simply disallow it.
---
 notmuch-dump.c   |9 +
 test/random-corpus.c |2 ++
 2 files changed, 11 insertions(+)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index d2dad40..29d79da 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -132,6 +132,15 @@ notmuch_dump_command (unused (void *ctx), int argc, char 
*argv[])
if (output_format == DUMP_FORMAT_SUP) {
fputs (")\n", output);
} else {
+   if (strchr (message_id, '\n')) {
+   /* This will produce a line break in the output, which
+* would be difficult to handle in tools.  However,
+* it's also impossible to produce an email containing
+* a line break in a message ID because of unfolding,
+* so we can safely disallow it. */
+   fprintf (stderr, "Error: cannot dump message id containing line 
break: %s\n", message_id);
+   return 1;
+   }
if (hex_encode (notmuch, message_id,
, _size) != HEX_SUCCESS) {
fprintf (stderr, "Error: failed to hex-encode msg-id %s\n",
diff --git a/test/random-corpus.c b/test/random-corpus.c
index f354d4b..d0e3e8f 100644
--- a/test/random-corpus.c
+++ b/test/random-corpus.c
@@ -97,6 +97,8 @@ random_utf8_string (void *ctx, size_t char_count)
}

randomchar = random_unichar ();
+   if (randomchar == '\n')
+   randomchar = 'x';

written = g_unichar_to_utf8 (randomchar, buf + offset);

-- 
1.7.10.4



[PATCH 2/5] util: Function to parse boolean term queries

2012-12-25 Thread Austin Clements
This reproduces Xapian's parsing rules for boolean term queries.  This
is provided as a generic string utility, but will be used shortly in
notmuch restore to parse and optimize for ID queries.
---
 util/string-util.c |   63 
 util/string-util.h |   11 +
 2 files changed, 74 insertions(+)

diff --git a/util/string-util.c b/util/string-util.c
index 161a4dd..eaa6c99 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -94,3 +94,66 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,

 return 0;
 }
+
+static int
+consume_double_quote (const char **str)
+{
+if (**str == '"') {
+   ++*str;
+   return 1;
+} else if (strncmp(*str, "\xe2\x80\x9c", 3) == 0 || /* UTF8 0x201c */
+  strncmp(*str, "\xe2\x80\x9d", 3) == 0) { /* UTF8 0x201d */
+   *str += 3;
+   return 3;
+} else {
+   return 0;
+}
+}
+
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out)
+{
+*prefix_out = *term_out = NULL;
+
+/* Parse prefix */
+const char *pos = strchr (str, ':');
+if (! pos)
+   goto FAIL;
+*prefix_out = talloc_strndup (ctx, str, pos - str);
+++pos;
+
+/* Implement Xapian's boolean term de-quoting.  This is a nearly
+ * direct translation of QueryParser::Internal::parse_query. */
+pos = *term_out = talloc_strdup (ctx, pos);
+if (consume_double_quote ()) {
+   char *out = talloc_strdup (ctx, pos);
+   pos = *term_out = out;
+   while (1) {
+   if (! *pos) {
+   /* Premature end of string */
+   goto FAIL;
+   } else if (*pos == '"') {
+   if (*++pos != '"')
+   break;
+   } else if (consume_double_quote ()) {
+   break;
+   }
+   *out++ = *pos++;
+   }
+   if (*pos)
+   goto FAIL;
+   *out = '\0';
+} else {
+   while (*pos > ' ' && *pos != ')')
+   ++pos;
+   if (*pos)
+   goto FAIL;
+}
+return 0;
+
+ FAIL:
+talloc_free (*prefix_out);
+talloc_free (*term_out);
+return 1;
+}
diff --git a/util/string-util.h b/util/string-util.h
index 7475e2c..e4e4c42 100644
--- a/util/string-util.h
+++ b/util/string-util.h
@@ -28,4 +28,15 @@ char *strtok_len (char *s, const char *delim, size_t *len);
 int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
   char **buf, size_t *len);

+/* Parse a boolean term query, returning the prefix in *prefix_out and
+ * the term in *term_out.  *prefix_out and *term_out will be talloc'd
+ * with context ctx.
+ *
+ * Return: 0 on success, non-zero on parse error (including trailing
+ * data in str).
+ */
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out);
+
 #endif
-- 
1.7.10.4



[PATCH 1/5] util: Factor out boolean term quoting routine

2012-12-25 Thread Austin Clements
From: Austin Clements 

This is now a generic boolean term quoting function.  It performs
minimal quoting to produce user-friendly queries.

This could live in tag-util as well, but it is really nothing specific
to tags (although the conventions are specific to Xapian).

The API is changed from "caller-allocates" to "readline-like".  The
scan for max tag length is pushed down into the quoting routine.
Furthermore, this now combines the term prefix with the quoted term;
arguably this is just as easy to do in the caller, but this will
nicely parallel the boolean term parsing function to be introduced
shortly.

This is an amalgamation of code written by David Bremner and myself.
---
 notmuch-tag.c  |   48 
 util/string-util.c |   62 
 util/string-util.h |9 
 3 files changed, 85 insertions(+), 34 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 88d559b..fc9d43a 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */

 #include "notmuch-client.h"
+#include "string-util.h"

 static volatile sig_atomic_t interrupted;

@@ -35,25 +36,6 @@ handle_sigint (unused (int sig))
 interrupted = 1;
 }

-static char *
-_escape_tag (char *buf, const char *tag)
-{
-const char *in = tag;
-char *out = buf;
-
-/* Boolean terms surrounded by double quotes can contain any
- * character.  Double quotes are quoted by doubling them. */
-*out++ = '"';
-while (*in) {
-   if (*in == '"')
-   *out++ = '"';
-   *out++ = *in++;
-}
-*out++ = '"';
-*out = 0;
-return buf;
-}
-
 typedef struct {
 const char *tag;
 notmuch_bool_t remove;
@@ -71,25 +53,16 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
  * parenthesize and the exclusion part of the query must not use
  * the '-' operator (though the NOT operator is fine). */

-char *escaped, *query_string;
+char *escaped = NULL;
+size_t escaped_len = 0;
+char *query_string;
 const char *join = "";
-int i;
-unsigned int max_tag_len = 0;
+size_t i;

 /* Don't optimize if there are no tag changes. */
 if (tag_ops[0].tag == NULL)
return talloc_strdup (ctx, orig_query_string);

-/* Allocate a buffer for escaping tags.  This is large enough to
- * hold a fully escaped tag with every character doubled plus
- * enclosing quotes and a NUL. */
-for (i = 0; tag_ops[i].tag; i++)
-   if (strlen (tag_ops[i].tag) > max_tag_len)
-   max_tag_len = strlen (tag_ops[i].tag);
-escaped = talloc_array (ctx, char, max_tag_len * 2 + 3);
-if (! escaped)
-   return NULL;
-
 /* Build the new query string */
 if (strcmp (orig_query_string, "*") == 0)
query_string = talloc_strdup (ctx, "(");
@@ -97,10 +70,17 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
query_string = talloc_asprintf (ctx, "( %s ) and (", orig_query_string);

 for (i = 0; tag_ops[i].tag && query_string; i++) {
+   /* XXX in case of OOM, query_string will be deallocated when
+* ctx is, which might be at shutdown */
+   if (make_boolean_term (ctx,
+  "tag", tag_ops[i].tag,
+  , _len))
+   return NULL;
+
query_string = talloc_asprintf_append_buffer (
-   query_string, "%s%stag:%s", join,
+   query_string, "%s%s%s", join,
tag_ops[i].remove ? "" : "not ",
-   _escape_tag (escaped, tag_ops[i].tag));
+   escaped);
join = " or ";
 }

diff --git a/util/string-util.c b/util/string-util.c
index 44f8cd3..161a4dd 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -20,6 +20,7 @@


 #include "string-util.h"
+#include "talloc.h"

 char *
 strtok_len (char *s, const char *delim, size_t *len)
@@ -32,3 +33,64 @@ strtok_len (char *s, const char *delim, size_t *len)

 return *len ? s : NULL;
 }
+
+int
+make_boolean_term (void *ctx, const char *prefix, const char *term,
+  char **buf, size_t *len)
+{
+const char *in;
+char *out;
+size_t needed = 3;
+int need_quoting = 0;
+
+/* Do we need quoting? */
+for (in = term; *in && !need_quoting; in++)
+   if (*in <= ' ' || *in == ')' || *in == '"')
+   need_quoting = 1;
+
+if (need_quoting)
+   for (in = term; *in; in++)
+   needed += (*in == '"') ? 2 : 1;
+else
+   needed = strlen (term) + 1;
+
+/* Reserve space for the prefix */
+if (prefix)
+   needed += strlen (prefix) + 1;
+
+if ((*buf == NULL) || (needed > *len)) {
+   *len = 2 * needed;
+   *buf = talloc_realloc (ctx, *buf, char, *len);
+}
+
+if (! *buf)
+   return 1;
+
+out = *buf;
+
+/* Copy in the prefix */
+if (prefix) {
+   strcpy (out, prefix);
+   out += strlen (prefix);
+   *out++ = ':';
+}
+
+if (! 

[PATCH 0/5] Use Xapian query syntax for batch-tag dump/restore

2012-12-25 Thread Austin Clements
This is a stab at tweaking the new batch-tag dump/restore format to be
more amenable to batch tagging.  Currently, the "query" part of each
line isn't really a Xapian query; it's a hex-encoded message ID
prefixed with "id:".  This is fine for the very limited case of
dump/restore, but it extends poorly to the general queries allowed by
batch tagging.  However, Xapian already has a perfectly good quoting
syntax.  This series switches the batch-tag format to use regular
Xapian queries, using only regular Xapian quoting syntax, so that we
don't have to invent yet another quoting syntax for batch tagging.



Re: v2 of valgrind based memory tests

2012-12-25 Thread Austin Clements
On Mon, 24 Dec 2012, da...@tethera.net wrote:
 These obsolete
   
   id:1355196820-29734-1-git-send-email-da...@tethera.net

 I tried to follow the suggestions of 

   id:20121216191121.gh6...@mit.edu

 pretty closely.

LGTM.  In the long run, I wonder if we want to separate the time and
memory tests like this or if some tests would make sense in either mode.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] NEWS for emacs part visibility change

2012-12-25 Thread Austin Clements
On Sun, 23 Dec 2012, Mark Walters markwalters1...@gmail.com wrote:
 ---
 Here is some news for the recent part visibility change. I am not
 completely happy with the wording so please do read. In particular
 should it mention how the user's preferred part is chosen
 (notmuch-multipart/alternative-discouraged etc)?

 Best wishes

 Mark


  NEWS |   10 ++
  1 files changed, 10 insertions(+), 0 deletions(-)

 diff --git a/NEWS b/NEWS
 index 3e48501..d290934 100644
 --- a/NEWS
 +++ b/NEWS
 @@ -63,6 +63,16 @@ Removal of the deprecated `notmuch-folders` variable
has now been removed. Any remaining users should migrate to
`notmuch-saved-searches`.
  
 +Visibility of mime parts can be toggled

MIME?

 +
 +  The visibility of mime parts that can be displayed in the emacs show
 +  buffer can be toggled by activating the part button (either by
 +  pressing RET on it or with a mouse click). In particular, the
 +  alternative parts of a multipart/alternative email can be viewed.
 +  Note: the default has changed and only the user preferred part of a
 +  multipart/alternative email is displayed. To restore the old
 +  behaviour set `notmuch-show-all-multipart/alternative-parts` to t.

  Each part of a multi-part MIME email can now be shown or hidden using
  the button at the top of each part (by pressing RET on it or by
  clicking).  For emails with multiple alternative formats (e.g., plain
  text and HTML), only the preferred format is shown initially, but
  other formats can be shown using their part buttons.  To control the
  behavior of this, see `notmuch-multipart/alternative-discouraged` and
  `notmuch-show-all-multipart/alternative-parts`.

?  I intentionally downplayed
notmuch-show-all-multipart/alternative-parts relative to your
description because I would love for people to forget about that
setting so we can eventually drop it.

 +
  Emacs now buttonizes mid: links
  
mid: links are a standardized way to link to messages by message ID
 -- 
 1.7.9.1

 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] emacs: show: make id links respect window

2012-12-25 Thread Mark Walters

On Mon, 24 Dec 2012, David Bremner da...@tethera.net wrote:
 Mark Walters markwalters1...@gmail.com writes:

 I think this is a bug but that could be debated. It is particularly
 easy to trigger with notmuch pick because that uses the split pane
 while focus usually remains in the `pick' pane rather than the `show'
 pane.

 I can imagine that people would want/like the open in other window 
 effect of the current code, even if the reason is a bug.

That's definitely possible. I generally expect a mouse click to select
the window I click and this feels counter intuitive. I think that some
people might like an option open this link in a new window but I would
guess that would like that whether they clicked or pressed RET on the
button.

'action `(lambda (arg)
   (notmuch-show ,(third link)))

 Can you (or someone) explain why backquote is needed here? Is this some
 kind of workaround for lack of lexical scope? If so, maybe a comment.

I think this is a reasonably standard use: we want (third link)  to be
evaluated when the button is made, but we want the rest of the action to
be evaluated when the action is called. 

Best wishes

Mark


 d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] NEWS for emacs part visibility change

2012-12-25 Thread Mark Walters
Wording suggested by Austin.
---

I have taken Austin's (excellent) wording and agree with the
downplaying of notmuch-show-all-multipart/alternative-parts.

Best wishes

Mark


 NEWS |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/NEWS b/NEWS
index 0de685a..69f21a0 100644
--- a/NEWS
+++ b/NEWS
@@ -70,6 +70,17 @@ Removal of the deprecated `notmuch-folders` variable
   has now been removed. Any remaining users should migrate to
   `notmuch-saved-searches`.
 
+Visibility of MIME parts can be toggled
+
+  Each part of a multi-part MIME email can now be shown or hidden
+  using the button at the top of each part (by pressing RET on it or
+  by clicking).  For emails with multiple alternative formats (e.g.,
+  plain text and HTML), only the preferred format is shown initially,
+  but other formats can be shown using their part buttons.  To control
+  the behavior of this, see
+  `notmuch-multipart/alternative-discouraged` and
+  `notmuch-show-all-multipart/alternative-parts`.
+
 Emacs now buttonizes mid: links
 
   mid: links are a standardized way to link to messages by message ID
-- 
1.7.9.1

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] contrib: pick: close message pane when quitting from show in the message pane

2012-12-25 Thread Mark Walters
We add a hook to the show buffer in the message window to close the
message window when that buffer quits.  It checks that the
message-window is still displaying the show-message buffer and then
closes it.
---
This is just a slightly tidier version of the previous patch. It is
more natural to reuse the nomuch-pick-message-window variable (as it
serves the same pupose) rather than adding a new variable and
cluttering up the name space (and it also keeps the clutter in the
notmuch-pick namespace).

I think this a good candidate for 0.15: it fixes an annoying bug in pick. 

Best wishes

Mark


 contrib/notmuch-pick/notmuch-pick.el |   19 +++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/contrib/notmuch-pick/notmuch-pick.el 
b/contrib/notmuch-pick/notmuch-pick.el
index 1a553d4..7d01f4c 100644
--- a/contrib/notmuch-pick/notmuch-pick.el
+++ b/contrib/notmuch-pick/notmuch-pick.el
@@ -157,6 +157,10 @@
 (make-variable-buffer-local 'notmuch-pick-query-context)
 (defvar notmuch-pick-buffer-name nil)
 (make-variable-buffer-local 'notmuch-pick-buffer-name)
+;; This variable is the window used for the message pane. It is set
+;; in both the parent pick buffer and the child show buffer. It is
+;; used to try and close the message pane when quitting pick or the
+;; child show buffer.
 (defvar notmuch-pick-message-window nil)
 (make-variable-buffer-local 'notmuch-pick-message-window)
 (put 'notmuch-pick-message-window 'permanent-local t)
@@ -324,6 +328,16 @@ Does NOT change the database.
 (notmuch-prettify-subject (notmuch-search-find-subject)))
   (notmuch-pick-show-match-message-with-wait))
 
+(defun notmuch-pick-message-window-kill-hook ()
+  (let ((buffer (current-buffer)))
+(when (and (window-live-p notmuch-pick-message-window)
+  (eq (window-buffer notmuch-pick-message-window) buffer))
+  ;; We do not want an error if this is the sole window in the
+  ;; frame and I do not know how to test for that in emacs pre
+  ;; 24. Hence we just ignore-errors.
+  (ignore-errors
+   (delete-window notmuch-pick-message-window)
+
 (defun notmuch-pick-show-message ()
   Show the current message (in split-pane).
   (interactive)
@@ -341,6 +355,11 @@ Does NOT change the database.
(let ((notmuch-show-indent-messages-width 0))
  (setq current-prefix-arg '(4))
  (setq buffer (notmuch-show id nil nil nil
+  ;; We need the `let' as notmuch-pick-message-window is buffer local.
+  (let ((window notmuch-pick-message-window))
+   (with-current-buffer buffer
+ (setq notmuch-pick-message-window window)
+ (add-hook 'kill-buffer-hook 'notmuch-pick-message-window-kill-hook)))
   (notmuch-pick-tag-update-display (list -unread))
   (setq notmuch-pick-message-buffer buffer
 
-- 
1.7.9.1

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] contrib: pick: close message pane when quitting from show in the message pane

2012-12-25 Thread Tomi Ollila
On Tue, Dec 25 2012, Mark Walters markwalters1...@gmail.com wrote:

 We add a hook to the show buffer in the message window to close the
 message window when that buffer quits.  It checks that the
 message-window is still displaying the show-message buffer and then
 closes it.
 ---
 This is just a slightly tidier version of the previous patch. It is
 more natural to reuse the nomuch-pick-message-window variable (as it
 serves the same pupose) rather than adding a new variable and
 cluttering up the name space (and it also keeps the clutter in the
 notmuch-pick namespace).

 I think this a good candidate for 0.15: it fixes an annoying bug in pick. 

 Best wishes

 Mark


  contrib/notmuch-pick/notmuch-pick.el |   19 +++
  1 files changed, 19 insertions(+), 0 deletions(-)

 diff --git a/contrib/notmuch-pick/notmuch-pick.el 
 b/contrib/notmuch-pick/notmuch-pick.el
 index 1a553d4..7d01f4c 100644
 --- a/contrib/notmuch-pick/notmuch-pick.el
 +++ b/contrib/notmuch-pick/notmuch-pick.el
 @@ -157,6 +157,10 @@
  (make-variable-buffer-local 'notmuch-pick-query-context)
  (defvar notmuch-pick-buffer-name nil)
  (make-variable-buffer-local 'notmuch-pick-buffer-name)
 +;; This variable is the window used for the message pane. It is set
 +;; in both the parent pick buffer and the child show buffer. It is
 +;; used to try and close the message pane when quitting pick or the
 +;; child show buffer.
  (defvar notmuch-pick-message-window nil)
  (make-variable-buffer-local 'notmuch-pick-message-window)
  (put 'notmuch-pick-message-window 'permanent-local t)
 @@ -324,6 +328,16 @@ Does NOT change the database.
  (notmuch-prettify-subject (notmuch-search-find-subject)))
(notmuch-pick-show-match-message-with-wait))
  
 +(defun notmuch-pick-message-window-kill-hook ()
 +  (let ((buffer (current-buffer)))
 +(when (and (window-live-p notmuch-pick-message-window)
 +(eq (window-buffer notmuch-pick-message-window) buffer))
 +  ;; We do not want an error if this is the sole window in the
 +  ;; frame and I do not know how to test for that in emacs pre
 +  ;; 24. Hence we just ignore-errors.
 +  (ignore-errors
 + (delete-window notmuch-pick-message-window)
 +
  (defun notmuch-pick-show-message ()
Show the current message (in split-pane).
(interactive)
 @@ -341,6 +355,11 @@ Does NOT change the database.
   (let ((notmuch-show-indent-messages-width 0))
 (setq current-prefix-arg '(4))
 (setq buffer (notmuch-show id nil nil nil
 +  ;; We need the `let' as notmuch-pick-message-window is buffer local.
 +  (let ((window notmuch-pick-message-window))
 + (with-current-buffer buffer
 +   (setq notmuch-pick-message-window window)
 +   (add-hook 'kill-buffer-hook 'notmuch-pick-message-window-kill-hook)))
(notmuch-pick-tag-update-display (list -unread))
(setq notmuch-pick-message-buffer buffer


LGTM. 

Tomi

PS: (unrelated) Mark: check if notmuch-show-mark-read-tags should be used in
   line:
   (notmuch-pick-tag-update-display (list -unread))

 -- 
 1.7.9.1
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 1/5] util: Factor out boolean term quoting routine

2012-12-25 Thread David Bremner
Austin Clements amdra...@mit.edu writes:

 This could live in tag-util as well, but it is really nothing specific
 to tags (although the conventions are specific to Xapian).

 Furthermore, this now combines the term prefix with the quoted term;
 arguably this is just as easy to do in the caller, but this will
 nicely parallel the boolean term parsing function to be introduced
 shortly.

At first glance, I found this a bit too notmuch-specific to go in
util. On second glance, I found my first reaction somewhat
bizarre. Perhaps at some point we should drop a README file in util
explaining what should go there.

Other than that, LGTM
d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: v2 of valgrind based memory tests

2012-12-25 Thread David Bremner
Austin Clements amdra...@mit.edu writes:

 On Mon, 24 Dec 2012, da...@tethera.net wrote:

 LGTM.  In the long run, I wonder if we want to separate the time and
 memory tests like this or if some tests would make sense in either mode.

pushed. At the moment, the split into time versus memory tests is a
simple way to separate out heavy weight tests, but this could be done
more explicitely.

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 2/5] util: Function to parse boolean term queries

2012-12-25 Thread David Bremner
Austin Clements amdra...@mit.edu writes:

 +if (consume_double_quote (pos)) {
 + char *out = talloc_strdup (ctx, pos);
 + pos = *term_out = out;
 + while (1) {

Overall the control flow here is a bit tricky to follow. I'm not sure if
a real loop condition would help or make it worse.

 + if (! *pos) {
 + /* Premature end of string */
 + goto FAIL;
 + } else if (*pos == '') {
 + if (*++pos != '')
 + break;
 + } else if (consume_double_quote (pos)) {
 + break;
 + }

I'm confused by the asymmetry here. Quoted strings can start with
unicode quotes, but internally can only have ascii ''? Is this
documented somewhere?

 +} else {
 + while (*pos  ' '  *pos != ')')
 + ++pos;
 + if (*pos)
 + goto FAIL;
 +}

So if there is no quote, we skip the part after the ':'? I guess I
probably missed something because that doesn't sound like the intended
behaviour.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 3/5] dump: Disallow \n in message IDs

2012-12-25 Thread David Bremner
Austin Clements amdra...@mit.edu writes:
  
   randomchar = random_unichar ();
 + if (randomchar == '\n')
 + randomchar = 'x';
  

what about something like 

do {
   randomchar = random_unichar ();
}  while (randomchar == '\n');
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 2/5] util: Function to parse boolean term queries

2012-12-25 Thread David Bremner
David Bremner da...@tethera.net writes:

 So if there is no quote, we skip the part after the ':'? I guess I
 probably missed something because that doesn't sound like the intended
 behaviour.

Indeed the following addition to the test shows it works fine in
context. So I guess I just don't follow this control flow very well.

diff --git a/test/dump-restore b/test/dump-restore
index aecc393..f9ae5b3 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -200,6 +200,8 @@ a
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.ga7...@griffis1.net
+# valid id, but warning about missing message
++e id:missing_message_id
 EOF
 
 cat EOF  EXPECTED
@@ -211,6 +213,7 @@ Warning: no query string after -- [+c +d --]
 Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]
 Warning: cannot parse query: id:
 Warning: not an id query: tag:abc
+Warning: cannot apply tags to missing message: missing_message_id
 EOF
 
 test_expect_equal_file EXPECTED OUTPUT
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-25 Thread David Bremner

Overall I agree this is conceptually cleaner. Transforming between
quoting formats is inherently delicate; I suspect there will always be
at least one special case missed.  I did find the dequoting code a bit
baffling, but this is more of an implementation issue.

There is a certain cognitive dissonance in using hex-quoting for tags
and xapian-quoting for message-ids.  Did you think about the
implications of using xapian quoting for tags as well?

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-25 Thread David Bremner
David Bremner da...@tethera.net writes:

 There is a certain cognitive dissonance in using hex-quoting for tags
 and xapian-quoting for message-ids.  Did you think about the
 implications of using xapian quoting for tags as well?


At risk of talking to myself, I guess one loses the space delimited
lists of tags, which is probably pretty handy for scripting.

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-25 Thread David Bremner
Austin Clements amdra...@mit.edu writes:

 ---
  man/man1/notmuch-dump.1 |   11 +++
  1 file changed, 7 insertions(+), 4 deletions(-)


I guess the short description in notmuch-restore.1 needs updating as
well.

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Xapian-quoting based batch-tagging.

2012-12-25 Thread david
This is an alternative version of 

 id:1356313183-9266-1-git-send-email-da...@tethera.net

batch tagging patches rebased on top of 

 id:1356415076-5692-1-git-send-email-amdra...@mit.edu

This mainly consisted of removing 

 [Patch v9 04/17] notmuch-tag: factor out double quoting routine
 (superceded by one of Austin's patches)

 [Patch v9 05/17] util/string-util: add a new string tokenized function
 [Patch v9 06/17] unhex_and_quote: new function to quote hex-decoded queries
 [Patch v9 07/17] notmuch-restore: move query handling for batch
 (uneeded if query is passed verbatim to xapian)

 I also removed two tests, since they are about how we handle
 quoting:

 [Patch v9 13/17] test/tagging: add test for compound queries with batch 
tagging
 [Patch v9 17/17] test/tagging: add test for handling of parenthesized  
tag queries.

A few small fixes were needed to the tests, and a fair amount of
changes to the notmuch-tag man page.

Diffstat (against Austin's series) is as follows

 man/man1/notmuch-tag.1 |   99 -
 notmuch-tag.c  |  169 
 tag-util.c |   87 +--
 tag-util.h |   15 
 test/tagging   |  195 ++
 5 files changed, 480 insertions(+), 85 deletions(-)


___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 01/11] parse_tag_line: use enum for return value.

2012-12-25 Thread david
From: David Bremner brem...@debian.org

This is essentially cosmetic, since success=0 is promised by
the comments in tag-utils.h.
---
 tag-util.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tag-util.c b/tag-util.c
index e4e5dda..ca12b3b 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -40,14 +40,14 @@ parse_tag_line (void *ctx, char *line,
 char *tok = line;
 size_t tok_len = 0;
 char *line_for_error;
-int ret = 0;
+tag_parse_status_t ret = TAG_PARSE_SUCCESS;
 
 chomp_newline (line);
 
 line_for_error = talloc_strdup (ctx, line);
 if (line_for_error == NULL) {
fprintf (stderr, Error: out of memory\n);
-   return -1;
+   return TAG_PARSE_OUT_OF_MEMORY;
 }
 
 /* remove leading space */
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 03/11] notmuch-tag.c: convert to use tag-utils

2012-12-25 Thread david
From: David Bremner brem...@debian.org

Command line parsing is factored out into a function
parse_tag_command_line in tag-utils.c.

There is some duplicated code eliminated in tag_query, and a bunch of
translation from using the bare tag_op structs to using that tag-utils
API.
---
 notmuch-tag.c |   99 -
 tag-util.c|   51 +++--
 tag-util.h|   15 +
 3 files changed, 84 insertions(+), 81 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index fc9d43a..8129912 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */
 
 #include notmuch-client.h
+#include tag-util.h
 #include string-util.h
 
 static volatile sig_atomic_t interrupted;
@@ -36,14 +37,10 @@ handle_sigint (unused (int sig))
 interrupted = 1;
 }
 
-typedef struct {
-const char *tag;
-notmuch_bool_t remove;
-} tag_operation_t;
 
 static char *
 _optimize_tag_query (void *ctx, const char *orig_query_string,
-const tag_operation_t *tag_ops)
+const tag_op_list_t *list)
 {
 /* This is subtler than it looks.  Xapian ignores the '-' operator
  * at the beginning both queries and parenthesized groups and,
@@ -60,7 +57,7 @@ _optimize_tag_query (void *ctx, const char *orig_query_string,
 size_t i;
 
 /* Don't optimize if there are no tag changes. */
-if (tag_ops[0].tag == NULL)
+if (tag_op_list_size (list) == 0)
return talloc_strdup (ctx, orig_query_string);
 
 /* Build the new query string */
@@ -69,17 +66,17 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
 else
query_string = talloc_asprintf (ctx, ( %s ) and (, orig_query_string);
 
-for (i = 0; tag_ops[i].tag  query_string; i++) {
+for (i = 0; i  tag_op_list_size (list)  query_string; i++) {
/* XXX in case of OOM, query_string will be deallocated when
 * ctx is, which might be at shutdown */
if (make_boolean_term (ctx,
-  tag, tag_ops[i].tag,
+  tag, tag_op_list_tag (list, i),
   escaped, escaped_len))
return NULL;
 
query_string = talloc_asprintf_append_buffer (
query_string, %s%s%s, join,
-   tag_ops[i].remove ?  : not ,
+   tag_op_list_isremove (list, i) ?  : not ,
escaped);
join =  or ;
 }
@@ -91,17 +88,15 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
 return query_string;
 }
 
-/* Tag messages matching 'query_string' according to 'tag_ops', which
- * must be an array of tagging operations terminated with an empty
- * element. */
+/* Tag messages matching 'query_string' according to 'tag_ops'
+ */
 static int
 tag_query (void *ctx, notmuch_database_t *notmuch, const char *query_string,
-  tag_operation_t *tag_ops, notmuch_bool_t synchronize_flags)
+  tag_op_list_t *tag_ops, tag_op_flag_t flags)
 {
 notmuch_query_t *query;
 notmuch_messages_t *messages;
 notmuch_message_t *message;
-int i;
 
 /* Optimize the query so it excludes messages that already have
  * the specified set of tags. */
@@ -124,21 +119,7 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const 
char *query_string,
 notmuch_messages_valid (messages)  ! interrupted;
 notmuch_messages_move_to_next (messages)) {
message = notmuch_messages_get (messages);
-
-   notmuch_message_freeze (message);
-
-   for (i = 0; tag_ops[i].tag; i++) {
-   if (tag_ops[i].remove)
-   notmuch_message_remove_tag (message, tag_ops[i].tag);
-   else
-   notmuch_message_add_tag (message, tag_ops[i].tag);
-   }
-
-   notmuch_message_thaw (message);
-
-   if (synchronize_flags)
-   notmuch_message_tags_to_maildir_flags (message);
-
+   tag_op_list_apply (message, tag_ops, flags | TAG_FLAG_PRE_OPTIMIZED);
notmuch_message_destroy (message);
 }
 
@@ -150,15 +131,13 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const 
char *query_string,
 int
 notmuch_tag_command (void *ctx, int argc, char *argv[])
 {
-tag_operation_t *tag_ops;
-int tag_ops_count = 0;
-char *query_string;
+tag_op_list_t *tag_ops = NULL;
+char *query_string = NULL;
 notmuch_config_t *config;
 notmuch_database_t *notmuch;
 struct sigaction action;
-notmuch_bool_t synchronize_flags;
-int i;
-int ret;
+tag_op_flag_t tag_flags = TAG_FLAG_NONE;
+int ret = 0;
 
 /* Setup our handler for SIGINT */
 memset (action, 0, sizeof (struct sigaction));
@@ -167,54 +146,15 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 action.sa_flags = SA_RESTART;
 sigaction (SIGINT, action, NULL);
 
-argc--; argv++; /* skip subcommand argument */
-
-/* Array of tagging operations (add or remove), terminated with an
- * empty element. */
-

[PATCH 02/11] tag-util: factor out rules for illegal tags, use in parse_tag_line

2012-12-25 Thread david
From: David Bremner brem...@debian.org

This will allow us to be consistent between batch tagging and command
line tagging as far as what is an illegal tag.
---
 tag-util.c |   36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/tag-util.c b/tag-util.c
index ca12b3b..0a4fe78 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -31,6 +31,30 @@ line_error (tag_parse_status_t status,
 return status;
 }
 
+/*
+ * Test tags for some forbidden cases.
+ *
+ * return: NULL if OK,
+ *explanatory message otherwise.
+ */
+
+static const char *
+illegal_tag (const char *tag, notmuch_bool_t remove)
+{
+
+if (*tag == '\0'  ! remove)
+   return empty tag forbidden;
+
+/* This disallows adding the non-removable tag - and
+ * enables notmuch tag to take long options more easily.
+ */
+
+if (*tag == '-'  ! remove)
+   return tag starting with '-' forbidden;
+
+return NULL;
+}
+
 tag_parse_status_t
 parse_tag_line (void *ctx, char *line,
tag_op_flag_t flags,
@@ -95,11 +119,13 @@ parse_tag_line (void *ctx, char *line,
remove = (*tok == '-');
tag = tok + 1;
 
-   /* Maybe refuse empty tags. */
-   if (! (flags  TAG_FLAG_BE_GENEROUS)  *tag == '\0') {
-   ret = line_error (TAG_PARSE_INVALID, line_for_error,
- empty tag);
-   goto DONE;
+   /* Maybe refuse illegal tags. */
+   if (! (flags  TAG_FLAG_BE_GENEROUS)) {
+   const char *msg = illegal_tag (tag, remove);
+   if (msg) {
+   ret = line_error (TAG_PARSE_INVALID, line_for_error, msg);
+   goto DONE;
+   }
}
 
/* Decode tag. */
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 05/11] test/tagging: add test for error messages of tag --batch

2012-12-25 Thread david
From: David Bremner brem...@debian.org

This is based on the similar test for notmuch restore, but the parser
in batch tagging mode is less tolerant of a few cases, in particular
those tested by illegal_tag.
---
 test/tagging |   35 +++
 1 file changed, 35 insertions(+)

diff --git a/test/tagging b/test/tagging
index 980ff92..cd16585 100755
--- a/test/tagging
+++ b/test/tagging
@@ -46,6 +46,41 @@ test_expect_equal $output \
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (:\  inbox tag1 unread)
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag1 tag4 unread)
 
+test_begin_subtest '--batch: checking error messages'
+notmuch dump --format=batch-tag  BACKUP
+notmuch tag --batch EOF 2OUTPUT
+# the next line has a space
+ 
+# this line has no tag operations, but this is permitted in batch format.
+a
++0
++a +b
+# trailing whitespace
++a +b 
++c +d --
+# this is a harmless comment, do not yell about it.
+
+# the previous line was blank; also no yelling please
++%zz -- id:whatever
+# the next non-comment line should report an an empty tag error for
+# batch tagging, but not for restore
++ +e -- id:foo
++- -- id:foo
+EOF
+
+cat EOF  EXPECTED
+Warning: no query string [+0]
+Warning: no query string [+a +b]
+Warning: missing query string [+a +b ]
+Warning: no query string after -- [+c +d --]
+Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]
+Warning: empty tag forbidden [+ +e -- id:foo]
+Warning: tag starting with '-' forbidden [+- -- id:foo]
+EOF
+
+notmuch restore --format=batch-tag  BACKUP
+test_expect_equal_file EXPECTED OUTPUT
+
 test_expect_code 1 Empty tag names 'notmuch tag + One'
 
 test_expect_code 1 Tag name beginning with - 'notmuch tag +- One'
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 08/11] test/tagging: add test for exotic message-ids and batch tagging

2012-12-25 Thread david
From: David Bremner brem...@debian.org

The (now fixed) bug that this test revealed is that unquoted
message-ids with whitespace or other control characters in them are
split into several tokens by the Xapian query parser.
---
 test/tagging |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/test/tagging b/test/tagging
index 417112b..1717e72 100755
--- a/test/tagging
+++ b/test/tagging
@@ -198,6 +198,24 @@ notmuch dump --format=batch-tag | sort  OUTPUT
 notmuch restore --format=batch-tag  BACKUP
 test_expect_equal_file EXPECTED OUTPUT
 
+test_begin_subtest '--batch: unicode message-ids'
+
+${TEST_DIRECTORY}/random-corpus --config-path=${NOTMUCH_CONFIG} \
+ --num-messages=100
+
+notmuch dump --format=batch-tag | sed 's/^.* -- /+common_tag -- /' | \
+sort  EXPECTED
+
+notmuch dump --format=batch-tag | sed 's/^.* -- /  -- /' | \
+notmuch restore --format=batch-tag
+
+notmuch tag --batch  EXPECTED
+
+notmuch dump --format=batch-tag| \
+sort  OUTPUT
+
+test_expect_equal_file EXPECTED OUTPUT
+
 test_expect_code 1 Empty tag names 'notmuch tag + One'
 
 test_expect_code 1 Tag name beginning with - 'notmuch tag +- One'
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 09/11] notmuch-tag.1: tidy synopsis formatting, reference

2012-12-25 Thread david
From: David Bremner brem...@debian.org

Consistently use [...]; one less space. Use singular search-term
---
 man/man1/notmuch-tag.1 |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/man/man1/notmuch-tag.1 b/man/man1/notmuch-tag.1
index 0f86582..9444aa4 100644
--- a/man/man1/notmuch-tag.1
+++ b/man/man1/notmuch-tag.1
@@ -4,20 +4,21 @@ notmuch-tag \- add/remove tags for all messages matching the 
search terms
 
 .SH SYNOPSIS
 .B notmuch tag
-.RI  + tag |\- tag  [...] [\-\-]  search-term ...
+.RI + tag |\- tag  [...] [\-\-]  search-term  [...]
 
 .SH DESCRIPTION
 
 Add/remove tags for all messages matching the search terms.
 
 See \fBnotmuch-search-terms\fR(7)
-for details of the supported syntax for search-terms.
+for details of the supported syntax for
+.RI  search-term .
 
 Tags prefixed by '+' are added while those prefixed by '\-' are
 removed. For each message, tag removal is performed before tag
 addition.
 
-The beginning of search-terms is recognized by the first
+The beginning of the search terms is recognized by the first
 argument that begins with neither '+' nor '\-'. Support for
 an initial search term beginning with '+' or '\-' is provided
 by allowing the user to specify a \-\- argument to separate
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 04/11] cli: add support for batch tagging operations to notmuch tag

2012-12-25 Thread david
From: Jani Nikula j...@nikula.org

Add support for batch tagging operations through stdin to notmuch
tag. This can be enabled with the new --batch command line option to
notmuch tag. The input must consist of lines of the format:

+tag|-tag [...] [--] query [...]

Each line is interpreted similarly to notmuch tag command line
arguments. The delimiter is one or more spaces ' '. Any characters in
tag MAP be hex encoded with %NN where NN is the
hexadecimal value of the character. Any ' ' and '%' characters in
tag and search-term MUST be hex encoded (using %20 and %25,
respectively). Any characters that are not part of tag or
MUST NOT be hex encoded.

query is passed verbatim to Xapian

Leading and trailing space ' ' is ignored. Empty lines and lines
beginning with '#' are ignored.

Signed-off-by: Jani Nikula j...@nikula.org

Hacked-like-crazy-by: David Bremner da...@tethera.net
---
 notmuch-tag.c |   94 -
 1 file changed, 86 insertions(+), 8 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 8129912..7fc614d 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -128,6 +128,46 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const 
char *query_string,
 return interrupted;
 }
 
+static int
+tag_file (void *ctx, notmuch_database_t *notmuch, tag_op_flag_t flags,
+ FILE *input)
+{
+char *line = NULL;
+char *query_string = NULL;
+size_t line_size = 0;
+ssize_t line_len;
+int ret = 0;
+tag_op_list_t *tag_ops;
+
+tag_ops = tag_op_list_create (ctx);
+if (tag_ops == NULL) {
+   fprintf (stderr, Out of memory.\n);
+   return 1;
+}
+
+while ((line_len = getline (line, line_size, input)) != -1 
+  ! interrupted) {
+
+   ret = parse_tag_line (ctx, line, TAG_FLAG_NONE,
+ query_string, tag_ops);
+
+   if (ret  0)
+   continue;
+
+   if (ret  0)
+   break;
+
+   ret = tag_query (ctx, notmuch, query_string, tag_ops, flags);
+   if (ret)
+   break;
+}
+
+if (line)
+   free (line);
+
+return ret;
+}
+
 int
 notmuch_tag_command (void *ctx, int argc, char *argv[])
 {
@@ -137,6 +177,10 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 notmuch_database_t *notmuch;
 struct sigaction action;
 tag_op_flag_t tag_flags = TAG_FLAG_NONE;
+notmuch_bool_t batch = FALSE;
+FILE *input = stdin;
+char *input_file_name = NULL;
+int opt_index;
 int ret = 0;
 
 /* Setup our handler for SIGINT */
@@ -146,15 +190,43 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 action.sa_flags = SA_RESTART;
 sigaction (SIGINT, action, NULL);
 
-tag_ops = tag_op_list_create (ctx);
-if (tag_ops == NULL) {
-   fprintf (stderr, Out of memory.\n);
+notmuch_opt_desc_t options[] = {
+   { NOTMUCH_OPT_BOOLEAN, batch, batch, 0, 0 },
+   { NOTMUCH_OPT_STRING, input_file_name, input, 'i', 0 },
+   { 0, 0, 0, 0, 0 }
+};
+
+opt_index = parse_arguments (argc, argv, options, 1);
+if (opt_index  0)
return 1;
+
+if (input_file_name) {
+   batch = TRUE;
+   input = fopen (input_file_name, r);
+   if (input == NULL) {
+   fprintf (stderr, Error opening %s for reading: %s\n,
+input_file_name, strerror (errno));
+   return 1;
+   }
 }
 
-if (parse_tag_command_line (ctx, argc - 1, argv + 1,
-   query_string, tag_ops))
-   return 1;
+if (batch) {
+   if (opt_index != argc) {
+   fprintf (stderr, Can't specify both cmdline and stdin!\n);
+   return 1;
+   }
+} else {
+
+   tag_ops = tag_op_list_create (ctx);
+   if (tag_ops == NULL) {
+   fprintf (stderr, Out of memory.\n);
+   return 1;
+   }
+
+   if (parse_tag_command_line (ctx, argc - opt_index, argv + opt_index,
+   query_string, tag_ops))
+   return 1;
+}
 
 config = notmuch_config_open (ctx, NULL, NULL);
 if (config == NULL)
@@ -167,9 +239,15 @@ notmuch_tag_command (void *ctx, int argc, char *argv[])
 if (notmuch_config_get_maildir_synchronize_flags (config))
tag_flags |= TAG_FLAG_MAILDIR_SYNC;
 
-ret = tag_query (ctx, notmuch, query_string, tag_ops, tag_flags);
+if (batch)
+   ret = tag_file (ctx, notmuch, tag_flags, input);
+else
+   ret = tag_query (ctx, notmuch, query_string, tag_ops, tag_flags);
 
 notmuch_database_destroy (notmuch);
 
-return ret;
+if (input != stdin)
+   fclose (input);
+
+return ret || interrupted;
 }
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 06/11] test/tagging: add basic tests for batch tagging functionality

2012-12-25 Thread david
From: David Bremner brem...@debian.org

This tests argument parsing, blank lines and comments, and basic hex
decoding functionality.
---
 test/tagging |   51 +++
 1 file changed, 51 insertions(+)

diff --git a/test/tagging b/test/tagging
index cd16585..405ad7c 100755
--- a/test/tagging
+++ b/test/tagging
@@ -46,6 +46,57 @@ test_expect_equal $output \
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (:\  inbox tag1 unread)
 thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag1 tag4 unread)
 
+test_begin_subtest --batch
+notmuch tag --batch EOF
+# %20 is a space in tag
+-:%20 -tag1 +tag5 +tag6 -- One
++tag1 -tag1 -tag4 +tag4 -- Two
+-tag6 One
++tag5 Two
+EOF
+output=$(notmuch search \* | notmuch_search_sanitize)
+test_expect_equal $output \
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (inbox tag5 unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag4 tag5 unread)
+
+# generate a common input file for the next several tests.
+cat  batch.in  EOF
+# %40 is an @ in tag
++%40 -tag5 +tag6 -- One
++tag1 -tag1 -tag4 +tag4 -- Two
+-tag5 +tag6 Two
+EOF
+
+cat  batch.expected EOF
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; One (@ inbox tag6 unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Two (inbox tag4 tag6 unread)
+EOF
+
+test_begin_subtest --input
+notmuch dump --format=batch-tag  backup.tags
+notmuch tag --input=batch.in
+notmuch search \* | notmuch_search_sanitize  OUTPUT
+notmuch restore --format=batch-tag  backup.tags
+test_expect_equal_file batch.expected OUTPUT
+
+test_begin_subtest --batch --input
+notmuch dump --format=batch-tag  backup.tags
+notmuch tag --batch --input=batch.in
+notmuch search \* | notmuch_search_sanitize  OUTPUT
+notmuch restore --format=batch-tag  backup.tags
+test_expect_equal_file batch.expected OUTPUT
+
+test_begin_subtest --batch, blank lines and comments
+notmuch dump | sort  EXPECTED
+notmuch tag --batch EOF
+# this line is a comment; the next has only white space
+
+
+# the previous line is empty
+EOF
+notmuch dump | sort  OUTPUT
+test_expect_equal_file EXPECTED OUTPUT
+
 test_begin_subtest '--batch: checking error messages'
 notmuch dump --format=batch-tag  BACKUP
 notmuch tag --batch EOF 2OUTPUT
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 11/11] test/tagging: add test for naked punctuation in tags; compare with quoting spaces.

2012-12-25 Thread david
From: David Bremner brem...@debian.org

This test also serves as documentation of the quoting
requirements. The comment lines are so that it exactly matches the man
page. Nothing more embarrassing than having an example in the man page
fail.
---
 test/tagging |   25 +
 1 file changed, 25 insertions(+)

diff --git a/test/tagging b/test/tagging
index 1717e72..1f5632c 100755
--- a/test/tagging
+++ b/test/tagging
@@ -198,6 +198,31 @@ notmuch dump --format=batch-tag | sort  OUTPUT
 notmuch restore --format=batch-tag  BACKUP
 test_expect_equal_file EXPECTED OUTPUT
 
+test_begin_subtest --batch: only space and % needs to be encoded.
+notmuch dump --format=batch-tag  BACKUP
+
+notmuch tag --batch EOF
++winner *
++foo::bar%25 -- (One and Two) or (One and tag:winner)
++found::it -- tag:foo::bar%
+# ignore this line and the next
+
++space%20in%20tags -- Two
+# add tag '(tags)', among other stunts.
++crazy{ +(tags) +are +#possible\ -- tag:space in tags
++match*crazy -- tag:crazy{
++some_tag -- id:this is nauty)
+EOF
+
+cat EOF  EXPECTED
++%23possible%5c +%26are +%28tags%29 +crazy%7b +inbox +match%2acrazy 
+space%20in%20tags +tag4 +tag5 +unread +winner -- id:msg-002@notmuch-test-suite
++foo%3a%3abar%25 +found%3a%3ait +inbox +tag5 +unread +winner -- 
id:msg-001@notmuch-test-suite
+EOF
+
+notmuch dump --format=batch-tag | sort  OUTPUT
+notmuch restore --format=batch-tag  BACKUP
+test_expect_equal_file EXPECTED OUTPUT
+
 test_begin_subtest '--batch: unicode message-ids'
 
 ${TEST_DIRECTORY}/random-corpus --config-path=${NOTMUCH_CONFIG} \
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 10/11] man: document notmuch tag --batch, --input options

2012-12-25 Thread david
From: Jani Nikula j...@nikula.org

---
 man/man1/notmuch-tag.1 |   92 
 1 file changed, 92 insertions(+)

diff --git a/man/man1/notmuch-tag.1 b/man/man1/notmuch-tag.1
index 9444aa4..3aa2fa5 100644
--- a/man/man1/notmuch-tag.1
+++ b/man/man1/notmuch-tag.1
@@ -6,6 +6,11 @@ notmuch-tag \- add/remove tags for all messages matching the 
search terms
 .B notmuch tag
 .RI + tag |\- tag  [...] [\-\-]  search-term  [...]
 
+.B notmuch tag
+.RI --batch
+.RI [ --input= filename  ]
+
+
 .SH DESCRIPTION
 
 Add/remove tags for all messages matching the search terms.
@@ -30,6 +35,93 @@ updates the maildir flags according to tag changes if the
 configuration option is enabled. See \fBnotmuch-config\fR(1) for
 details.
 
+Supported options for
+.B tag
+include
+.RS 4
+.TP 4
+.BR \-\-batch
+
+Read batch tagging operations from a file (stdin by default). This is more
+efficient than repeated
+.B notmuch tag
+invocations. See
+.B TAG FILE FORMAT
+below for the input format. This option is not compatible with
+specifying tagging on the command line.
+.RE
+
+.RS 4
+.TP 4
+.BR \-\-input= filename
+
+Read input from given file, instead of from stdin. Implies
+.BR --batch .
+
+.SH TAG FILE FORMAT
+
+The input must consist of lines of the format:
+
+.RI + tag |\- tag  [...] [\-\-]  query 
+
+Each line is interpreted similarly to
+.B notmuch tag
+command line arguments. The delimiter is one or more spaces ' '. Any
+characters in
+.RI  tag 
+.B may
+be hex-encoded with %NN where NN is the hexadecimal value of the
+character. To hex-encode a character with a multi-byte UTF-8 encoding,
+hex-encode each byte.
+Any spaces in tag
+.B must
+be hex-encoded as %20. Any characters that are not
+part of
+.RI   tag 
+.B must not
+be hex-encoded.
+
+In the future tag:tag with spaces style quoting may be supported for
+.RI  tag 
+as well;
+for this reason all double quote characters in
+.RI  tag 
+.B should
+be hex-encoded.
+
+The
+.RI  query 
+should be quoted using Xapian boolean term quoting rules: if a term
+contains whitespace or a close paren or starts with a double quote, it
+must be enclosed in double quotes (not including any prefix) and
+double quotes inside the term must be doubled (see below for
+examples).
+
+Leading and trailing space ' ' is ignored. Empty lines and lines
+beginning with '#' are ignored.
+
+.SS EXAMPLE
+
+The following shows a valid input to batch tagging. Note that only the
+isolated '*' acts as a wildcard. Also note the two different quotings
+of the tag
+.B space in tags
+.
+.RS
+.nf
++winner *
++foo::bar%25 -- (One and Two) or (One and tag:winner)
++found::it -- tag:foo::bar%
+# ignore this line and the next
+
++space%20in%20tags -- Two
+# add tag '(tags)', among other stunts.
++crazy{ +(tags) +are +#possible\ -- tag:space in tags
++match*crazy -- tag:crazy{
++some_tag -- id:this is nauty)
+.fi
+.RE
+
 .SH SEE ALSO
 
 \fBnotmuch\fR(1), \fBnotmuch-config\fR(1), \fBnotmuch-count\fR(1),
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] NEWS for emacs part visibility change

2012-12-25 Thread David Bremner
Mark Walters markwalters1...@gmail.com writes:

 Wording suggested by Austin.
 ---

pushed.

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] emacs: tweak error buffer handling

2012-12-25 Thread Tomi Ollila
On Sat, Dec 22 2012, Mark Walters markwalters1...@gmail.com wrote:

 view-mode-enter changed between emacs 23 and emacs 24: the current
 code makes the error buffer disappear in emacs 24 on quitting it (ie
 pressing q) but this just kills the buffer without closing the split
 window in emacs 23.

 This patch makes the error buffer window disappear in emacs 23
 too. Since the view-mode-enter function changed we have to test for
 version and do the correct thing in each case.
 ---

 This seems to work but I have only tested on 23.4 and 24.2

I run emacs 23.1.1 to get the documentation of view-mode-enter
there. So, this patch instructs to delete WINDOW when exiting
view mode...

Documentation of pop-to-buffer says:

Select buffer BUFFER-OR-NAME in some window, preferably a different one.

What if pop-up-windows's value is nil -- the content of current window
is replaced with this view stuff -- and when exiting view mode, the
window will be deleted ? What happens with emacs 24 in this case ?

Tomi

 Best wishes

 Mark



  emacs/notmuch-lib.el |8 +---
  1 files changed, 5 insertions(+), 3 deletions(-)

 diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
 index 77a591d..0407f8a 100644
 --- a/emacs/notmuch-lib.el
 +++ b/emacs/notmuch-lib.el
 @@ -324,15 +324,17 @@ the user dismisses it.
  
(let ((buf (get-buffer-create *Notmuch errors*)))
  (with-current-buffer buf
 -  (view-mode-enter nil #'kill-buffer)
 +  (pop-to-buffer buf)
 +  (view-mode-enter (when ( emacs-major-version 24)
 +(cons (selected-window) (cons nil t)))
 +#'kill-buffer)
(let ((inhibit-read-only t))
   (goto-char (point-max))
   (unless (bobp)
 (insert \n))
   (insert msg)
   (unless (bolp)
 -   (insert \n
 -(pop-to-buffer buf)))
 +   (insert \n))
  
  (defun notmuch-check-async-exit-status (proc msg)
If PROC exited abnormally, pop up an error buffer and signal an error.
 -- 
 1.7.9.1

 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 2/5] util: Function to parse boolean term queries

2012-12-25 Thread Austin Clements
Quoth David Bremner on Dec 25 at 10:18 am:
 Austin Clements amdra...@mit.edu writes:
 
  +if (consume_double_quote (pos)) {
  +   char *out = talloc_strdup (ctx, pos);
  +   pos = *term_out = out;
  +   while (1) {
 
 Overall the control flow here is a bit tricky to follow. I'm not sure if
 a real loop condition would help or make it worse.
 
  +   if (! *pos) {
  +   /* Premature end of string */
  +   goto FAIL;
  +   } else if (*pos == '') {
  +   if (*++pos != '')
  +   break;
  +   } else if (consume_double_quote (pos)) {
  +   break;
  +   }
 
 I'm confused by the asymmetry here. Quoted strings can start with
 unicode quotes, but internally can only have ascii ''? Is this
 documented somewhere?

Only in the source, to my knowledge.  Here's how Xapian parses these
things (where 'it' is a UTF8 string iterator):

if (it != end  is_double_quote(*it)) {
// Quoted boolean term (can contain any character).
++it;
while (it != end) {
if (*it == '') {
// Interpret  as an escaped .
if (++it == end || *it != '')
break;
} else if (is_double_quote(*it)) {
++it;
break;
}
Unicode::append_utf8(name, *it++);
}
} else {
// Can't boolean filter prefix a subexpression, so
// just use anything following the prefix until the
// next space or ')' as part of the boolean filter
// term.
while (it != end  *it  ' '  *it != ')')
Unicode::append_utf8(name, *it++);
}

  +} else {
  +   while (*pos  ' '  *pos != ')')
  +   ++pos;
  +   if (*pos)
  +   goto FAIL;
  +}
 
 So if there is no quote, we skip the part after the ':'? I guess I
 probably missed something because that doesn't sound like the intended
 behaviour.

This isn't skipping it; it's checking its well-formedness.  In this
case, *term_out already points to a correct string that can be used
literally; we just have to check that there's no trailing garbage
after the boolean query.

This is certainly worth commenting.

For the record, I also tried passing the query straight to the
library, without parsing it in the CLI (and simply checking that the
query returned exactly one result), and it was noticeably slower (the
restore performance test took between 3.82 and 5.25 seconds for the
code in this series and ~7.2 seconds using a general query.)
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] contrib: pick: close message pane when quitting from show in the message pane

2012-12-25 Thread David Bremner
Mark Walters markwalters1...@gmail.com writes:

 We add a hook to the show buffer in the message window to close the
 message window when that buffer quits.  It checks that the
 message-window is still displaying the show-message buffer and then
 closes it.

pushed.

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 2/5] util: Function to parse boolean term queries

2012-12-25 Thread Austin Clements
This parses the subset of Xapian's boolean term quoting rules that are
used by make_boolean_term.  This is provided as a generic string
utility, but will be used shortly in notmuch restore to parse and
optimize for ID queries.
---
 util/string-util.c |   51 +++
 util/string-util.h |   11 +++
 2 files changed, 62 insertions(+)

diff --git a/util/string-util.c b/util/string-util.c
index e4bea21..db01b4b 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -96,3 +96,54 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,
 
 return 0;
 }
+
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out)
+{
+*prefix_out = *term_out = NULL;
+
+/* Parse prefix */
+const char *pos = strchr (str, ':');
+if (! pos)
+   goto FAIL;
+*prefix_out = talloc_strndup (ctx, str, pos - str);
+++pos;
+
+/* Implement de-quoting compatible with make_boolean_term. */
+if (*pos == '') {
+   char *out = talloc_strdup (ctx, pos + 1);
+   int closed = 0;
+   /* Find the closing quote and un-double doubled internal
+* quotes. */
+   for (pos = *term_out = out; *pos; ) {
+   if (*pos == '') {
+   ++pos;
+   if (*pos != '') {
+   /* Found the closing quote. */
+   closed = 1;
+   break;
+   }
+   }
+   *out++ = *pos++;
+   }
+   /* Did the term terminate without a closing quote or is there
+* trailing text after the closing quote? */
+   if (!closed || *pos)
+   goto FAIL;
+   *out = '\0';
+} else {
+   *term_out = talloc_strdup (ctx, pos);
+   /* Check for text after the boolean term. */
+   while (*pos  ' '  *pos != ')')
+   ++pos;
+   if (*pos)
+   goto FAIL;
+}
+return 0;
+
+ FAIL:
+talloc_free (*prefix_out);
+talloc_free (*term_out);
+return 1;
+}
diff --git a/util/string-util.h b/util/string-util.h
index 7475e2c..aff2d65 100644
--- a/util/string-util.h
+++ b/util/string-util.h
@@ -28,4 +28,15 @@ char *strtok_len (char *s, const char *delim, size_t *len);
 int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
   char **buf, size_t *len);
 
+/* Parse a boolean term query produced by make_boolean_term, returning
+ * the prefix in *prefix_out and the term in *term_out.  *prefix_out
+ * and *term_out will be talloc'd with context ctx.
+ *
+ * Return: 0 on success, non-zero on parse error (including trailing
+ * data in str).
+ */
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out);
+
 #endif
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 5/5] man: Update notmuch-dump(1) and notmuch-restore(1)

2012-12-25 Thread Austin Clements
Describe the new batch-tag format.  For notmuch-restore, rather than
half-heartedly duplicating the description, we now cite notmuch-dump.
---
 man/man1/notmuch-dump.1|   11 +++
 man/man1/notmuch-restore.1 |6 ++
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/man/man1/notmuch-dump.1 b/man/man1/notmuch-dump.1
index 770b00f..7bd6def 100644
--- a/man/man1/notmuch-dump.1
+++ b/man/man1/notmuch-dump.1
@@ -64,13 +64,16 @@ and tags containing whitespace or non-\fBascii\fR(7) 
characters.
 Each line has the form
 
 .RS 4
-.RI + encoded-tag+ encoded-tag  ... --id: 
encoded-message-id 
+.RI + encoded-tag+ encoded-tag  ... --id: 
quoted-message-id 
 
-where encoded means that every byte not matching the regex
+Tags are hex-encoded by replacing every byte not matching the regex
 .B [A-Za-z0-9@=.,_+-]
-is replace by
+with
 .B %nn
-where nn is the two digit hex encoding.
+where nn is the two digit hex encoding.  The message ID is a valid Xapian
+query, quoted using Xapian boolean term quoting rules: if the ID contains
+whitespace or a close paren or starts with a double quote, it must be
+enclosed in double quotes and double quotes inside the ID must be doubled.
 The astute reader will notice this is a special case of the batch input
 format for \fBnotmuch-tag\fR(1); note that the single message-id query is
 mandatory for \fBnotmuch-restore\fR(1).
diff --git a/man/man1/notmuch-restore.1 b/man/man1/notmuch-restore.1
index 6bba628..78fef52 100644
--- a/man/man1/notmuch-restore.1
+++ b/man/man1/notmuch-restore.1
@@ -57,10 +57,8 @@ sup calls them).
 The
 .B batch-tag
 dump format is intended to more robust against malformed message-ids
-and tags containing whitespace or non-\fBascii\fR(7) characters.  This
-format hex-escapes all characters those outside of a small character
-set, intended to be suitable for e.g. pathnames in most UNIX-like
-systems.
+and tags containing whitespace or non-\fBascii\fR(7) characters.  See
+\fBnotmuch-dump\fR(1) for details on this format.
 
 .B notmuch restore
 updates the maildir flags according to tag changes if the
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 1/5] util: Factor out boolean term quoting routine

2012-12-25 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This is now a generic boolean term quoting function.  It performs
minimal quoting to produce user-friendly queries.

This could live in tag-util as well, but it is really nothing specific
to tags (although the conventions are specific to Xapian).

The API is changed from caller-allocates to readline-like.  The
scan for max tag length is pushed down into the quoting routine.
Furthermore, this now combines the term prefix with the quoted term;
arguably this is just as easy to do in the caller, but this will
nicely parallel the boolean term parsing function to be introduced
shortly.

This is an amalgamation of code written by David Bremner and myself.
---
 notmuch-tag.c  |   48 ---
 util/string-util.c |   64 
 util/string-util.h |9 
 3 files changed, 87 insertions(+), 34 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 88d559b..fc9d43a 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */
 
 #include notmuch-client.h
+#include string-util.h
 
 static volatile sig_atomic_t interrupted;
 
@@ -35,25 +36,6 @@ handle_sigint (unused (int sig))
 interrupted = 1;
 }
 
-static char *
-_escape_tag (char *buf, const char *tag)
-{
-const char *in = tag;
-char *out = buf;
-
-/* Boolean terms surrounded by double quotes can contain any
- * character.  Double quotes are quoted by doubling them. */
-*out++ = '';
-while (*in) {
-   if (*in == '')
-   *out++ = '';
-   *out++ = *in++;
-}
-*out++ = '';
-*out = 0;
-return buf;
-}
-
 typedef struct {
 const char *tag;
 notmuch_bool_t remove;
@@ -71,25 +53,16 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
  * parenthesize and the exclusion part of the query must not use
  * the '-' operator (though the NOT operator is fine). */
 
-char *escaped, *query_string;
+char *escaped = NULL;
+size_t escaped_len = 0;
+char *query_string;
 const char *join = ;
-int i;
-unsigned int max_tag_len = 0;
+size_t i;
 
 /* Don't optimize if there are no tag changes. */
 if (tag_ops[0].tag == NULL)
return talloc_strdup (ctx, orig_query_string);
 
-/* Allocate a buffer for escaping tags.  This is large enough to
- * hold a fully escaped tag with every character doubled plus
- * enclosing quotes and a NUL. */
-for (i = 0; tag_ops[i].tag; i++)
-   if (strlen (tag_ops[i].tag)  max_tag_len)
-   max_tag_len = strlen (tag_ops[i].tag);
-escaped = talloc_array (ctx, char, max_tag_len * 2 + 3);
-if (! escaped)
-   return NULL;
-
 /* Build the new query string */
 if (strcmp (orig_query_string, *) == 0)
query_string = talloc_strdup (ctx, ();
@@ -97,10 +70,17 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
query_string = talloc_asprintf (ctx, ( %s ) and (, orig_query_string);
 
 for (i = 0; tag_ops[i].tag  query_string; i++) {
+   /* XXX in case of OOM, query_string will be deallocated when
+* ctx is, which might be at shutdown */
+   if (make_boolean_term (ctx,
+  tag, tag_ops[i].tag,
+  escaped, escaped_len))
+   return NULL;
+
query_string = talloc_asprintf_append_buffer (
-   query_string, %s%stag:%s, join,
+   query_string, %s%s%s, join,
tag_ops[i].remove ?  : not ,
-   _escape_tag (escaped, tag_ops[i].tag));
+   escaped);
join =  or ;
 }
 
diff --git a/util/string-util.c b/util/string-util.c
index 44f8cd3..e4bea21 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -20,6 +20,7 @@
 
 
 #include string-util.h
+#include talloc.h
 
 char *
 strtok_len (char *s, const char *delim, size_t *len)
@@ -32,3 +33,66 @@ strtok_len (char *s, const char *delim, size_t *len)
 
 return *len ? s : NULL;
 }
+
+int
+make_boolean_term (void *ctx, const char *prefix, const char *term,
+  char **buf, size_t *len)
+{
+const char *in;
+char *out;
+size_t needed = 3;
+int need_quoting = 0;
+
+/* Do we need quoting?  To be paranoid, we quote anything
+ * containing a quote, even though it only matters at the
+ * beginning, and anything containing non-ASCII text. */
+for (in = term; *in  !need_quoting; in++)
+   if (*in = ' ' || *in == ')' || *in == '' || (unsigned char)*in  127)
+   need_quoting = 1;
+
+if (need_quoting)
+   for (in = term; *in; in++)
+   needed += (*in == '') ? 2 : 1;
+else
+   needed = strlen (term) + 1;
+
+/* Reserve space for the prefix */
+if (prefix)
+   needed += strlen (prefix) + 1;
+
+if ((*buf == NULL) || (needed  *len)) {
+   *len = 2 * needed;
+   *buf = talloc_realloc (ctx, *buf, char, *len);
+}
+
+if (! *buf)
+   return 1;
+
+out 

[PATCH v2 0/5] Use Xapian query syntax for batch-tag dump/restore

2012-12-25 Thread Austin Clements
This obsoletes

  id:1356415076-5692-1-git-send-email-amdra...@mit.edu

In addition to incorporating all of David's suggestions, this reworks
the boolean term parsing so it only handles the subset of quoting
syntax used by make_boolean_term (which also happens to be all that we
described in the man page for the format).  The diff from v1 is below.

diff --git a/man/man1/notmuch-restore.1 b/man/man1/notmuch-restore.1
index 6bba628..78fef52 100644
--- a/man/man1/notmuch-restore.1
+++ b/man/man1/notmuch-restore.1
@@ -57,10 +57,8 @@ sup calls them).
 The
 .B batch-tag
 dump format is intended to more robust against malformed message-ids
-and tags containing whitespace or non-\fBascii\fR(7) characters.  This
-format hex-escapes all characters those outside of a small character
-set, intended to be suitable for e.g. pathnames in most UNIX-like
-systems.
+and tags containing whitespace or non-\fBascii\fR(7) characters.  See
+\fBnotmuch-dump\fR(1) for details on this format.
 
 .B notmuch restore
 updates the maildir flags according to tag changes if the
diff --git a/test/dump-restore b/test/dump-restore
index aecc393..f9ae5b3 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -200,6 +200,8 @@ a
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.ga7...@griffis1.net
+# valid id, but warning about missing message
++e id:missing_message_id
 EOF
 
 cat EOF  EXPECTED
@@ -211,6 +213,7 @@ Warning: no query string after -- [+c +d --]
 Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]
 Warning: cannot parse query: id:
 Warning: not an id query: tag:abc
+Warning: cannot apply tags to missing message: missing_message_id
 EOF
 
 test_expect_equal_file EXPECTED OUTPUT
diff --git a/test/random-corpus.c b/test/random-corpus.c
index d0e3e8f..8b7748e 100644
--- a/test/random-corpus.c
+++ b/test/random-corpus.c
@@ -96,9 +96,9 @@ random_utf8_string (void *ctx, size_t char_count)
buf = talloc_realloc (ctx, buf, gchar, buf_size);
}
 
-   randomchar = random_unichar ();
-   if (randomchar == '\n')
-   randomchar = 'x';
+   do {
+   randomchar = random_unichar ();
+   } while (randomchar == '\n');
 
written = g_unichar_to_utf8 (randomchar, buf + offset);
 
diff --git a/util/string-util.c b/util/string-util.c
index eaa6c99..db01b4b 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -43,9 +43,11 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,
 size_t needed = 3;
 int need_quoting = 0;
 
-/* Do we need quoting? */
+/* Do we need quoting?  To be paranoid, we quote anything
+ * containing a quote, even though it only matters at the
+ * beginning, and anything containing non-ASCII text. */
 for (in = term; *in  !need_quoting; in++)
-   if (*in = ' ' || *in == ')' || *in == '')
+   if (*in = ' ' || *in == ')' || *in == '' || (unsigned char)*in  127)
need_quoting = 1;
 
 if (need_quoting)
@@ -95,21 +97,6 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,
 return 0;
 }
 
-static int
-consume_double_quote (const char **str)
-{
-if (**str == '') {
-   ++*str;
-   return 1;
-} else if (strncmp(*str, \xe2\x80\x9c, 3) == 0 || /* UTF8 0x201c */
-  strncmp(*str, \xe2\x80\x9d, 3) == 0) { /* UTF8 0x201d */
-   *str += 3;
-   return 3;
-} else {
-   return 0;
-}
-}
-
 int
 parse_boolean_term (void *ctx, const char *str,
char **prefix_out, char **term_out)
@@ -123,28 +110,31 @@ parse_boolean_term (void *ctx, const char *str,
 *prefix_out = talloc_strndup (ctx, str, pos - str);
 ++pos;
 
-/* Implement Xapian's boolean term de-quoting.  This is a nearly
- * direct translation of QueryParser::Internal::parse_query. */
-pos = *term_out = talloc_strdup (ctx, pos);
-if (consume_double_quote (pos)) {
-   char *out = talloc_strdup (ctx, pos);
-   pos = *term_out = out;
-   while (1) {
-   if (! *pos) {
-   /* Premature end of string */
-   goto FAIL;
-   } else if (*pos == '') {
-   if (*++pos != '')
+/* Implement de-quoting compatible with make_boolean_term. */
+if (*pos == '') {
+   char *out = talloc_strdup (ctx, pos + 1);
+   int closed = 0;
+   /* Find the closing quote and un-double doubled internal
+* quotes. */
+   for (pos = *term_out = out; *pos; ) {
+   if (*pos == '') {
+   ++pos;
+   if (*pos != '') {
+   /* Found the closing quote. */
+   closed = 1;
break;
-   } else if (consume_double_quote (pos)) {
-   break;
+   }
}
*out++ = *pos++;
}
-   if (*pos)
+   /* Did the term terminate without a closing quote or is there
+* trailing text 

[PATCH v2 4/5] dump/restore: Use Xapian queries for batch-tag format

2012-12-25 Thread Austin Clements
This switches the new batch-tag format away from using a home-grown
hex-encoding scheme for message IDs in the dump to simply using Xapian
queries with Xapian quoting syntax.

This has a variety of advantages beyond presenting a cleaner and more
consistent interface.  Foremost is that it will dramatically simplify
the quoting for batch tagging, which shares the same input format.
While the hex-encoding is no better or worse for the simple ID queries
used by dump/restore, it becomes onerous for general-purpose queries
used in batch tagging.  It also better handles strange cases like
id:foo and bar, since this is no longer syntactically valid.
---
 notmuch-dump.c|9 +
 notmuch-restore.c |   22 ++
 tag-util.c|6 --
 test/dump-restore |   14 --
 4 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index 29d79da..bf01a39 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -20,6 +20,7 @@
 
 #include notmuch-client.h
 #include dump-restore-private.h
+#include string-util.h
 
 int
 notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
@@ -141,13 +142,13 @@ notmuch_dump_command (unused (void *ctx), int argc, char 
*argv[])
fprintf (stderr, Error: cannot dump message id containing line 
break: %s\n, message_id);
return 1;
}
-   if (hex_encode (notmuch, message_id,
-   buffer, buffer_size) != HEX_SUCCESS) {
-   fprintf (stderr, Error: failed to hex-encode msg-id %s\n,
+   if (make_boolean_term (notmuch, id, message_id,
+  buffer, buffer_size)) {
+   fprintf (stderr, Error: failed to quote message id %s\n,
 message_id);
return 1;
}
-   fprintf (output,  -- id:%s\n, buffer);
+   fprintf (output,  -- %s\n, buffer);
}
 
notmuch_message_destroy (message);
diff --git a/notmuch-restore.c b/notmuch-restore.c
index 9ed9b51..77a4c27 100644
--- a/notmuch-restore.c
+++ b/notmuch-restore.c
@@ -207,7 +207,7 @@ notmuch_restore_command (unused (void *ctx), int argc, char 
*argv[])
INTERNAL_ERROR (compile time constant regex failed.);
 
 do {
-   char *query_string;
+   char *query_string, *prefix, *term;
 
if (line_ctx != NULL)
talloc_free (line_ctx);
@@ -220,19 +220,17 @@ notmuch_restore_command (unused (void *ctx), int argc, 
char *argv[])
  query_string, tag_ops);
 
if (ret == 0) {
-   if (strncmp (id:, query_string, 3) != 0) {
-   fprintf (stderr, Warning: unsupported query: %s\n, 
query_string);
+   ret = parse_boolean_term (line_ctx, query_string,
+ prefix, term);
+   if (ret) {
+   fprintf (stderr, Warning: cannot parse query: %s\n,
+query_string);
+   continue;
+   } else if (strcmp (id, prefix) != 0) {
+   fprintf (stderr, Warning: not an id query: %s\n, 
query_string);
continue;
}
-   /* delete id: from front of string; tag_message
-* expects a raw message-id.
-*
-* XXX: Note that query string id:foo and bar will be
-* interpreted as a message id foo and bar. This
-* should eventually be fixed to give a better error
-* message.
-*/
-   query_string = query_string + 3;
+   query_string = term;
}
}
 
diff --git a/tag-util.c b/tag-util.c
index 705b7ba..e4e5dda 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -124,12 +124,6 @@ parse_tag_line (void *ctx, char *line,
 }
 
 /* tok now points to the query string */
-if (hex_decode_inplace (tok) != HEX_SUCCESS) {
-   ret = line_error (TAG_PARSE_INVALID, line_for_error,
- hex decoding of query %s failed, tok);
-   goto DONE;
-}
-
 *query_string = tok;
 
   DONE:
diff --git a/test/dump-restore b/test/dump-restore
index 6a989b6..f9ae5b3 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -195,23 +195,25 @@ a
 
 # the previous line was blank; also no yelling please
 +%zz -- id:whatever
-+e +f id:%yy
++e +f id:
++e +f tag:abc
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.ga7...@griffis1.net
-# highlight the sketchy id parsing; this should be last
-+g -- id:foo and bar
+# valid id, but warning about missing message
++e id:missing_message_id
 EOF
 
 cat EOF  EXPECTED
-Warning: unsupported query: a
+Warning: cannot parse query: a
 Warning: no query string [+0]
 Warning: no query string [+a +b]
 Warning: missing query string [+a +b ]