[PATCH 1/1] lib/message-file.c: use g_malloc () & g_free () in hash table values

2012-12-24 Thread David Bremner
Tomi Ollila  writes:

>
> The remaining frees and allocations referencing to message->headers hash
> values have been changed to use g_free and g_malloc functions.
>

pushed,

d


[PATCH] _notmuch_message_index_file: unref (free) address lists from gmime.

2012-12-24 Thread David Bremner
david at tethera.net writes:

> From: David Bremner 
>
> Apparently as of GMime 2.4, you don't need to call
> internet_address_list_destroy anymore, but you still need to call
> g_object_unref (from the GMime Changelog).
>

pushed, 

d


[PATCH] contrib: pick: slightly tweak running search and pick from pick buffer

2012-12-24 Thread David Bremner
Mark Walters  writes:

> Previously running search or pick from the pick buffer did not close
> the message pane (if open). This meant that then new search ends up in
> a very small window. Fix this so that the message pane is
> shut. However, make it so that the pane is shut after the search
> string is entered in case the user is basing the search on something
> in the current message.

pushed,

d


[PATCH 1/1] lib/message-file.c: use g_malloc () & g_free () in hash table values

2012-12-24 Thread Austin Clements
LGTM.

On Fri, 21 Dec 2012, Tomi Ollila  wrote:
> The message->headers hash table values get data returned by
> g_mime_utils_header_decode_text ().
>
> The pointer returned by g_mime_utils_header_decode_text is from the
> following line in rfc2047_decode_tokens
>
> return g_string_free (decoded, FALSE);
>
> The docs for g_string_free say
>
>  Frees the memory allocated for the GString. If free_segment is TRUE
>  it also frees the character data. If it's FALSE, the caller gains
>  ownership of the buffer and must free it after use with g_free().
>
> The remaining frees and allocations referencing to message->headers hash
> values have been changed to use g_free and g_malloc functions.
>
> This combines and completes the changes started by David Bremner.
> ---
>
> This was meant to be in reply to id:87mwxkptqn.fsf at zancas.localnet
> but I fumbled it. ;)
>
>  lib/message-file.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/lib/message-file.c b/lib/message-file.c
> index 915aba8..4d9af89 100644
> --- a/lib/message-file.c
> +++ b/lib/message-file.c
> @@ -111,7 +111,7 @@ _notmuch_message_file_open_ctx (void *ctx, const char 
> *filename)
>  message->headers = g_hash_table_new_full (strcase_hash,
> strcase_equal,
> free,
> -   free);
> +   g_free);
>  
>  message->parsing_started = 0;
>  message->parsing_finished = 0;
> @@ -337,11 +337,11 @@ notmuch_message_file_get_header (notmuch_message_file_t 
> *message,
>   /* we need to add the header to those we already collected */
>   newhdr = strlen(decoded_value);
>   hdrsofar = strlen(header_sofar);
> - combined_header = xmalloc(hdrsofar + newhdr + 2);
> + combined_header = g_malloc(hdrsofar + newhdr + 2);
>   strncpy(combined_header,header_sofar,hdrsofar);
>   *(combined_header+hdrsofar) = ' ';
>   strncpy(combined_header+hdrsofar+1,decoded_value,newhdr+1);
> - free (decoded_value);
> + g_free (decoded_value);
>   g_hash_table_insert (message->headers, header, combined_header);
>   }
>   } else {
> @@ -350,7 +350,7 @@ notmuch_message_file_get_header (notmuch_message_file_t 
> *message,
>   g_hash_table_insert (message->headers, header, decoded_value);
>   } else {
>   free (header);
> - free (decoded_value);
> + g_free (decoded_value);
>   decoded_value = header_sofar;
>   }
>   }
> -- 
> 1.8.0
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] _notmuch_message_index_file: unref (free) address lists from gmime.

2012-12-24 Thread Austin Clements
This patch LGTM.

On Mon, 10 Dec 2012, david at tethera.net wrote:
> From: David Bremner 
>
> Apparently as of GMime 2.4, you don't need to call
> internet_address_list_destroy anymore, but you still need to call
> g_object_unref (from the GMime Changelog).
>
> On the medium performance corpus, valgrind shows "possibly lost"
> leakage in "notmuch new" dropping from 7M to 300k.
> ---
>  lib/index.cc |   12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/lib/index.cc b/lib/index.cc
> index da0e6ce..a2edd6d 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -484,12 +484,18 @@ mboxes is deprecated and may be removed in the 
> future.\n", filename);
>  }
>  
>  from = g_mime_message_get_sender (mime_message);
> -addresses = internet_address_list_parse_string (from);
>  
> -_index_address_list (message, "from", addresses);
> +addresses = internet_address_list_parse_string (from);
> +if (addresses) {
> + _index_address_list (message, "from", addresses);
> + g_object_unref (addresses);
> +}
>  
>  addresses = g_mime_message_get_all_recipients (mime_message);
> -_index_address_list (message, "to", addresses);
> +if (addresses) {
> + _index_address_list (message, "to", addresses);
> + g_object_unref (addresses);
> +}
>  
>  subject = g_mime_message_get_subject (mime_message);
>  _notmuch_message_gen_terms (message, "subject", subject);
> -- 
> 1.7.10.4
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] emacs: show: make id links respect window

2012-12-24 Thread David Bremner
Mark Walters  writes:

> I think this is a bug but that could be debated. It is particularly
> easy to trigger with notmuch pick because that uses the split pane
> while focus usually remains in the `pick' pane rather than the `show'
> pane.

I can imagine that people would want/like the "open in other window" 
effect of the current code, even if the reason is a bug.

> 'action `(lambda (arg)
>(notmuch-show ,(third link)))

Can you (or someone) explain why backquote is needed here? Is this some
kind of workaround for lack of lexical scope? If so, maybe a comment.

d


[PATCH] contrib: pick: slightly tweak running search and pick from pick buffer

2012-12-24 Thread David Bremner
Mark Walters  writes:

> Previously running search or pick from the pick buffer did not close
> the message pane (if open). This meant that then new search ends up in
> a very small window. Fix this so that the message pane is
> shut. However, make it so that the pane is shut after the search
> string is entered in case the user is basing the search on something
> in the current message.
> ---

Seems fine to me, queuing to push.

d


[Patch v2 4/4] perf-test: add memory leak test for dump restore

2012-12-24 Thread da...@tethera.net
From: David Bremner 

In id:87vcc2q5n2.fsf at nikula.org, Jani points out a memory leak in the
current version of the sup restore code. Among other things, this test
is intended to verify a fix for that leak.
---
 performance-test/M01-dump-restore |   15 +++
 1 file changed, 15 insertions(+)
 create mode 100755 performance-test/M01-dump-restore

diff --git a/performance-test/M01-dump-restore 
b/performance-test/M01-dump-restore
new file mode 100755
index 000..be5894a
--- /dev/null
+++ b/performance-test/M01-dump-restore
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+test_description='dump and restore'
+
+. ./perf-test-lib.sh
+
+memory_start
+
+memory_run 'load nmbug tags' 'notmuch restore --accumulate 
--input=corpus.tags/nmbug.sup-dump'
+memory_run 'dump *' 'notmuch dump --output=tags.sup'
+memory_run 'restore *' 'notmuch restore --input=tags.sup'
+memory_run 'dump --format=batch-tag *' 'notmuch dump --format=batch-tag 
--output=tags.bt'
+memory_run 'restore --format=batch-tag *' 'notmuch restore --format=batch-tag 
--input=tags.bt'
+
+memory_done
-- 
1.7.10.4



[Patch v2 3/4] perf-test: initial version of memory test infrastructure.

2012-12-24 Thread da...@tethera.net
From: David Bremner 

The idea is run some code under valgrind --leak-check=full and report
a summary, leaving the user to peruse the log file if they want.

We go to some lengths to preserve the log files from accidental
overwriting; the full corpus takes about 3 hours to run under valgrind
on my machine.

The naming of the log directories may be slightly controversial; in
the unlikely event of two runs in less than a second, the log will be
overwritten. A previous version with mktemp+timestamp was dismissed as
overkill; just mktemp alone does not sort nicely.

One new test is included, to check notmuch new for memory leaks.
---
 performance-test/.gitignore   |1 +
 performance-test/M00-new  |   15 
 performance-test/Makefile.local   |   16 ++--
 performance-test/README   |   57 +++-
 performance-test/perf-test-lib.sh |   75 -
 5 files changed, 126 insertions(+), 38 deletions(-)
 create mode 100755 performance-test/M00-new

diff --git a/performance-test/.gitignore b/performance-test/.gitignore
index 6421a9a..f3f9be4 100644
--- a/performance-test/.gitignore
+++ b/performance-test/.gitignore
@@ -1,3 +1,4 @@
 tmp.*/
+log.*/
 corpus/
 notmuch.cache.*/
diff --git a/performance-test/M00-new b/performance-test/M00-new
new file mode 100755
index 000..99c3f52
--- /dev/null
+++ b/performance-test/M00-new
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+test_description='notmuch new'
+
+. ./perf-test-lib.sh
+
+# ensure initial 'notmuch new' is run by memory_start
+uncache_database
+
+memory_start
+
+# run 'notmuch new' a second time, to test different code paths
+memory_run "notmuch new" "notmuch new"
+
+memory_done
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 57beb44..73aa963 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -4,14 +4,24 @@ dir := performance-test

 include $(dir)/version.sh

+TIME_TEST_SCRIPT := ${dir}/notmuch-time-test
+MEMORY_TEST_SCRIPT := ${dir}/notmuch-memory-test
+
 CORPUS_NAME := notmuch-email-corpus-$(PERFTEST_VERSION).tar.xz
 TXZFILE := ${dir}/download/${CORPUS_NAME}
 SIGFILE := ${TXZFILE}.asc
-TEST_SCRIPT := ${dir}/notmuch-perf-test
 DEFAULT_URL :=  http://notmuchmail.org/releases/${CORPUS_NAME}

+perf-test: time-test memory-test
+
 time-test: setup-perf-test all
-   $(TEST_SCRIPT) $(OPTIONS)
+   @echo
+   $(TIME_TEST_SCRIPT) $(OPTIONS)
+
+memory-test: setup-perf-test all
+   @echo
+   $(MEMORY_TEST_SCRIPT) $(OPTIONS)
+

 .PHONY: download-corpus setup-perf-test

@@ -29,4 +39,4 @@ $(TXZFILE):
 download-corpus:
wget -O ${TXZFILE} ${DEFAULT_URL}

-CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus $(dir)/notmuch.cache.*
+CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/log.* $(dir)/corpus 
$(dir)/notmuch.cache.*
diff --git a/performance-test/README b/performance-test/README
index d1fb6de..996724c 100644
--- a/performance-test/README
+++ b/performance-test/README
@@ -1,3 +1,10 @@
+Performance Tests
+-
+
+This directory contains two kinds of performance tests: time tests,
+and memory tests. The former use gnu time, and the latter use
+valgrind.
+
 Pre-requisites
 --

@@ -5,9 +12,10 @@ In addition to having notmuch, you need:

 - gpg
 - gnu tar
-- gnu time
+- gnu time (for the time tests)
 - xz. Some speedup can be gotten by installing "pixz", but this is
   probably only worthwhile if you are debugging the tests.
+- valgrind (for the memory tests)

 Getting set up to run tests:
 
@@ -36,34 +44,47 @@ for a list of mirrors.
 Running tests
 -

-The easiest way to run performance tests is to say "make time-test", (or
-simply run the notmuch-time-test script). Either command will run all
-available performance tests.
-
-Alternately, you can run a specific subset of tests by simply invoking
-one of the executable scripts in this directory, (such as ./basic).
-Each test script supports the following arguments
+The easiest way to run performance tests is to say "make perf-test".
+This will run all time and memory tests.  Be aware that the memory
+tests are quite time consuming when run on the full corpus, and that
+depending on your interests it may be more sensible to run "make
+time-test" or "make memory-test".  You can also invoke one of the
+scripts notmuch-time-test or notmuch-memory-test or run a more
+specific subset of tests by simply invoking one of the executable
+scripts in this directory, (such as ./T00-new).  Each test script
+supports the following arguments

 --small / --medium / --large   Choose corpus size.
 --debugEnable debugging. In particular don't 
delete
temporary directories.

+When using the make targets, you can pass arguments to all test
+scripts by defining the make variable OPTIONS.
+
 Writing tests
 -

-Have a look at "T01-dump-restore" for an 

[Patch v2 2/4] perf-test: rename current tests as "time tests"

2012-12-24 Thread da...@tethera.net
From: David Bremner 

This is almost entirely renaming files, except for updating a few
references to those file names, and changing the makefile target.

A new set of memory tests will be run separately because they take
much longer.
---
 performance-test/00-new|   15 ---
 performance-test/01-dump-restore   |   13 -
 performance-test/02-tag|   14 --
 performance-test/Makefile.local|2 +-
 performance-test/README|9 +
 performance-test/T00-new   |   15 +++
 performance-test/T01-dump-restore  |   13 +
 performance-test/T02-tag   |   14 ++
 performance-test/notmuch-perf-test |   27 ---
 performance-test/notmuch-time-test |   27 +++
 10 files changed, 75 insertions(+), 74 deletions(-)
 delete mode 100755 performance-test/00-new
 delete mode 100755 performance-test/01-dump-restore
 delete mode 100755 performance-test/02-tag
 create mode 100755 performance-test/T00-new
 create mode 100755 performance-test/T01-dump-restore
 create mode 100755 performance-test/T02-tag
 delete mode 100755 performance-test/notmuch-perf-test
 create mode 100755 performance-test/notmuch-time-test

diff --git a/performance-test/00-new b/performance-test/00-new
deleted file mode 100755
index 553bb8b..000
--- a/performance-test/00-new
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/bash
-
-test_description='notmuch new'
-
-. ./perf-test-lib.sh
-
-uncache_database
-
-time_start
-
-for i in $(seq 2 6); do
-time_run "notmuch new #$i" 'notmuch new'
-done
-
-time_done
diff --git a/performance-test/01-dump-restore b/performance-test/01-dump-restore
deleted file mode 100755
index b2ff940..000
--- a/performance-test/01-dump-restore
+++ /dev/null
@@ -1,13 +0,0 @@
-#!/bin/bash
-
-test_description='dump and restore'
-
-. ./perf-test-lib.sh
-
-time_start
-
-time_run 'load nmbug tags' 'notmuch restore --accumulate < 
corpus.tags/nmbug.sup-dump'
-time_run 'dump *' 'notmuch dump > tags.out'
-time_run 'restore *' 'notmuch restore < tags.out'
-
-time_done
diff --git a/performance-test/02-tag b/performance-test/02-tag
deleted file mode 100755
index 78ceccc..000
--- a/performance-test/02-tag
+++ /dev/null
@@ -1,14 +0,0 @@
-#!/bin/bash
-
-test_description='tagging'
-
-. ./perf-test-lib.sh
-
-time_start
-
-time_run 'tag * +new_tag' "notmuch tag +new_tag '*'"
-time_run 'tag * +existing_tag' "notmuch tag +new_tag '*'"
-time_run 'tag * -existing_tag' "notmuch tag -new_tag '*'"
-time_run 'tag * -missing_tag' "notmuch tag -new_tag '*'"
-
-time_done
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 3834e4d..57beb44 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -10,7 +10,7 @@ SIGFILE := ${TXZFILE}.asc
 TEST_SCRIPT := ${dir}/notmuch-perf-test
 DEFAULT_URL :=  http://notmuchmail.org/releases/${CORPUS_NAME}

-perf-test: setup-perf-test all
+time-test: setup-perf-test all
$(TEST_SCRIPT) $(OPTIONS)

 .PHONY: download-corpus setup-perf-test
diff --git a/performance-test/README b/performance-test/README
index 1481660..d1fb6de 100644
--- a/performance-test/README
+++ b/performance-test/README
@@ -36,8 +36,8 @@ for a list of mirrors.
 Running tests
 -

-The easiest way to run performance tests is to say "make perf-test", (or
-simply run the notmuch-perf-test script). Either command will run all
+The easiest way to run performance tests is to say "make time-test", (or
+simply run the notmuch-time-test script). Either command will run all
 available performance tests.

 Alternately, you can run a specific subset of tests by simply invoking
@@ -51,7 +51,7 @@ Each test script supports the following arguments
 Writing tests
 -

-Have a look at "01-dump-restore" for an example. Sourcing
+Have a look at "T01-dump-restore" for an example. Sourcing
 "perf-test-lib.sh" is mandatory.  Utility functions include

 - 'add_email_corpus' unpacks a set of messages and adds them to the database.
@@ -65,4 +65,5 @@ Have a look at "01-dump-restore" for an example. Sourcing

 Scripts are run in the order specified in notmuch-perf-test. In the
 future this order might be chosen automatically so please follow the
-convention of starting the name with two digits to specify the order.
+convention of starting the name with 'T' followed by two digits to
+specify the order.
diff --git a/performance-test/T00-new b/performance-test/T00-new
new file mode 100755
index 000..553bb8b
--- /dev/null
+++ b/performance-test/T00-new
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+test_description='notmuch new'
+
+. ./perf-test-lib.sh
+
+uncache_database
+
+time_start
+
+for i in $(seq 2 6); do
+time_run "notmuch new #$i" 'notmuch new'
+done
+
+time_done
diff --git a/performance-test/T01-dump-restore 
b/performance-test/T01-dump-restore
new file mode 100755
index 000..b2ff940
--- /dev/null
+++ 

[Patch v2 1/4] perf-test: remove redundant "initial notmuch new"

2012-12-24 Thread da...@tethera.net
From: David Bremner 

The initial notmuch-new and caching are now done automatically by
time_start
---
 performance-test/00-new |4 
 1 file changed, 4 deletions(-)

diff --git a/performance-test/00-new b/performance-test/00-new
index 6f0b50c..553bb8b 100755
--- a/performance-test/00-new
+++ b/performance-test/00-new
@@ -8,10 +8,6 @@ uncache_database

 time_start

-time_run 'initial notmuch new' 'notmuch new'
-
-cache_database
-
 for i in $(seq 2 6); do
 time_run "notmuch new #$i" 'notmuch new'
 done
-- 
1.7.10.4



v2 of valgrind based memory tests

2012-12-24 Thread da...@tethera.net
These obsolete

  id:1355196820-29734-1-git-send-email-david at tethera.net

I tried to follow the suggestions of 

  id:20121216191121.GH6187 at mit.edu

pretty closely.

diff --git a/performance-test/M00-new b/performance-test/M00-new
index 733e9b0..99c3f52 100755
--- a/performance-test/M00-new
+++ b/performance-test/M00-new
@@ -9,6 +9,7 @@ uncache_database

 memory_start

+# run 'notmuch new' a second time, to test different code paths
 memory_run "notmuch new" "notmuch new"

 memory_done
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 357d800..73aa963 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -4,7 +4,6 @@ dir := performance-test

 include $(dir)/version.sh

-# these two are just make sure dir is expanded at the right time.
 TIME_TEST_SCRIPT := ${dir}/notmuch-time-test
 MEMORY_TEST_SCRIPT := ${dir}/notmuch-memory-test

@@ -17,11 +16,11 @@ perf-test: time-test memory-test

 time-test: setup-perf-test all
@echo
-   $(TIME_TEST_SCRIPT) $(TEST_OPTIONS)
+   $(TIME_TEST_SCRIPT) $(OPTIONS)

 memory-test: setup-perf-test all
@echo
-   $(MEMORY_TEST_SCRIPT) $(TEST_OPTIONS)
+   $(MEMORY_TEST_SCRIPT) $(OPTIONS)


 .PHONY: download-corpus setup-perf-test
diff --git a/performance-test/README b/performance-test/README
index 7eaf5f7..996724c 100644
--- a/performance-test/README
+++ b/performance-test/README
@@ -1,7 +1,7 @@
 Performance Tests
 -

-This directory contains two kinds of performance tests, time tests,
+This directory contains two kinds of performance tests: time tests,
 and memory tests. The former use gnu time, and the latter use
 valgrind.

@@ -12,7 +12,7 @@ In addition to having notmuch, you need:

 - gpg
 - gnu tar
-- gnu time (for the time tests).
+- gnu time (for the time tests)
 - xz. Some speedup can be gotten by installing "pixz", but this is
   probably only worthwhile if you are debugging the tests.
 - valgrind (for the memory tests)
@@ -59,13 +59,13 @@ supports the following arguments
temporary directories.

 When using the make targets, you can pass arguments to all test
-scripts by defining the make variable TEST_OPTIONS.
+scripts by defining the make variable OPTIONS.

 Writing tests
 -

-Have a look at "T01-dump-restore" for an example time test and and
-"M00-new" for an example memory tests. In both cases sourcing
+Have a look at "T01-dump-restore" for an example time test and
+"M00-new" for an example memory test. In both cases sourcing
 "perf-test-lib.sh" is mandatory.

 Basics:
diff --git a/performance-test/perf-test-lib.sh 
b/performance-test/perf-test-lib.sh
index 79eb2c5..10d05e0 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -89,11 +89,10 @@ add_email_corpus ()

 cp -lr $TAG_CORPUS $TMP_DIRECTORY/corpus.tags
 cp -lr $MAIL_CORPUS $MAIL_DIR
-
 }

-notmuch_new_with_cache () {
-
+notmuch_new_with_cache ()
+{
 if [ -d $DB_CACHE_DIR ]; then
cp -r $DB_CACHE_DIR ${MAIL_DIR}/.notmuch
 else
@@ -102,8 +101,8 @@ notmuch_new_with_cache () {
 fi
 }

-time_start () {
-
+time_start ()
+{
 add_email_corpus

 print_header
@@ -111,17 +110,19 @@ time_start () {
 notmuch_new_with_cache time_run
 }

-memory_start () {
-
+memory_start ()
+{
 add_email_corpus

-_timestamp=$(printf "%x" $(date +"%s"))
-log_dir=$(mktemp -d "${TEST_DIRECTORY}/log.$(basename 
$0)-$corpus_size-${_timestamp}-XX")
+local timestamp=$(date +%Y%m%dT%H%M%S)
+log_dir="${TEST_DIRECTORY}/log.$(basename $0)-$corpus_size-${timestamp}"
+mkdir -p ${log_dir}

 notmuch_new_with_cache memory_run
 }

-memory_run () {
+memory_run ()
+{
 test_count=$(($test_count+1))

 log_file=$log_dir/$test_count.log
@@ -134,11 +135,13 @@ memory_run () {
 echo
 }

-memory_done () {
+memory_done ()
+{
 time_done
 }

-cache_database () {
+cache_database ()
+{
 if [ -d $MAIL_DIR/.notmuch ]; then
cp -r $MAIL_DIR/.notmuch $DB_CACHE_DIR
 else
@@ -146,16 +149,18 @@ cache_database () {
 fi
 }

-uncache_database () {
+uncache_database ()
+{
 rm -rf $DB_CACHE_DIR
 }

-print_header () {
+print_header ()
+{
 printf "\t\t\tWall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn/Out(512B)\n"
-
 }

-time_run () {
+time_run ()
+{
 printf "  %-22s" "$1"
 test_count=$(($test_count+1))
 if test "$verbose" != "t"; then exec 4>test.output 3>&4; fi
@@ -166,7 +171,8 @@ time_run () {
 return 0
 }

-time_done () {
+time_done ()
+{
 if [ "$test_failure" = "0" ]; then
rm -rf "$remove_tmp"
exit 0


[PATCH] devel: add script to run tests on a patch series.

2012-12-24 Thread da...@tethera.net
From: David Bremner 

This script is a thin wrapper around git rebase --interactive, that
allows the user to fine tune patches if they break the test suite, or
violate the coding style guidelines.

The user can always run "git rebase --continue" to ignore false positives.

I decided to use perl because inplace editing with sed is non-portable.
---

I have caught several formatting violations and at least one real bug 
using this as "last step before sending to the list".

No doubt the more talented shell-scripters / perl-hackers in the crowd
can can suggest some improvements.

 devel/check-commit |   10 ++
 devel/check-series |   33 +
 2 files changed, 43 insertions(+)
 create mode 100755 devel/check-commit
 create mode 100755 devel/check-series

diff --git a/devel/check-commit b/devel/check-commit
new file mode 100755
index 000..86648b8
--- /dev/null
+++ b/devel/check-commit
@@ -0,0 +1,10 @@
+# This script is mainly intended to be used by the check-series script
+# but you are welcome to use it as a standalone tool. It takes no
+# parameters and operates on the latest git commit (HEAD).
+
+set -e
+make test
+for file in $(git diff --name-only HEAD^ | grep '[.]\(c\|h\|cc\|hh\)'); do
+uncrustify --replace -c devel/uncrustify.cfg $file
+done
+git diff --quiet
diff --git a/devel/check-series b/devel/check-series
new file mode 100755
index 000..d48e70f
--- /dev/null
+++ b/devel/check-series
@@ -0,0 +1,33 @@
+#/bin/sh
+
+# Usage: check-series [upstream-commit]
+
+# Checks each commit in a patch series (topic branch) by running the
+# script devel/check-commit. If check-commit fails (exits with
+# non-zero status), the user is left in the middle of a git rebase, and
+# can fix the commit, e.g. using git commit --amend, followed by
+# "git rebase --continue". If all else fails, "git rebase --abort" should
+# get you back to where you started.
+#
+# NOTE: this runs "make test" many times, so it can take a while.
+#
+
+trap cleanup EXIT
+
+cleanup () {
+if [ -n "$tmpdir" ]; then
+   rm -rf $tmpdir
+fi
+}
+
+upstream=master
+if [ -n "$1" ]; then
+upstream="$1"
+fi
+
+# make sure we always run the most recent version of check-commit
+# in particular cope with it going away.
+tmpdir=$(mktemp -d)
+cp devel/check-commit $tmpdir
+
+GIT_SEQUENCE_EDITOR="perl -pi -e 's,^\s*([^#\s].*)$,\1\nexec 
$tmpdir/check-commit,'" git rebase -i $upstream
-- 
1.7.10.4



v9 of batch tagging

2012-12-24 Thread Mark Walters

On Mon, 24 Dec 2012, david at tethera.net wrote:
> This obsoletes 
>
>  id:1356095307-22895-1-git-send-email-david at tethera.net
>
> The main changes since v8 are the rebasing against the notmuch-restore
> fixes in master, and the rewrite of the query (pre)-processing
> unhex_and_quote. This incorporates the changes of
>
>   id:1356231570-28232-1-git-send-email-david at tethera.net
>
> and  now handles '()'  (cf. id:87a9t5p4dz.fsf at qmul.ac.uk)
>
> With respect to 
>
> ,
> | Finally, I don't know if a query can contain a : without being a
> | prefix query. If it can that could end up being misquoted.
> `
>
> This is pretty easy to work around by encoding that :. I think unless
> it is a problem in practice I prefer not to keep an explicity list of
> prefixes here; recognizing prefixes should really be a service from
> libnotmuch.

I am quite happy with this.

> I dropped two patches (strnspn and hex_invariant), but picked up a new
> strtok variation. Probably the name strtok_len2 could be improved
> (and I see there is a typo in the patch subject).
>
>  [Patch v9 05/17] util/string-util: add a new string tokenized
>

Patches 5 and 6 look good to me.

> Finally I added a test for the new parenthesis handling.

My recollection is that dump prints the messages unsorted: does this
mean that we could get unstable results for these tests (eg with
different Xapian versions)? 

Best wishes

Mark

>
> [Patch v9 17/17] test/tagging: add test for handling of parens
>


> Fixup wise, the tests needed to be adjusted a bit for () being delimiters, 
> and the man page as well.
>
> I added the fclose in id:87wqw9hf9a.fsf at oiva.home.nikula.org
>
> And I modified the return value per id:87zk15hi7f.fsf at oiva.home.nikula.org
>
> Here is the interdiff for unhex_and_quote:
>
> commit 67c6aee87db5c7da25529e1c0feb64e422abb4b7
> Author: David Bremner 
> Date:   Sat Dec 22 22:49:02 2012 -0400
>
> simplify unhex_and_quote, support parens
> 
> the overgeneral definition of a prefix can be replaced by lower case
> alphabetic, and still work fine with current notmuch query syntax.
> 
> use () as delimiters in unhex_and_quote, preserve delimiters
>
> diff --git a/tag-util.c b/tag-util.c
> index 6f62fe6..91f3603 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -56,6 +56,21 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
>  return NULL;
>  }
>  
> +/* Factor out the boilerplate to append a token to the query string.
> + * For use in unhex_and_quote */
> +
> +static tag_parse_status_t
> +append_tok (const char *tok, size_t tok_len,
> + const char *line_for_error, char **query_string)
> +{
> +
> +*query_string = talloc_strndup_append_buffer (*query_string, tok, 
> tok_len);
> +if (*query_string == NULL)
> + return line_error (TAG_PARSE_OUT_OF_MEMORY, line_for_error, "aborting");
> +
> +return TAG_PARSE_SUCCESS;
> +}
> +
>  /* Input is a hex encoded string, presumed to be a query for Xapian.
>   *
>   * Space delimited tokens are decoded and quoted, with '*' and prefixes
> @@ -67,45 +82,41 @@ unhex_and_quote (void *ctx, char *encoded, const char 
> *line_for_error,
>  {
>  char *tok = encoded;
>  size_t tok_len = 0;
> +size_t delim_len = 0;
>  char *buf = NULL;
>  size_t buf_len = 0;
>  tag_parse_status_t ret = TAG_PARSE_SUCCESS;
>  
>  *query_string = talloc_strdup (ctx, "");
>  
> -while ((tok = strtok_len (tok + tok_len, " ", _len)) != NULL) {
> +while ((tok = strtok_len2 (tok + tok_len + delim_len, " ()",
> +_len, _len)) != NULL) {
>  
>   size_t prefix_len;
>   char delim = *(tok + tok_len);
>  
> - *(tok + tok_len++) = '\0';
> + *(tok + tok_len) = '\0';
>  
> - prefix_len = hex_invariant (tok, tok_len);
> + /* The following matches a superset of prefixes currently
> +  * used by notmuch */
> + prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");
>  
> - if ((strcmp (tok, "*") == 0) || prefix_len >= tok_len - 1) {
> + if ((strcmp (tok, "*") == 0) || prefix_len == tok_len) {
>  
>   /* pass some things through without quoting or decoding.
>* Note for '*' this is mandatory.
>*/
>  
> - if (! (*query_string = talloc_asprintf_append_buffer (
> -*query_string, "%s%c", tok, delim))) {
> -
> - ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> -   line_for_error, "aborting");
> - goto DONE;
> - }
> + ret = append_tok (tok, tok_len, line_for_error, query_string);
> + if (ret) goto DONE;
>  
>   } else {
>   /* potential prefix: one for ':', then something after */
> - if ((tok_len - prefix_len > 2) && *(tok + prefix_len) == ':') {
> - if (! (*query_string = talloc_strndup_append (*query_string,
> -   tok,
> -  

[PATCH] devel: add script to run tests on a patch series.

2012-12-24 Thread david
From: David Bremner brem...@debian.org

This script is a thin wrapper around git rebase --interactive, that
allows the user to fine tune patches if they break the test suite, or
violate the coding style guidelines.

The user can always run git rebase --continue to ignore false positives.

I decided to use perl because inplace editing with sed is non-portable.
---

I have caught several formatting violations and at least one real bug 
using this as last step before sending to the list.

No doubt the more talented shell-scripters / perl-hackers in the crowd
can can suggest some improvements.

 devel/check-commit |   10 ++
 devel/check-series |   33 +
 2 files changed, 43 insertions(+)
 create mode 100755 devel/check-commit
 create mode 100755 devel/check-series

diff --git a/devel/check-commit b/devel/check-commit
new file mode 100755
index 000..86648b8
--- /dev/null
+++ b/devel/check-commit
@@ -0,0 +1,10 @@
+# This script is mainly intended to be used by the check-series script
+# but you are welcome to use it as a standalone tool. It takes no
+# parameters and operates on the latest git commit (HEAD).
+
+set -e
+make test
+for file in $(git diff --name-only HEAD^ | grep '[.]\(c\|h\|cc\|hh\)'); do
+uncrustify --replace -c devel/uncrustify.cfg $file
+done
+git diff --quiet
diff --git a/devel/check-series b/devel/check-series
new file mode 100755
index 000..d48e70f
--- /dev/null
+++ b/devel/check-series
@@ -0,0 +1,33 @@
+#/bin/sh
+
+# Usage: check-series [upstream-commit]
+
+# Checks each commit in a patch series (topic branch) by running the
+# script devel/check-commit. If check-commit fails (exits with
+# non-zero status), the user is left in the middle of a git rebase, and
+# can fix the commit, e.g. using git commit --amend, followed by
+# git rebase --continue. If all else fails, git rebase --abort should
+# get you back to where you started.
+#
+# NOTE: this runs make test many times, so it can take a while.
+#
+
+trap cleanup EXIT
+
+cleanup () {
+if [ -n $tmpdir ]; then
+   rm -rf $tmpdir
+fi
+}
+
+upstream=master
+if [ -n $1 ]; then
+upstream=$1
+fi
+
+# make sure we always run the most recent version of check-commit
+# in particular cope with it going away.
+tmpdir=$(mktemp -d)
+cp devel/check-commit $tmpdir
+
+GIT_SEQUENCE_EDITOR=perl -pi -e 's,^\s*([^#\s].*)$,\1\nexec 
$tmpdir/check-commit,' git rebase -i $upstream
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[Patch v2 1/4] perf-test: remove redundant initial notmuch new

2012-12-24 Thread david
From: David Bremner brem...@debian.org

The initial notmuch-new and caching are now done automatically by
time_start
---
 performance-test/00-new |4 
 1 file changed, 4 deletions(-)

diff --git a/performance-test/00-new b/performance-test/00-new
index 6f0b50c..553bb8b 100755
--- a/performance-test/00-new
+++ b/performance-test/00-new
@@ -8,10 +8,6 @@ uncache_database
 
 time_start
 
-time_run 'initial notmuch new' 'notmuch new'
-
-cache_database
-
 for i in $(seq 2 6); do
 time_run notmuch new #$i 'notmuch new'
 done
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


v2 of valgrind based memory tests

2012-12-24 Thread david
These obsolete
  
  id:1355196820-29734-1-git-send-email-da...@tethera.net

I tried to follow the suggestions of 

  id:20121216191121.gh6...@mit.edu

pretty closely.

diff --git a/performance-test/M00-new b/performance-test/M00-new
index 733e9b0..99c3f52 100755
--- a/performance-test/M00-new
+++ b/performance-test/M00-new
@@ -9,6 +9,7 @@ uncache_database
 
 memory_start
 
+# run 'notmuch new' a second time, to test different code paths
 memory_run notmuch new notmuch new
 
 memory_done
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 357d800..73aa963 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -4,7 +4,6 @@ dir := performance-test
 
 include $(dir)/version.sh
 
-# these two are just make sure dir is expanded at the right time.
 TIME_TEST_SCRIPT := ${dir}/notmuch-time-test
 MEMORY_TEST_SCRIPT := ${dir}/notmuch-memory-test
 
@@ -17,11 +16,11 @@ perf-test: time-test memory-test
 
 time-test: setup-perf-test all
@echo
-   $(TIME_TEST_SCRIPT) $(TEST_OPTIONS)
+   $(TIME_TEST_SCRIPT) $(OPTIONS)
 
 memory-test: setup-perf-test all
@echo
-   $(MEMORY_TEST_SCRIPT) $(TEST_OPTIONS)
+   $(MEMORY_TEST_SCRIPT) $(OPTIONS)
 
 
 .PHONY: download-corpus setup-perf-test
diff --git a/performance-test/README b/performance-test/README
index 7eaf5f7..996724c 100644
--- a/performance-test/README
+++ b/performance-test/README
@@ -1,7 +1,7 @@
 Performance Tests
 -
 
-This directory contains two kinds of performance tests, time tests,
+This directory contains two kinds of performance tests: time tests,
 and memory tests. The former use gnu time, and the latter use
 valgrind.
 
@@ -12,7 +12,7 @@ In addition to having notmuch, you need:
 
 - gpg
 - gnu tar
-- gnu time (for the time tests).
+- gnu time (for the time tests)
 - xz. Some speedup can be gotten by installing pixz, but this is
   probably only worthwhile if you are debugging the tests.
 - valgrind (for the memory tests)
@@ -59,13 +59,13 @@ supports the following arguments
temporary directories.
 
 When using the make targets, you can pass arguments to all test
-scripts by defining the make variable TEST_OPTIONS.
+scripts by defining the make variable OPTIONS.
 
 Writing tests
 -
 
-Have a look at T01-dump-restore for an example time test and and
-M00-new for an example memory tests. In both cases sourcing
+Have a look at T01-dump-restore for an example time test and
+M00-new for an example memory test. In both cases sourcing
 perf-test-lib.sh is mandatory.
 
 Basics:
diff --git a/performance-test/perf-test-lib.sh 
b/performance-test/perf-test-lib.sh
index 79eb2c5..10d05e0 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -89,11 +89,10 @@ add_email_corpus ()
 
 cp -lr $TAG_CORPUS $TMP_DIRECTORY/corpus.tags
 cp -lr $MAIL_CORPUS $MAIL_DIR
-
 }
 
-notmuch_new_with_cache () {
-
+notmuch_new_with_cache ()
+{
 if [ -d $DB_CACHE_DIR ]; then
cp -r $DB_CACHE_DIR ${MAIL_DIR}/.notmuch
 else
@@ -102,8 +101,8 @@ notmuch_new_with_cache () {
 fi
 }
 
-time_start () {
-
+time_start ()
+{
 add_email_corpus
 
 print_header
@@ -111,17 +110,19 @@ time_start () {
 notmuch_new_with_cache time_run
 }
 
-memory_start () {
-
+memory_start ()
+{
 add_email_corpus
 
-_timestamp=$(printf %x $(date +%s))
-log_dir=$(mktemp -d ${TEST_DIRECTORY}/log.$(basename 
$0)-$corpus_size-${_timestamp}-XX)
+local timestamp=$(date +%Y%m%dT%H%M%S)
+log_dir=${TEST_DIRECTORY}/log.$(basename $0)-$corpus_size-${timestamp}
+mkdir -p ${log_dir}
 
 notmuch_new_with_cache memory_run
 }
 
-memory_run () {
+memory_run ()
+{
 test_count=$(($test_count+1))
 
 log_file=$log_dir/$test_count.log
@@ -134,11 +135,13 @@ memory_run () {
 echo
 }
 
-memory_done () {
+memory_done ()
+{
 time_done
 }
 
-cache_database () {
+cache_database ()
+{
 if [ -d $MAIL_DIR/.notmuch ]; then
cp -r $MAIL_DIR/.notmuch $DB_CACHE_DIR
 else
@@ -146,16 +149,18 @@ cache_database () {
 fi
 }
 
-uncache_database () {
+uncache_database ()
+{
 rm -rf $DB_CACHE_DIR
 }
 
-print_header () {
+print_header ()
+{
 printf \t\t\tWall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn/Out(512B)\n
-
 }
 
-time_run () {
+time_run ()
+{
 printf   %-22s $1
 test_count=$(($test_count+1))
 if test $verbose != t; then exec 4test.output 34; fi
@@ -166,7 +171,8 @@ time_run () {
 return 0
 }
 
-time_done () {
+time_done ()
+{
 if [ $test_failure = 0 ]; then
rm -rf $remove_tmp
exit 0
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[Patch v2 2/4] perf-test: rename current tests as time tests

2012-12-24 Thread david
From: David Bremner brem...@debian.org

This is almost entirely renaming files, except for updating a few
references to those file names, and changing the makefile target.

A new set of memory tests will be run separately because they take
much longer.
---
 performance-test/00-new|   15 ---
 performance-test/01-dump-restore   |   13 -
 performance-test/02-tag|   14 --
 performance-test/Makefile.local|2 +-
 performance-test/README|9 +
 performance-test/T00-new   |   15 +++
 performance-test/T01-dump-restore  |   13 +
 performance-test/T02-tag   |   14 ++
 performance-test/notmuch-perf-test |   27 ---
 performance-test/notmuch-time-test |   27 +++
 10 files changed, 75 insertions(+), 74 deletions(-)
 delete mode 100755 performance-test/00-new
 delete mode 100755 performance-test/01-dump-restore
 delete mode 100755 performance-test/02-tag
 create mode 100755 performance-test/T00-new
 create mode 100755 performance-test/T01-dump-restore
 create mode 100755 performance-test/T02-tag
 delete mode 100755 performance-test/notmuch-perf-test
 create mode 100755 performance-test/notmuch-time-test

diff --git a/performance-test/00-new b/performance-test/00-new
deleted file mode 100755
index 553bb8b..000
--- a/performance-test/00-new
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/bash
-
-test_description='notmuch new'
-
-. ./perf-test-lib.sh
-
-uncache_database
-
-time_start
-
-for i in $(seq 2 6); do
-time_run notmuch new #$i 'notmuch new'
-done
-
-time_done
diff --git a/performance-test/01-dump-restore b/performance-test/01-dump-restore
deleted file mode 100755
index b2ff940..000
--- a/performance-test/01-dump-restore
+++ /dev/null
@@ -1,13 +0,0 @@
-#!/bin/bash
-
-test_description='dump and restore'
-
-. ./perf-test-lib.sh
-
-time_start
-
-time_run 'load nmbug tags' 'notmuch restore --accumulate  
corpus.tags/nmbug.sup-dump'
-time_run 'dump *' 'notmuch dump  tags.out'
-time_run 'restore *' 'notmuch restore  tags.out'
-
-time_done
diff --git a/performance-test/02-tag b/performance-test/02-tag
deleted file mode 100755
index 78ceccc..000
--- a/performance-test/02-tag
+++ /dev/null
@@ -1,14 +0,0 @@
-#!/bin/bash
-
-test_description='tagging'
-
-. ./perf-test-lib.sh
-
-time_start
-
-time_run 'tag * +new_tag' notmuch tag +new_tag '*'
-time_run 'tag * +existing_tag' notmuch tag +new_tag '*'
-time_run 'tag * -existing_tag' notmuch tag -new_tag '*'
-time_run 'tag * -missing_tag' notmuch tag -new_tag '*'
-
-time_done
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 3834e4d..57beb44 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -10,7 +10,7 @@ SIGFILE := ${TXZFILE}.asc
 TEST_SCRIPT := ${dir}/notmuch-perf-test
 DEFAULT_URL :=  http://notmuchmail.org/releases/${CORPUS_NAME}
 
-perf-test: setup-perf-test all
+time-test: setup-perf-test all
$(TEST_SCRIPT) $(OPTIONS)
 
 .PHONY: download-corpus setup-perf-test
diff --git a/performance-test/README b/performance-test/README
index 1481660..d1fb6de 100644
--- a/performance-test/README
+++ b/performance-test/README
@@ -36,8 +36,8 @@ for a list of mirrors.
 Running tests
 -
 
-The easiest way to run performance tests is to say make perf-test, (or
-simply run the notmuch-perf-test script). Either command will run all
+The easiest way to run performance tests is to say make time-test, (or
+simply run the notmuch-time-test script). Either command will run all
 available performance tests.
 
 Alternately, you can run a specific subset of tests by simply invoking
@@ -51,7 +51,7 @@ Each test script supports the following arguments
 Writing tests
 -
 
-Have a look at 01-dump-restore for an example. Sourcing
+Have a look at T01-dump-restore for an example. Sourcing
 perf-test-lib.sh is mandatory.  Utility functions include
 
 - 'add_email_corpus' unpacks a set of messages and adds them to the database.
@@ -65,4 +65,5 @@ Have a look at 01-dump-restore for an example. Sourcing
 
 Scripts are run in the order specified in notmuch-perf-test. In the
 future this order might be chosen automatically so please follow the
-convention of starting the name with two digits to specify the order.
+convention of starting the name with 'T' followed by two digits to
+specify the order.
diff --git a/performance-test/T00-new b/performance-test/T00-new
new file mode 100755
index 000..553bb8b
--- /dev/null
+++ b/performance-test/T00-new
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+test_description='notmuch new'
+
+. ./perf-test-lib.sh
+
+uncache_database
+
+time_start
+
+for i in $(seq 2 6); do
+time_run notmuch new #$i 'notmuch new'
+done
+
+time_done
diff --git a/performance-test/T01-dump-restore 
b/performance-test/T01-dump-restore
new file mode 100755
index 000..b2ff940
--- /dev/null
+++ 

[Patch v2 3/4] perf-test: initial version of memory test infrastructure.

2012-12-24 Thread david
From: David Bremner brem...@debian.org

The idea is run some code under valgrind --leak-check=full and report
a summary, leaving the user to peruse the log file if they want.

We go to some lengths to preserve the log files from accidental
overwriting; the full corpus takes about 3 hours to run under valgrind
on my machine.

The naming of the log directories may be slightly controversial; in
the unlikely event of two runs in less than a second, the log will be
overwritten. A previous version with mktemp+timestamp was dismissed as
overkill; just mktemp alone does not sort nicely.

One new test is included, to check notmuch new for memory leaks.
---
 performance-test/.gitignore   |1 +
 performance-test/M00-new  |   15 
 performance-test/Makefile.local   |   16 ++--
 performance-test/README   |   57 +++-
 performance-test/perf-test-lib.sh |   75 -
 5 files changed, 126 insertions(+), 38 deletions(-)
 create mode 100755 performance-test/M00-new

diff --git a/performance-test/.gitignore b/performance-test/.gitignore
index 6421a9a..f3f9be4 100644
--- a/performance-test/.gitignore
+++ b/performance-test/.gitignore
@@ -1,3 +1,4 @@
 tmp.*/
+log.*/
 corpus/
 notmuch.cache.*/
diff --git a/performance-test/M00-new b/performance-test/M00-new
new file mode 100755
index 000..99c3f52
--- /dev/null
+++ b/performance-test/M00-new
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+test_description='notmuch new'
+
+. ./perf-test-lib.sh
+
+# ensure initial 'notmuch new' is run by memory_start
+uncache_database
+
+memory_start
+
+# run 'notmuch new' a second time, to test different code paths
+memory_run notmuch new notmuch new
+
+memory_done
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 57beb44..73aa963 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -4,14 +4,24 @@ dir := performance-test
 
 include $(dir)/version.sh
 
+TIME_TEST_SCRIPT := ${dir}/notmuch-time-test
+MEMORY_TEST_SCRIPT := ${dir}/notmuch-memory-test
+
 CORPUS_NAME := notmuch-email-corpus-$(PERFTEST_VERSION).tar.xz
 TXZFILE := ${dir}/download/${CORPUS_NAME}
 SIGFILE := ${TXZFILE}.asc
-TEST_SCRIPT := ${dir}/notmuch-perf-test
 DEFAULT_URL :=  http://notmuchmail.org/releases/${CORPUS_NAME}
 
+perf-test: time-test memory-test
+
 time-test: setup-perf-test all
-   $(TEST_SCRIPT) $(OPTIONS)
+   @echo
+   $(TIME_TEST_SCRIPT) $(OPTIONS)
+
+memory-test: setup-perf-test all
+   @echo
+   $(MEMORY_TEST_SCRIPT) $(OPTIONS)
+
 
 .PHONY: download-corpus setup-perf-test
 
@@ -29,4 +39,4 @@ $(TXZFILE):
 download-corpus:
wget -O ${TXZFILE} ${DEFAULT_URL}
 
-CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus $(dir)/notmuch.cache.*
+CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/log.* $(dir)/corpus 
$(dir)/notmuch.cache.*
diff --git a/performance-test/README b/performance-test/README
index d1fb6de..996724c 100644
--- a/performance-test/README
+++ b/performance-test/README
@@ -1,3 +1,10 @@
+Performance Tests
+-
+
+This directory contains two kinds of performance tests: time tests,
+and memory tests. The former use gnu time, and the latter use
+valgrind.
+
 Pre-requisites
 --
 
@@ -5,9 +12,10 @@ In addition to having notmuch, you need:
 
 - gpg
 - gnu tar
-- gnu time
+- gnu time (for the time tests)
 - xz. Some speedup can be gotten by installing pixz, but this is
   probably only worthwhile if you are debugging the tests.
+- valgrind (for the memory tests)
 
 Getting set up to run tests:
 
@@ -36,34 +44,47 @@ for a list of mirrors.
 Running tests
 -
 
-The easiest way to run performance tests is to say make time-test, (or
-simply run the notmuch-time-test script). Either command will run all
-available performance tests.
-
-Alternately, you can run a specific subset of tests by simply invoking
-one of the executable scripts in this directory, (such as ./basic).
-Each test script supports the following arguments
+The easiest way to run performance tests is to say make perf-test.
+This will run all time and memory tests.  Be aware that the memory
+tests are quite time consuming when run on the full corpus, and that
+depending on your interests it may be more sensible to run make
+time-test or make memory-test.  You can also invoke one of the
+scripts notmuch-time-test or notmuch-memory-test or run a more
+specific subset of tests by simply invoking one of the executable
+scripts in this directory, (such as ./T00-new).  Each test script
+supports the following arguments
 
 --small / --medium / --large   Choose corpus size.
 --debugEnable debugging. In particular don't 
delete
temporary directories.
 
+When using the make targets, you can pass arguments to all test
+scripts by defining the make variable OPTIONS.
+
 Writing tests
 -
 
-Have a look at T01-dump-restore for an 

[Patch v2 4/4] perf-test: add memory leak test for dump restore

2012-12-24 Thread david
From: David Bremner brem...@debian.org

In id:87vcc2q5n2@nikula.org, Jani points out a memory leak in the
current version of the sup restore code. Among other things, this test
is intended to verify a fix for that leak.
---
 performance-test/M01-dump-restore |   15 +++
 1 file changed, 15 insertions(+)
 create mode 100755 performance-test/M01-dump-restore

diff --git a/performance-test/M01-dump-restore 
b/performance-test/M01-dump-restore
new file mode 100755
index 000..be5894a
--- /dev/null
+++ b/performance-test/M01-dump-restore
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+test_description='dump and restore'
+
+. ./perf-test-lib.sh
+
+memory_start
+
+memory_run 'load nmbug tags' 'notmuch restore --accumulate 
--input=corpus.tags/nmbug.sup-dump'
+memory_run 'dump *' 'notmuch dump --output=tags.sup'
+memory_run 'restore *' 'notmuch restore --input=tags.sup'
+memory_run 'dump --format=batch-tag *' 'notmuch dump --format=batch-tag 
--output=tags.bt'
+memory_run 'restore --format=batch-tag *' 'notmuch restore --format=batch-tag 
--input=tags.bt'
+
+memory_done
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] contrib: pick: slightly tweak running search and pick from pick buffer

2012-12-24 Thread David Bremner
Mark Walters markwalters1...@gmail.com writes:

 Previously running search or pick from the pick buffer did not close
 the message pane (if open). This meant that then new search ends up in
 a very small window. Fix this so that the message pane is
 shut. However, make it so that the pane is shut after the search
 string is entered in case the user is basing the search on something
 in the current message.
 ---

Seems fine to me, queuing to push.

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] emacs: show: make id links respect window

2012-12-24 Thread David Bremner
Mark Walters markwalters1...@gmail.com writes:

 I think this is a bug but that could be debated. It is particularly
 easy to trigger with notmuch pick because that uses the split pane
 while focus usually remains in the `pick' pane rather than the `show'
 pane.

I can imagine that people would want/like the open in other window 
effect of the current code, even if the reason is a bug.

 'action `(lambda (arg)
(notmuch-show ,(third link)))

Can you (or someone) explain why backquote is needed here? Is this some
kind of workaround for lack of lexical scope? If so, maybe a comment.

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] _notmuch_message_index_file: unref (free) address lists from gmime.

2012-12-24 Thread Austin Clements
This patch LGTM.

On Mon, 10 Dec 2012, da...@tethera.net wrote:
 From: David Bremner brem...@debian.org

 Apparently as of GMime 2.4, you don't need to call
 internet_address_list_destroy anymore, but you still need to call
 g_object_unref (from the GMime Changelog).

 On the medium performance corpus, valgrind shows possibly lost
 leakage in notmuch new dropping from 7M to 300k.
 ---
  lib/index.cc |   12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

 diff --git a/lib/index.cc b/lib/index.cc
 index da0e6ce..a2edd6d 100644
 --- a/lib/index.cc
 +++ b/lib/index.cc
 @@ -484,12 +484,18 @@ mboxes is deprecated and may be removed in the 
 future.\n, filename);
  }
  
  from = g_mime_message_get_sender (mime_message);
 -addresses = internet_address_list_parse_string (from);
  
 -_index_address_list (message, from, addresses);
 +addresses = internet_address_list_parse_string (from);
 +if (addresses) {
 + _index_address_list (message, from, addresses);
 + g_object_unref (addresses);
 +}
  
  addresses = g_mime_message_get_all_recipients (mime_message);
 -_index_address_list (message, to, addresses);
 +if (addresses) {
 + _index_address_list (message, to, addresses);
 + g_object_unref (addresses);
 +}
  
  subject = g_mime_message_get_subject (mime_message);
  _notmuch_message_gen_terms (message, subject, subject);
 -- 
 1.7.10.4

 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 1/1] lib/message-file.c: use g_malloc () g_free () in hash table values

2012-12-24 Thread Austin Clements
LGTM.

On Fri, 21 Dec 2012, Tomi Ollila tomi.oll...@iki.fi wrote:
 The message-headers hash table values get data returned by
 g_mime_utils_header_decode_text ().

 The pointer returned by g_mime_utils_header_decode_text is from the
 following line in rfc2047_decode_tokens

 return g_string_free (decoded, FALSE);

 The docs for g_string_free say

  Frees the memory allocated for the GString. If free_segment is TRUE
  it also frees the character data. If it's FALSE, the caller gains
  ownership of the buffer and must free it after use with g_free().

 The remaining frees and allocations referencing to message-headers hash
 values have been changed to use g_free and g_malloc functions.

 This combines and completes the changes started by David Bremner.
 ---

 This was meant to be in reply to id:87mwxkptqn.fsf@zancas.localnet
 but I fumbled it. ;)

  lib/message-file.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

 diff --git a/lib/message-file.c b/lib/message-file.c
 index 915aba8..4d9af89 100644
 --- a/lib/message-file.c
 +++ b/lib/message-file.c
 @@ -111,7 +111,7 @@ _notmuch_message_file_open_ctx (void *ctx, const char 
 *filename)
  message-headers = g_hash_table_new_full (strcase_hash,
 strcase_equal,
 free,
 -   free);
 +   g_free);
  
  message-parsing_started = 0;
  message-parsing_finished = 0;
 @@ -337,11 +337,11 @@ notmuch_message_file_get_header (notmuch_message_file_t 
 *message,
   /* we need to add the header to those we already collected */
   newhdr = strlen(decoded_value);
   hdrsofar = strlen(header_sofar);
 - combined_header = xmalloc(hdrsofar + newhdr + 2);
 + combined_header = g_malloc(hdrsofar + newhdr + 2);
   strncpy(combined_header,header_sofar,hdrsofar);
   *(combined_header+hdrsofar) = ' ';
   strncpy(combined_header+hdrsofar+1,decoded_value,newhdr+1);
 - free (decoded_value);
 + g_free (decoded_value);
   g_hash_table_insert (message-headers, header, combined_header);
   }
   } else {
 @@ -350,7 +350,7 @@ notmuch_message_file_get_header (notmuch_message_file_t 
 *message,
   g_hash_table_insert (message-headers, header, decoded_value);
   } else {
   free (header);
 - free (decoded_value);
 + g_free (decoded_value);
   decoded_value = header_sofar;
   }
   }
 -- 
 1.8.0

 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] contrib: pick: slightly tweak running search and pick from pick buffer

2012-12-24 Thread David Bremner
Mark Walters markwalters1...@gmail.com writes:

 Previously running search or pick from the pick buffer did not close
 the message pane (if open). This meant that then new search ends up in
 a very small window. Fix this so that the message pane is
 shut. However, make it so that the pane is shut after the search
 string is entered in case the user is basing the search on something
 in the current message.

pushed,

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] _notmuch_message_index_file: unref (free) address lists from gmime.

2012-12-24 Thread David Bremner
da...@tethera.net writes:

 From: David Bremner brem...@debian.org

 Apparently as of GMime 2.4, you don't need to call
 internet_address_list_destroy anymore, but you still need to call
 g_object_unref (from the GMime Changelog).


pushed, 

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 1/1] lib/message-file.c: use g_malloc () g_free () in hash table values

2012-12-24 Thread David Bremner
Tomi Ollila tomi.oll...@iki.fi writes:


 The remaining frees and allocations referencing to message-headers hash
 values have been changed to use g_free and g_malloc functions.


pushed,

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 0/5] Use Xapian query syntax for batch-tag dump/restore

2012-12-24 Thread Austin Clements
This is a stab at tweaking the new batch-tag dump/restore format to be
more amenable to batch tagging.  Currently, the query part of each
line isn't really a Xapian query; it's a hex-encoded message ID
prefixed with id:.  This is fine for the very limited case of
dump/restore, but it extends poorly to the general queries allowed by
batch tagging.  However, Xapian already has a perfectly good quoting
syntax.  This series switches the batch-tag format to use regular
Xapian queries, using only regular Xapian quoting syntax, so that we
don't have to invent yet another quoting syntax for batch tagging.

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 3/5] dump: Disallow \n in message IDs

2012-12-24 Thread Austin Clements
When we switch to using regular Xapian queries in the dump format, \n
will cause problems, so we disallow it.  Specially, while Xapian can
quote and parse queries containing \n without difficultly, quoted
queries containing \n still span multiple lines, which breaks the
line-orientedness of the dump format.  Strictly speaking, we could
still round-trip these, but it would significantly complicate restore
as well as scripts that deal with tag dumps.  This complexity would
come at absolutely no benefit: because of the RFC 2822 unfolding
rules, no amount of standards negligence can produce a message with a
message ID containing a line break (not even Outlook can do it!).

Hence, we simply disallow it.
---
 notmuch-dump.c   |9 +
 test/random-corpus.c |2 ++
 2 files changed, 11 insertions(+)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index d2dad40..29d79da 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -132,6 +132,15 @@ notmuch_dump_command (unused (void *ctx), int argc, char 
*argv[])
if (output_format == DUMP_FORMAT_SUP) {
fputs ()\n, output);
} else {
+   if (strchr (message_id, '\n')) {
+   /* This will produce a line break in the output, which
+* would be difficult to handle in tools.  However,
+* it's also impossible to produce an email containing
+* a line break in a message ID because of unfolding,
+* so we can safely disallow it. */
+   fprintf (stderr, Error: cannot dump message id containing line 
break: %s\n, message_id);
+   return 1;
+   }
if (hex_encode (notmuch, message_id,
buffer, buffer_size) != HEX_SUCCESS) {
fprintf (stderr, Error: failed to hex-encode msg-id %s\n,
diff --git a/test/random-corpus.c b/test/random-corpus.c
index f354d4b..d0e3e8f 100644
--- a/test/random-corpus.c
+++ b/test/random-corpus.c
@@ -97,6 +97,8 @@ random_utf8_string (void *ctx, size_t char_count)
}
 
randomchar = random_unichar ();
+   if (randomchar == '\n')
+   randomchar = 'x';
 
written = g_unichar_to_utf8 (randomchar, buf + offset);
 
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 5/5] man: Update notmuch-dump(1) for new batch-tag format

2012-12-24 Thread Austin Clements
---
 man/man1/notmuch-dump.1 |   11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/man/man1/notmuch-dump.1 b/man/man1/notmuch-dump.1
index 770b00f..7bd6def 100644
--- a/man/man1/notmuch-dump.1
+++ b/man/man1/notmuch-dump.1
@@ -64,13 +64,16 @@ and tags containing whitespace or non-\fBascii\fR(7) 
characters.
 Each line has the form
 
 .RS 4
-.RI + encoded-tag+ encoded-tag  ... --id: 
encoded-message-id 
+.RI + encoded-tag+ encoded-tag  ... --id: 
quoted-message-id 
 
-where encoded means that every byte not matching the regex
+Tags are hex-encoded by replacing every byte not matching the regex
 .B [A-Za-z0-9@=.,_+-]
-is replace by
+with
 .B %nn
-where nn is the two digit hex encoding.
+where nn is the two digit hex encoding.  The message ID is a valid Xapian
+query, quoted using Xapian boolean term quoting rules: if the ID contains
+whitespace or a close paren or starts with a double quote, it must be
+enclosed in double quotes and double quotes inside the ID must be doubled.
 The astute reader will notice this is a special case of the batch input
 format for \fBnotmuch-tag\fR(1); note that the single message-id query is
 mandatory for \fBnotmuch-restore\fR(1).
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 2/5] util: Function to parse boolean term queries

2012-12-24 Thread Austin Clements
This reproduces Xapian's parsing rules for boolean term queries.  This
is provided as a generic string utility, but will be used shortly in
notmuch restore to parse and optimize for ID queries.
---
 util/string-util.c |   63 
 util/string-util.h |   11 +
 2 files changed, 74 insertions(+)

diff --git a/util/string-util.c b/util/string-util.c
index 161a4dd..eaa6c99 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -94,3 +94,66 @@ make_boolean_term (void *ctx, const char *prefix, const char 
*term,
 
 return 0;
 }
+
+static int
+consume_double_quote (const char **str)
+{
+if (**str == '') {
+   ++*str;
+   return 1;
+} else if (strncmp(*str, \xe2\x80\x9c, 3) == 0 || /* UTF8 0x201c */
+  strncmp(*str, \xe2\x80\x9d, 3) == 0) { /* UTF8 0x201d */
+   *str += 3;
+   return 3;
+} else {
+   return 0;
+}
+}
+
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out)
+{
+*prefix_out = *term_out = NULL;
+
+/* Parse prefix */
+const char *pos = strchr (str, ':');
+if (! pos)
+   goto FAIL;
+*prefix_out = talloc_strndup (ctx, str, pos - str);
+++pos;
+
+/* Implement Xapian's boolean term de-quoting.  This is a nearly
+ * direct translation of QueryParser::Internal::parse_query. */
+pos = *term_out = talloc_strdup (ctx, pos);
+if (consume_double_quote (pos)) {
+   char *out = talloc_strdup (ctx, pos);
+   pos = *term_out = out;
+   while (1) {
+   if (! *pos) {
+   /* Premature end of string */
+   goto FAIL;
+   } else if (*pos == '') {
+   if (*++pos != '')
+   break;
+   } else if (consume_double_quote (pos)) {
+   break;
+   }
+   *out++ = *pos++;
+   }
+   if (*pos)
+   goto FAIL;
+   *out = '\0';
+} else {
+   while (*pos  ' '  *pos != ')')
+   ++pos;
+   if (*pos)
+   goto FAIL;
+}
+return 0;
+
+ FAIL:
+talloc_free (*prefix_out);
+talloc_free (*term_out);
+return 1;
+}
diff --git a/util/string-util.h b/util/string-util.h
index 7475e2c..e4e4c42 100644
--- a/util/string-util.h
+++ b/util/string-util.h
@@ -28,4 +28,15 @@ char *strtok_len (char *s, const char *delim, size_t *len);
 int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
   char **buf, size_t *len);
 
+/* Parse a boolean term query, returning the prefix in *prefix_out and
+ * the term in *term_out.  *prefix_out and *term_out will be talloc'd
+ * with context ctx.
+ *
+ * Return: 0 on success, non-zero on parse error (including trailing
+ * data in str).
+ */
+int
+parse_boolean_term (void *ctx, const char *str,
+   char **prefix_out, char **term_out);
+
 #endif
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 4/5] dump/restore: Use Xapian queries for batch-tag format

2012-12-24 Thread Austin Clements
This switches the new batch-tag format away from using a home-grown
hex-encoding scheme for message IDs in the dump to simply using Xapian
queries with Xapian quoting syntax.

This has a variety of advantages beyond presenting a cleaner and more
consistent interface.  Foremost is that it will dramatically simplify
the quoting for batch tagging, which shares the same input format.
While the hex-encoding is no better or worse for the simple ID queries
used by dump/restore, it becomes onerous for general-purpose queries
used in batch tagging.  It also better handles strange cases like
id:foo and bar, since this is no longer syntactically valid.
---
 notmuch-dump.c|9 +
 notmuch-restore.c |   22 ++
 tag-util.c|6 --
 test/dump-restore |   11 +--
 4 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index 29d79da..bf01a39 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -20,6 +20,7 @@
 
 #include notmuch-client.h
 #include dump-restore-private.h
+#include string-util.h
 
 int
 notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
@@ -141,13 +142,13 @@ notmuch_dump_command (unused (void *ctx), int argc, char 
*argv[])
fprintf (stderr, Error: cannot dump message id containing line 
break: %s\n, message_id);
return 1;
}
-   if (hex_encode (notmuch, message_id,
-   buffer, buffer_size) != HEX_SUCCESS) {
-   fprintf (stderr, Error: failed to hex-encode msg-id %s\n,
+   if (make_boolean_term (notmuch, id, message_id,
+  buffer, buffer_size)) {
+   fprintf (stderr, Error: failed to quote message id %s\n,
 message_id);
return 1;
}
-   fprintf (output,  -- id:%s\n, buffer);
+   fprintf (output,  -- %s\n, buffer);
}
 
notmuch_message_destroy (message);
diff --git a/notmuch-restore.c b/notmuch-restore.c
index 9ed9b51..77a4c27 100644
--- a/notmuch-restore.c
+++ b/notmuch-restore.c
@@ -207,7 +207,7 @@ notmuch_restore_command (unused (void *ctx), int argc, char 
*argv[])
INTERNAL_ERROR (compile time constant regex failed.);
 
 do {
-   char *query_string;
+   char *query_string, *prefix, *term;
 
if (line_ctx != NULL)
talloc_free (line_ctx);
@@ -220,19 +220,17 @@ notmuch_restore_command (unused (void *ctx), int argc, 
char *argv[])
  query_string, tag_ops);
 
if (ret == 0) {
-   if (strncmp (id:, query_string, 3) != 0) {
-   fprintf (stderr, Warning: unsupported query: %s\n, 
query_string);
+   ret = parse_boolean_term (line_ctx, query_string,
+ prefix, term);
+   if (ret) {
+   fprintf (stderr, Warning: cannot parse query: %s\n,
+query_string);
+   continue;
+   } else if (strcmp (id, prefix) != 0) {
+   fprintf (stderr, Warning: not an id query: %s\n, 
query_string);
continue;
}
-   /* delete id: from front of string; tag_message
-* expects a raw message-id.
-*
-* XXX: Note that query string id:foo and bar will be
-* interpreted as a message id foo and bar. This
-* should eventually be fixed to give a better error
-* message.
-*/
-   query_string = query_string + 3;
+   query_string = term;
}
}
 
diff --git a/tag-util.c b/tag-util.c
index 705b7ba..e4e5dda 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -124,12 +124,6 @@ parse_tag_line (void *ctx, char *line,
 }
 
 /* tok now points to the query string */
-if (hex_decode_inplace (tok) != HEX_SUCCESS) {
-   ret = line_error (TAG_PARSE_INVALID, line_for_error,
- hex decoding of query %s failed, tok);
-   goto DONE;
-}
-
 *query_string = tok;
 
   DONE:
diff --git a/test/dump-restore b/test/dump-restore
index 6a989b6..aecc393 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -195,23 +195,22 @@ a
 
 # the previous line was blank; also no yelling please
 +%zz -- id:whatever
-+e +f id:%yy
++e +f id:
++e +f tag:abc
 # the next non-comment line should report an an empty tag error for
 # batch tagging, but not for restore
 + +e -- id:20091117232137.ga7...@griffis1.net
-# highlight the sketchy id parsing; this should be last
-+g -- id:foo and bar
 EOF
 
 cat EOF  EXPECTED
-Warning: unsupported query: a
+Warning: cannot parse query: a
 Warning: no query string [+0]
 Warning: no query string [+a +b]
 Warning: missing query string [+a +b ]
 Warning: no query string after -- [+c +d --]
 Warning: hex decoding of tag %zz 

[PATCH 1/5] util: Factor out boolean term quoting routine

2012-12-24 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This is now a generic boolean term quoting function.  It performs
minimal quoting to produce user-friendly queries.

This could live in tag-util as well, but it is really nothing specific
to tags (although the conventions are specific to Xapian).

The API is changed from caller-allocates to readline-like.  The
scan for max tag length is pushed down into the quoting routine.
Furthermore, this now combines the term prefix with the quoted term;
arguably this is just as easy to do in the caller, but this will
nicely parallel the boolean term parsing function to be introduced
shortly.

This is an amalgamation of code written by David Bremner and myself.
---
 notmuch-tag.c  |   48 
 util/string-util.c |   62 
 util/string-util.h |9 
 3 files changed, 85 insertions(+), 34 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 88d559b..fc9d43a 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */
 
 #include notmuch-client.h
+#include string-util.h
 
 static volatile sig_atomic_t interrupted;
 
@@ -35,25 +36,6 @@ handle_sigint (unused (int sig))
 interrupted = 1;
 }
 
-static char *
-_escape_tag (char *buf, const char *tag)
-{
-const char *in = tag;
-char *out = buf;
-
-/* Boolean terms surrounded by double quotes can contain any
- * character.  Double quotes are quoted by doubling them. */
-*out++ = '';
-while (*in) {
-   if (*in == '')
-   *out++ = '';
-   *out++ = *in++;
-}
-*out++ = '';
-*out = 0;
-return buf;
-}
-
 typedef struct {
 const char *tag;
 notmuch_bool_t remove;
@@ -71,25 +53,16 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
  * parenthesize and the exclusion part of the query must not use
  * the '-' operator (though the NOT operator is fine). */
 
-char *escaped, *query_string;
+char *escaped = NULL;
+size_t escaped_len = 0;
+char *query_string;
 const char *join = ;
-int i;
-unsigned int max_tag_len = 0;
+size_t i;
 
 /* Don't optimize if there are no tag changes. */
 if (tag_ops[0].tag == NULL)
return talloc_strdup (ctx, orig_query_string);
 
-/* Allocate a buffer for escaping tags.  This is large enough to
- * hold a fully escaped tag with every character doubled plus
- * enclosing quotes and a NUL. */
-for (i = 0; tag_ops[i].tag; i++)
-   if (strlen (tag_ops[i].tag)  max_tag_len)
-   max_tag_len = strlen (tag_ops[i].tag);
-escaped = talloc_array (ctx, char, max_tag_len * 2 + 3);
-if (! escaped)
-   return NULL;
-
 /* Build the new query string */
 if (strcmp (orig_query_string, *) == 0)
query_string = talloc_strdup (ctx, ();
@@ -97,10 +70,17 @@ _optimize_tag_query (void *ctx, const char 
*orig_query_string,
query_string = talloc_asprintf (ctx, ( %s ) and (, orig_query_string);
 
 for (i = 0; tag_ops[i].tag  query_string; i++) {
+   /* XXX in case of OOM, query_string will be deallocated when
+* ctx is, which might be at shutdown */
+   if (make_boolean_term (ctx,
+  tag, tag_ops[i].tag,
+  escaped, escaped_len))
+   return NULL;
+
query_string = talloc_asprintf_append_buffer (
-   query_string, %s%stag:%s, join,
+   query_string, %s%s%s, join,
tag_ops[i].remove ?  : not ,
-   _escape_tag (escaped, tag_ops[i].tag));
+   escaped);
join =  or ;
 }
 
diff --git a/util/string-util.c b/util/string-util.c
index 44f8cd3..161a4dd 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -20,6 +20,7 @@
 
 
 #include string-util.h
+#include talloc.h
 
 char *
 strtok_len (char *s, const char *delim, size_t *len)
@@ -32,3 +33,64 @@ strtok_len (char *s, const char *delim, size_t *len)
 
 return *len ? s : NULL;
 }
+
+int
+make_boolean_term (void *ctx, const char *prefix, const char *term,
+  char **buf, size_t *len)
+{
+const char *in;
+char *out;
+size_t needed = 3;
+int need_quoting = 0;
+
+/* Do we need quoting? */
+for (in = term; *in  !need_quoting; in++)
+   if (*in = ' ' || *in == ')' || *in == '')
+   need_quoting = 1;
+
+if (need_quoting)
+   for (in = term; *in; in++)
+   needed += (*in == '') ? 2 : 1;
+else
+   needed = strlen (term) + 1;
+
+/* Reserve space for the prefix */
+if (prefix)
+   needed += strlen (prefix) + 1;
+
+if ((*buf == NULL) || (needed  *len)) {
+   *len = 2 * needed;
+   *buf = talloc_realloc (ctx, *buf, char, *len);
+}
+
+if (! *buf)
+   return 1;
+
+out = *buf;
+
+/* Copy in the prefix */
+if (prefix) {
+   strcpy (out, prefix);
+   out += strlen (prefix);
+   *out++ = ':';
+}
+
+if (! need_quoting) {
+