[PATCH v3 3/4] cli: Extend the search command for --output={sender, recipients}

2014-10-13 Thread Tomi Ollila
On Mon, Oct 13 2014, Michal Sojka  wrote:

> The new outputs allow printing senders, recipients or both of matching
> messages. The --output option is converted from "keyword" argument to
> "flags" argument, which means that the user can use --output=sender and
> --output=recipients simultaneously, to print both. Other combinations
> produce an error.
>
> This code is based on a patch from Jani Nikula.
> ---
>  completion/notmuch-completion.bash |   2 +-
>  completion/notmuch-completion.zsh  |   3 +-
>  doc/man1/notmuch-search.rst|  22 +++-
>  notmuch-search.c   | 110 ++---
>  test/T090-search-output.sh |  64 +
>  5 files changed, 189 insertions(+), 12 deletions(-)
>
> diff --git a/completion/notmuch-completion.bash 
> b/completion/notmuch-completion.bash
> index 0571dc9..cfbd389 100644
> --- a/completion/notmuch-completion.bash
> +++ b/completion/notmuch-completion.bash
> @@ -294,7 +294,7 @@ _notmuch_search()
>   return
>   ;;
>   --output)
> - COMPREPLY=( $( compgen -W "summary threads messages files tags" -- 
> "${cur}" ) )
> + COMPREPLY=( $( compgen -W "summary threads messages files tags 
> sender recipients" -- "${cur}" ) )
>   return
>   ;;
>   --sort)
> diff --git a/completion/notmuch-completion.zsh 
> b/completion/notmuch-completion.zsh
> index 67a9aba..3e52a00 100644
> --- a/completion/notmuch-completion.zsh
> +++ b/completion/notmuch-completion.zsh
> @@ -52,7 +52,8 @@ _notmuch_search()
>_arguments -s : \
>  '--max-threads=[display only the first x threads from the search 
> results]:number of threads to show: ' \
>  '--first=[omit the first x threads from the search results]:number of 
> threads to omit: ' \
> -'--sort=[sort results]:sorting:((newest-first\:"reverse chronological 
> order" oldest-first\:"chronological order"))'
> +'--sort=[sort results]:sorting:((newest-first\:"reverse chronological 
> order" oldest-first\:"chronological order"))' \
> +'--output=[select what to output]:output:((summary threads messages 
> files tags sender recipients))'
>  }
>  
>  _notmuch()
> diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
> index 90160f2..c9d38b1 100644
> --- a/doc/man1/notmuch-search.rst
> +++ b/doc/man1/notmuch-search.rst
> @@ -35,7 +35,7 @@ Supported options for **search** include
>  intended for programs that invoke **notmuch(1)** internally. If
>  omitted, the latest supported version will be used.
>  
> -``--output=(summary|threads|messages|files|tags)``
> +``--output=(summary|threads|messages|files|tags|sender|recipients)``
>  
>  **summary**
>  Output a summary of each thread with any message matching
> @@ -78,6 +78,26 @@ Supported options for **search** include
>  by null characters (--format=text0), as a JSON array
>  (--format=json), or as an S-Expression list (--format=sexp).
>  
> + **sender**
> +Output all addresses from the *From* header that appear on
> +any message matching the search terms, either one per line
> +(--format=text), separated by null characters
> +(--format=text0), as a JSON array (--format=json), or as
> +an S-Expression list (--format=sexp).
> +
> + Note: Searching for **sender** should be much faster than
> + searching for **recipients**, because sender addresses are
> + cached directly in the database whereas other addresses
> + need to be fetched from message files.
> +
> + **recipients**
> +Like **sender** but for addresses from *To*, *Cc* and
> + *Bcc* headers.
> +
> + This option can be given multiple times to combine different
> + outputs. Currently, this is only supported for **sender** and
> + **recipients** outputs.
> +
>  ``--sort=``\ (**newest-first**\ \|\ **oldest-first**)
>  This option can be used to present results in either
>  chronological order (**oldest-first**) or reverse chronological
> diff --git a/notmuch-search.c b/notmuch-search.c
> index 5ac2a26..74588f8 100644
> --- a/notmuch-search.c
> +++ b/notmuch-search.c
> @@ -23,11 +23,14 @@
>  #include "string-util.h"
>  
>  typedef enum {
> -OUTPUT_SUMMARY,
> -OUTPUT_THREADS,
> -OUTPUT_MESSAGES,
> -OUTPUT_FILES,
> -OUTPUT_TAGS
> +OUTPUT_SUMMARY   = 1 << 0,
> +OUTPUT_THREADS   = 1 << 1,
> +OUTPUT_MESSAGES  = 1 << 2,
> +OUTPUT_FILES = 1 << 3,
> +OUTPUT_TAGS  = 1 << 4,
> +OUTPUT_SENDER= 1 << 5,
> +OUTPUT_RECIPIENTS= 1 << 6,
> +OUTPUT_ADDRESSES = OUTPUT_SENDER | OUTPUT_RECIPIENTS,

leftover, like mentioned below (this comment added just before sending)

>  } output_t;
>  
>  typedef struct {
> @@ -220,6 +223,67 @@ do_search_threads (search_options_t *o)
>  return 0;
>  }
>  
> +static void
> 

VIM: search_refresh limits message count to 2 * window.height

2014-10-13 Thread Franz Fellner
The issue is that VIM::Buffer.render yields itself BEFORE it clears
itself.
Two quick solutions:

1) Simply manually fixup the mess in StagedRender::initialize after
@b.render {do_next } by adding
@last_render = @b.count

2) First clear the VIM::Buffer before yielding. This exposes one issue in
Vim's buffer handling: a newly created buffer has count==0, but after
the first line has been added you cannot get count==0 again, so a refresh
currently ends up with an empty line at the beginning.
It is possible to get the empty line at the end by implementing
VIM::Buffer.<<() as append(count()-1, arg)
Of course one has to add one line now directly after creating a new
buffer.

Solution 1) would be a simple oneliner but IMHO looks a little bit hacky
;)
Solution 2) at first looks ugly because of the empty line at the
end/beginning. But it also adds the opportunity to print additional
information, like description of the columns (date, thread participants,
subject, ...) at the beginning, or something like "end of search list",
"end of thread" at the end of the buffers.

Please tell me which one you like most and I can send a patch.


Regards
Franz

On Fri, 10 Oct 2014 17:56:23 +0200, Franz Fellner  
wrote:
> The reason is that StagedRender.is_ready depends on last_render, which
> gets set to VIM::Buffer.count() in StagedRender::do_next.
> I do not (yet) know what exactly happens, but after the first call to
> search refresh last_render never gets less than 2*2*window.height.
> That means once you do search_refresh StagedRender never will be ready -
> is_ready returns false, so s:show_cursor_moved never will advance the
> StagedRender.
> 
> I am trying to understand the code, but it's a hard time for me ;)


[WIP PATCH 4/4] lib: Add "lastmod:" queries for filtering by last modification

2014-10-13 Thread Austin Clements
From: Austin Clements 

XXX Includes reference to notmuch search --db-revision, which doesn't
exist.
---
 doc/man7/notmuch-search-terms.rst | 8 
 lib/database-private.h| 1 +
 lib/database.cc   | 4 
 3 files changed, 13 insertions(+)

diff --git a/doc/man7/notmuch-search-terms.rst 
b/doc/man7/notmuch-search-terms.rst
index 1acdaa0..df76e39 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -52,6 +52,8 @@ indicate user-supplied values):

 -  date:..

+-  lastmod:..
+
 The **from:** prefix is used to match the name or address of the sender
 of an email message.

@@ -118,6 +120,12 @@ The time range can also be specified using timestamps with 
a syntax of:
 Each timestamp is a number representing the number of seconds since
 1970-01-01 00:00:00 UTC.

+The **lastmod:** prefix can be used to restrict the result by the
+database revision number of when messages were last modified (tags
+were added/removed or filenames changed).  This is usually used in
+conjunction with the **--db-revision** argument to **notmuch search**
+to find messages that have changed since an earlier query.
+
 In addition to individual terms, multiple terms can be combined with
 Boolean operators ( **and**, **or**, **not** , etc.). Each term in the
 query will be implicitly connected by a logical AND if no explicit
diff --git a/lib/database-private.h b/lib/database-private.h
index 0977229..cbca1de 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -163,6 +163,7 @@ struct _notmuch_database {
 Xapian::TermGenerator *term_gen;
 Xapian::ValueRangeProcessor *value_range_processor;
 Xapian::ValueRangeProcessor *date_range_processor;
+Xapian::ValueRangeProcessor *last_mod_range_processor;
 };

 /* Prior to database version 3, features were implied by the database
diff --git a/lib/database.cc b/lib/database.cc
index 9bec170..f9aa45d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -913,6 +913,7 @@ notmuch_database_open (const char *path,
notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor 
(NOTMUCH_VALUE_TIMESTAMP);
notmuch->date_range_processor = new ParseTimeValueRangeProcessor 
(NOTMUCH_VALUE_TIMESTAMP);
+   notmuch->last_mod_range_processor = new 
Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_LAST_MOD, "lastmod:");

notmuch->query_parser->set_default_op (Xapian::Query::OP_AND);
notmuch->query_parser->set_database (*notmuch->xapian_db);
@@ -920,6 +921,7 @@ notmuch_database_open (const char *path,
notmuch->query_parser->set_stemming_strategy 
(Xapian::QueryParser::STEM_SOME);
notmuch->query_parser->add_valuerangeprocessor 
(notmuch->value_range_processor);
notmuch->query_parser->add_valuerangeprocessor 
(notmuch->date_range_processor);
+   notmuch->query_parser->add_valuerangeprocessor 
(notmuch->last_mod_range_processor);

for (i = 0; i < ARRAY_SIZE (BOOLEAN_PREFIX_EXTERNAL); i++) {
prefix_t *prefix = &BOOLEAN_PREFIX_EXTERNAL[i];
@@ -991,6 +993,8 @@ notmuch_database_close (notmuch_database_t *notmuch)
 notmuch->value_range_processor = NULL;
 delete notmuch->date_range_processor;
 notmuch->date_range_processor = NULL;
+delete notmuch->last_mod_range_processor;
+notmuch->last_mod_range_processor = NULL;

 return status;
 }
-- 
2.1.0



[WIP PATCH 3/4] lib: API to retrieve database revision and UUID

2014-10-13 Thread Austin Clements
This exposes the committed database revision to library users along
with a UUID that can be used to detect when revision numbers are no
longer comparable (e.g., because the database has been replaced).
---
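For illustration only, here is a minimal standalone sketch of how a
library user might combine the revision and the UUID when deciding
whether an incremental pass is safe.  The names db_state_t,
sync_plan_t and plan_sync are hypothetical and are not part of
libnotmuch; the sketch assumes the caller has already stored the
(uuid, revision) pair from an earlier run.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical snapshot of what a client remembers between runs. */
typedef struct {
    const char *uuid;        /* opaque database identifier */
    unsigned long revision;  /* committed revision at that time */
} db_state_t;

typedef enum { SYNC_UP_TO_DATE, SYNC_INCREMENTAL, SYNC_FULL } sync_plan_t;

/* Decide how to catch up from `saved` to `current`.  Revisions are
 * compared only when both states carry the same UUID. */
static sync_plan_t
plan_sync (const db_state_t *saved, const db_state_t *current)
{
    assert (saved && current && saved->uuid && current->uuid);

    if (strcmp (saved->uuid, current->uuid) != 0)
	return SYNC_FULL;		/* database replaced */
    if (current->revision == saved->revision)
	return SYNC_UP_TO_DATE;
    if (current->revision > saved->revision)
	return SYNC_INCREMENTAL;	/* look at changes > saved */
    return SYNC_FULL;			/* revision went backwards */
}
```

Falling back to a full pass whenever the UUIDs differ is the
conservative choice, since the two revision numbers then say nothing
about each other.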
 lib/database-private.h |  1 +
 lib/database.cc| 11 +++
 lib/notmuch.h  | 18 ++
 3 files changed, 30 insertions(+)

diff --git a/lib/database-private.h b/lib/database-private.h
index 465065d..0977229 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -157,6 +157,7 @@ struct _notmuch_database {
  * under a higher revision number, which can be generated with
  * notmuch_database_new_revision. */
 unsigned long revision;
+const char *uuid;

 Xapian::QueryParser *query_parser;
 Xapian::TermGenerator *term_gen;
diff --git a/lib/database.cc b/lib/database.cc
index 45d32ab..9bec170 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -905,6 +905,8 @@ notmuch_database_open (const char *path,
notmuch->revision = 0;
else
notmuch->revision = Xapian::sortable_unserialise (last_mod);
+   notmuch->uuid = talloc_strdup (
+   notmuch, notmuch->xapian_db->get_uuid ().c_str ());

notmuch->query_parser = new Xapian::QueryParser;
notmuch->term_gen = new Xapian::TermGenerator;
@@ -1562,6 +1564,15 @@ DONE:
 return NOTMUCH_STATUS_SUCCESS;
 }

+unsigned long
+notmuch_database_get_revisison (notmuch_database_t *notmuch,
+   const char **uuid)
+{
+if (uuid)
+   *uuid = notmuch->uuid;
+return notmuch->revision;
+}
+
 /* We allow the user to use arbitrarily long paths for directories. But
  * we have a term-length limit. So if we exceed that, we'll use the
  * SHA-1 of the path for the database term.
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 92594b9..898f7b9 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -433,6 +433,24 @@ notmuch_status_t
 notmuch_database_end_atomic (notmuch_database_t *notmuch);

 /**
+ * Return the committed database revision and UUID.
+ *
+ * The database revision number increases monotonically with each
+ * commit to the database.  Hence, all messages and message changes
+ * committed to the database (that is, visible to readers) have a last
+ * modification revision <= the committed database revision.  Any
+ * messages committed in the future will be assigned a modification
+ * revision > the committed database revision.
+ *
+ * The UUID is a NUL-terminated opaque string that uniquely identifies
+ * this database.  Two revision numbers are only comparable if they
+ * have the same database UUID.
+ */
+unsigned long
+notmuch_database_get_revisison (notmuch_database_t *notmuch,
+   const char **uuid);
+
+/**
  * Retrieve a directory object from the database for 'path'.
  *
  * Here, 'path' should be a path relative to the path of 'database'
-- 
2.1.0



[WIP PATCH 2/4] lib: Add per-message last modification tracking

2014-10-13 Thread Austin Clements
From: Austin Clements 

This adds a new document value that stores the revision of the last
modification to message metadata, where the revision number increases
monotonically with each database commit.

An alternative would be to store the wall-clock time of the last
modification of each message.  In principle this is simpler and has
the advantage that any process can determine the current timestamp
without support from libnotmuch.  However, even assuming a computer's
clock never goes backward and ignoring clock skew in networked
environments, this has a fatal flaw.  Xapian uses (optimistic)
snapshot isolation, which means reads can be concurrent with writes.
Given this, consider the following time line with a write and two read
transactions:

   write  |-X-A--------------|
   read 1       |---B---|
   read 2                      |---|

The write transaction modifies message X and records the wall-clock
time of the modification at A.  The writer hangs around for a while
and later commits its change.  Read 1 is concurrent with the write, so
it doesn't see the change to X.  It does some query and records the
wall-clock time of its results at B.  Transaction read 2 later starts
after the write commits and queries for changes since wall-clock time
B (say the reads are performing an incremental backup).  Even though
read 1 could not see the change to X, read 2 is told (correctly) that
X has not changed since B, the time of the last read.  In fact, X
changed before wall-clock time A, but the change was not visible until
*after* wall-clock time B, so read 2 misses the change to X.

This is tricky to solve in full-blown snapshot isolation, but because
Xapian serializes writes, we can use a simple, monotonically
increasing database revision number.  Furthermore, maintaining this
revision number requires no more IO than a wall-clock time solution
because Xapian already maintains statistics on the upper (and lower)
bound of each value stream.
---
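The time line above can be replayed with a small standalone sketch.
It is only a model: record_t and both helper functions are
hypothetical, and the concrete times and revisions used in the usage
note are made-up values chosen to match the diagram.

```c
#include <assert.h>

/* What is stored for message X once the write has committed. */
typedef struct {
    long mod_time;          /* wall-clock time recorded by the writer (A) */
    unsigned long mod_rev;  /* revision assigned to the write */
} record_t;

/* Wall-clock scheme: report messages modified after time `since`. */
static int
changed_since_time (const record_t *r, long since)
{
    assert (r != 0);
    return r->mod_time > since;
}

/* Revision scheme: report messages modified after the committed
 * revision `since_rev` that the previous reader observed. */
static int
changed_since_rev (const record_t *r, unsigned long since_rev)
{
    assert (r != 0);
    return r->mod_rev > since_rev;
}
```

Say the writer stamps X at A = 10 and commits later as revision 1,
while read 1 runs at B = 20 and still sees committed revision 0.  For
read 2, changed_since_time (&x, 20) is false, so the change is missed,
whereas changed_since_rev (&x, 0) is true.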
 lib/database-private.h | 15 ++-
 lib/database.cc| 49 +++--
 lib/message.cc | 22 ++
 lib/notmuch-private.h  | 10 +-
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index 15e03cc..465065d 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -92,6 +92,12 @@ enum _notmuch_features {
  *
  * Introduced: version 3. */
 NOTMUCH_FEATURE_GHOSTS = 1 << 4,
+
+/* If set, messages store the revision number of the last
+ * modification in NOTMUCH_VALUE_LAST_MOD.
+ *
+ * Introduced: version 3. */
+NOTMUCH_FEATURE_LAST_MOD = 1 << 5,
 };

 /* In C++, a named enum is its own type, so define bitwise operators
@@ -137,6 +143,8 @@ struct _notmuch_database {

 notmuch_database_mode_t mode;
 int atomic_nesting;
+/* TRUE if changes have been made in this atomic section */
+notmuch_bool_t atomic_dirty;
 Xapian::Database *xapian_db;

 /* Bit mask of features used by this database.  This is a
@@ -145,6 +153,10 @@ struct _notmuch_database {

 unsigned int last_doc_id;
 uint64_t last_thread_id;
+/* Highest committed revision number.  Modifications are recorded
+ * under a higher revision number, which can be generated with
+ * notmuch_database_new_revision. */
+unsigned long revision;

 Xapian::QueryParser *query_parser;
 Xapian::TermGenerator *term_gen;
@@ -166,7 +178,8 @@ struct _notmuch_database {
  * databases will have it). */
 #define NOTMUCH_FEATURES_CURRENT \
 (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
- NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)
+ NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS | \
+ NOTMUCH_FEATURE_LAST_MOD)

 /* Return the list of terms from the given iterator matching a prefix.
  * The prefix will be stripped from the strings in the returned list.
diff --git a/lib/database.cc b/lib/database.cc
index 6e51a72..45d32ab 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -101,6 +101,9 @@ typedef struct {
  *
  * SUBJECT:The value of the "Subject" header
  *
+ * LAST_MOD:   The revision number as of the last tag or
+ * filename change.
+ *
  * In addition, terms from the content of the message are added with
  * "from", "to", "attachment", and "subject" prefixes for use by the
  * user in searching. Similarly, terms from the path of the mail
@@ -304,6 +307,8 @@ static const struct {
   "exact folder:/path: search", "rw" },
 { NOTMUCH_FEATURE_GHOSTS,
   "mail documents for missing messages", "w"},
+{ NOTMUCH_FEATURE_LAST_MOD,
+  "modification tracking", "w"},
 };

 const char *
@@ -678,6 +683,23 @@ _notmuch_database_ensure_writable (notmuch_database_t 
*notmuch)
 return NOTMUCH_STATUS_SUCCESS;
 }

+/* Allocate a revision number for the next change. */
+unsigned long

[WIP PATCH 1/4] lib: Only sync modified message documents

2014-10-13 Thread Austin Clements
From: Austin Clements 

Previously, we updated the database copy of a message on every call to
_notmuch_message_sync, even if nothing had changed.  In particular,
this always happens on a thaw, so a freeze/thaw pair with no
modifications between still caused a database update.

We only modify message documents in a handful of places, so keep track
of whether the document has been modified and only sync it when
necessary.  This will be particularly important when we add message
revision tracking.
---
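As a rough standalone sketch of the idea (not the notmuch code
itself), the following models a message with a "modified" flag and
counts how often a sync actually reaches the database.  message_t,
message_add_tag and message_sync are hypothetical names; db_writes
stands in for the call to replace_document.

```c
#include <assert.h>

/* Hypothetical miniature of a message and its backing store. */
typedef struct {
    int modified;   /* non-zero if changed since the last sync */
    int tag_count;  /* stand-in for the document contents */
    int db_writes;  /* number of simulated database updates */
} message_t;

/* Every mutation marks the message as modified. */
static void
message_add_tag (message_t *m)
{
    assert (m != 0);
    m->tag_count++;
    m->modified = 1;
}

/* Write the message out only if something changed. */
static void
message_sync (message_t *m)
{
    assert (m != 0);
    if (! m->modified)
	return;
    m->db_writes++;   /* stands in for replace_document */
    m->modified = 0;
}
```

With this in place, a freeze/thaw pair that changes nothing ends in a
sync that returns early, and repeated syncs after one change cause a
single write.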
 lib/message.cc | 12 
 1 file changed, 12 insertions(+)

diff --git a/lib/message.cc b/lib/message.cc
index a7a13cc..cf2fd7c 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -43,6 +43,9 @@ struct visible _notmuch_message {
  * if each flag has been initialized. */
 unsigned long lazy_flags;

+/* Message document modified since last sync */
+notmuch_bool_t modified;
+
 Xapian::Document doc;
 Xapian::termcount termpos;
 };
@@ -538,6 +541,7 @@ _notmuch_message_remove_terms (notmuch_message_t *message, 
const char *prefix)

try {
message->doc.remove_term ((*i));
+   message->modified = TRUE;
} catch (const Xapian::InvalidArgumentError) {
/* Ignore failure to remove non-existent term. */
}
@@ -791,6 +795,7 @@ void
 _notmuch_message_clear_data (notmuch_message_t *message)
 {
 message->doc.set_data ("");
+message->modified = TRUE;
 }

 static void
@@ -988,6 +993,7 @@ _notmuch_message_set_header_values (notmuch_message_t 
*message,
Xapian::sortable_serialise (time_value));
 message->doc.add_value (NOTMUCH_VALUE_FROM, from);
 message->doc.add_value (NOTMUCH_VALUE_SUBJECT, subject);
+message->modified = TRUE;
 }

 /* Synchronize changes made to message->doc out into the database. */
@@ -999,8 +1005,12 @@ _notmuch_message_sync (notmuch_message_t *message)
 if (message->notmuch->mode == NOTMUCH_DATABASE_MODE_READ_ONLY)
return;

+if (! message->modified)
+   return;
+
 db = static_cast <Xapian::WritableDatabase *> 
(message->notmuch->xapian_db);
 db->replace_document (message->doc_id, message->doc);
+message->modified = FALSE;
 }

 /* Delete a message document from the database. */
@@ -1075,6 +1085,7 @@ _notmuch_message_add_term (notmuch_message_t *message,
return NOTMUCH_PRIVATE_STATUS_TERM_TOO_LONG;

 message->doc.add_term (term, 0);
+message->modified = TRUE;

 talloc_free (term);

@@ -1143,6 +1154,7 @@ _notmuch_message_remove_term (notmuch_message_t *message,

 try {
message->doc.remove_term (term);
+   message->modified = TRUE;
 } catch (const Xapian::InvalidArgumentError) {
/* We'll let the philosopher's try to wrestle with the
 * question of whether failing to remove that which was not
-- 
2.1.0



[WIP PATCH 0/4] Add message revision tracking

2014-10-13 Thread Austin Clements
This implements message revision tracking.  This is definitely a
work-in-progress, but I wanted to post it since I don't know when I'll
be able to work on it next (and maybe someone else can run with it in
the meantime).  I think this makes all of the necessary library-side
changes, but doesn't do anything in the CLI to expose current revision
information other than adding support for a "lastmod" query.

This series applies on top of the ghost message series, but only
because of trivial conflicts in the lists of features in database.cc
and database-private.h.  There's no code dependency.



[PATCH v3 4/4] cli: Add an option to filter out duplicate addresses

2014-10-13 Thread Michal Sojka
This adds a --filter-by option to "notmuch search". It can be used to
filter out duplicate addresses in --output=sender/recipients.

The code here is an extended version of a patch from Jani Nikula.
---
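To make the intended semantics easier to discuss, here is a small
standalone sketch of the filtering described in the documentation
hunk below.  It is an assumption-laden model, not the code in this
patch: mailbox_t, the FILTER_* names, is_duplicate and filter_unique
are hypothetical, the linear scan is chosen for brevity (real code
would more likely use a hash table), and the addresses in the usage
note are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical flag set; ADDRFOLD implies address comparison. */
enum {
    FILTER_ADDR     = 1 << 0,  /* compare address, case-sensitive */
    FILTER_NAME     = 1 << 1,  /* compare name, case-sensitive */
    FILTER_ADDRFOLD = 1 << 2,  /* compare address, ignoring case */
};

typedef struct { const char *name; const char *addr; } mailbox_t;

/* Two mailboxes are duplicates if every selected part matches. */
static int
is_duplicate (int flags, const mailbox_t *a, const mailbox_t *b)
{
    if ((flags & FILTER_NAME) && strcmp (a->name, b->name) != 0)
	return 0;
    if (flags & FILTER_ADDRFOLD) {
	if (strcasecmp (a->addr, b->addr) != 0)
	    return 0;
    } else if ((flags & FILTER_ADDR) && strcmp (a->addr, b->addr) != 0) {
	return 0;
    }
    return 1;
}

/* Store in out[] each entry of in[] that does not duplicate an
 * earlier kept entry, preserving order.  Returns the number kept.
 * The caller is assumed to skip filtering when no flag is set. */
static size_t
filter_unique (int flags, const mailbox_t *in, size_t n,
	       const mailbox_t **out)
{
    size_t kept = 0;

    assert (flags != 0);
    for (size_t i = 0; i < n; i++) {
	size_t j;
	for (j = 0; j < kept; j++)
	    if (is_duplicate (flags, &in[i], out[j]))
		break;
	if (j == kept)
	    out[kept++] = &in[i];
    }
    return kept;
}
```

For example, given "John Doe" and "Dr. John Doe" with the same
address, FILTER_ADDR keeps only the first, while
FILTER_NAME | FILTER_ADDR keeps both because the name parts differ.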
 completion/notmuch-completion.bash |  6 ++-
 completion/notmuch-completion.zsh  |  3 +-
 doc/man1/notmuch-search.rst| 32 +
 notmuch-search.c   | 93 +++---
 test/T095-search-filter-by.sh  | 55 ++
 5 files changed, 181 insertions(+), 8 deletions(-)
 create mode 100755 test/T095-search-filter-by.sh

diff --git a/completion/notmuch-completion.bash 
b/completion/notmuch-completion.bash
index cfbd389..41dd85b 100644
--- a/completion/notmuch-completion.bash
+++ b/completion/notmuch-completion.bash
@@ -305,12 +305,16 @@ _notmuch_search()
COMPREPLY=( $( compgen -W "true false flag all" -- "${cur}" ) )
return
;;
+   --filter-by)
+   COMPREPLY=( $( compgen -W "addr addrfold name" -- "${cur}" ) )
+   return
+   ;;
 esac

 ! $split &&
 case "${cur}" in
-*)
-   local options="--format= --output= --sort= --offset= --limit= 
--exclude= --duplicate="
+   local options="--format= --output= --sort= --offset= --limit= 
--exclude= --duplicate= --filter-by="
compopt -o nospace
COMPREPLY=( $(compgen -W "$options" -- ${cur}) )
;;
diff --git a/completion/notmuch-completion.zsh 
b/completion/notmuch-completion.zsh
index 3e52a00..17b345f 100644
--- a/completion/notmuch-completion.zsh
+++ b/completion/notmuch-completion.zsh
@@ -53,7 +53,8 @@ _notmuch_search()
 '--max-threads=[display only the first x threads from the search 
results]:number of threads to show: ' \
 '--first=[omit the first x threads from the search results]:number of 
threads to omit: ' \
 '--sort=[sort results]:sorting:((newest-first\:"reverse chronological 
order" oldest-first\:"chronological order"))' \
-'--output=[select what to output]:output:((summary threads messages files 
tags sender recipients))'
+'--output=[select what to output]:output:((summary threads messages files 
tags sender recipients))' \
+'--filter-by=[filter out duplicate addresses]:filter-by:((addr\:"address 
part" addrfold\:"case-insensitive address part" name\:"name part"))'
 }

 _notmuch()
diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
index c9d38b1..0fed76e 100644
--- a/doc/man1/notmuch-search.rst
+++ b/doc/man1/notmuch-search.rst
@@ -85,6 +85,9 @@ Supported options for **search** include
 (--format=text0), as a JSON array (--format=json), or as
 an S-Expression list (--format=sexp).

+Handling of duplicate addresses and/or names can be
+controlled with the --filter-by option.
+
Note: Searching for **sender** should be much faster than
searching for **recipients**, because sender addresses are
cached directly in the database whereas other addresses
@@ -151,6 +154,35 @@ Supported options for **search** include
 prefix. The prefix matches messages based on filenames. This
 option filters filenames of the matching messages.

+``--filter-by=``\ (**addr**\ \|\ **addrfold**\ \|\ **name**)
+
+   Can be used with ``--output=sender`` or
+   ``--output=recipients`` to filter out duplicate addresses. The
+   filtering algorithm receives a sequence of email addresses and
+   outputs the same sequence without the addresses that are
+   considered a duplicate of a previously output address. What is
+   considered a duplicate depends on how two addresses are
+   compared and this can be controlled by the following flags:
+
+   **addr** means that the address part is compared in
+   case-sensitive manner. For example, the addresses "John Doe
+   " and "Dr. John Doe " will
+   be considered duplicate.
+
+   **addrfold** is similar to **addr**, but in addition to it
+   case folding is performed before comparison. For example, the
+   addresses "John Doe " and "Dr. John Doe
+   " will be considered duplicate.
+
+   **name** means that the name part is compared in case-sensitive
+   manner. For example, the addresses "John Doe "
+   and "John Doe " will be considered duplicate.
+
+   This option can be given multiple times to combine the effects
+   of the flags. For example,
+   ``--filter-by=name --filter-by=addr`` will print unique
+   case-sensitive combinations of both name and address parts.
+
 EXIT STATUS
 ===

diff --git a/notmuch-search.c b/notmuch-search.c
index 74588f8..df678ad 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -33,6 +33,12 @@ typedef enum {
 OUTPUT_ADDRESSES   = OUTPUT_SENDER | OUTPUT_RECIPIENTS,
 } output_t;

+typedef enum {
+FILTER_FLAG_ADDR  = 1 << 0,
+FILTER_FLAG_NAME  = 1 << 1,
+

[PATCH v3 3/4] cli: Extend the search command for --output={sender, recipients}

2014-10-13 Thread Michal Sojka
The new outputs allow printing senders, recipients or both of matching
messages. The --output option is converted from "keyword" argument to
"flags" argument, which means that the user can use --output=sender and
--output=recipients simultaneously, to print both. Other combinations
produce an error.

This code is based on a patch from Jani Nikula.
---
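As a reading aid, the rule "sender and recipients may be combined,
other combinations produce an error" can be sketched as a standalone
check.  The enum values mirror the ones in this patch, but
output_is_valid is a hypothetical helper and not necessarily how
notmuch-search.c performs the check; it also assumes that a default
output has already been substituted when no --output was given.

```c
#include <assert.h>

typedef enum {
    OUTPUT_SUMMARY    = 1 << 0,
    OUTPUT_THREADS    = 1 << 1,
    OUTPUT_MESSAGES   = 1 << 2,
    OUTPUT_FILES      = 1 << 3,
    OUTPUT_TAGS       = 1 << 4,
    OUTPUT_SENDER     = 1 << 5,
    OUTPUT_RECIPIENTS = 1 << 6,
    OUTPUT_ADDRESSES  = OUTPUT_SENDER | OUTPUT_RECIPIENTS,
} output_t;

/* Return 1 if the OR'd --output flags form a supported combination. */
static int
output_is_valid (int output)
{
    assert (output >= 0);
    if (output == 0)
	return 0;			/* nothing selected */
    if ((output & ~OUTPUT_ADDRESSES) == 0)
	return 1;			/* sender, recipients or both */
    return (output & (output - 1)) == 0; /* exactly one other output */
}
```

The last line relies on the usual bit trick: a positive value has
exactly one bit set when clearing its lowest set bit leaves zero.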
 completion/notmuch-completion.bash |   2 +-
 completion/notmuch-completion.zsh  |   3 +-
 doc/man1/notmuch-search.rst|  22 +++-
 notmuch-search.c   | 110 ++---
 test/T090-search-output.sh |  64 +
 5 files changed, 189 insertions(+), 12 deletions(-)

diff --git a/completion/notmuch-completion.bash 
b/completion/notmuch-completion.bash
index 0571dc9..cfbd389 100644
--- a/completion/notmuch-completion.bash
+++ b/completion/notmuch-completion.bash
@@ -294,7 +294,7 @@ _notmuch_search()
return
;;
--output)
-   COMPREPLY=( $( compgen -W "summary threads messages files tags" -- 
"${cur}" ) )
+   COMPREPLY=( $( compgen -W "summary threads messages files tags 
sender recipients" -- "${cur}" ) )
return
;;
--sort)
diff --git a/completion/notmuch-completion.zsh 
b/completion/notmuch-completion.zsh
index 67a9aba..3e52a00 100644
--- a/completion/notmuch-completion.zsh
+++ b/completion/notmuch-completion.zsh
@@ -52,7 +52,8 @@ _notmuch_search()
   _arguments -s : \
 '--max-threads=[display only the first x threads from the search 
results]:number of threads to show: ' \
 '--first=[omit the first x threads from the search results]:number of 
threads to omit: ' \
-'--sort=[sort results]:sorting:((newest-first\:"reverse chronological 
order" oldest-first\:"chronological order"))'
+'--sort=[sort results]:sorting:((newest-first\:"reverse chronological 
order" oldest-first\:"chronological order"))' \
+'--output=[select what to output]:output:((summary threads messages files 
tags sender recipients))'
 }

 _notmuch()
diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
index 90160f2..c9d38b1 100644
--- a/doc/man1/notmuch-search.rst
+++ b/doc/man1/notmuch-search.rst
@@ -35,7 +35,7 @@ Supported options for **search** include
 intended for programs that invoke **notmuch(1)** internally. If
 omitted, the latest supported version will be used.

-``--output=(summary|threads|messages|files|tags)``
+``--output=(summary|threads|messages|files|tags|sender|recipients)``

 **summary**
 Output a summary of each thread with any message matching
@@ -78,6 +78,26 @@ Supported options for **search** include
 by null characters (--format=text0), as a JSON array
 (--format=json), or as an S-Expression list (--format=sexp).

+   **sender**
+Output all addresses from the *From* header that appear on
+any message matching the search terms, either one per line
+(--format=text), separated by null characters
+(--format=text0), as a JSON array (--format=json), or as
+an S-Expression list (--format=sexp).
+
+   Note: Searching for **sender** should be much faster than
+   searching for **recipients**, because sender addresses are
+   cached directly in the database whereas other addresses
+   need to be fetched from message files.
+
+   **recipients**
+Like **sender** but for addresses from *To*, *Cc* and
+   *Bcc* headers.
+
+   This option can be given multiple times to combine different
+   outputs. Currently, this is only supported for **sender** and
+   **recipients** outputs.
+
 ``--sort=``\ (**newest-first**\ \|\ **oldest-first**)
 This option can be used to present results in either
 chronological order (**oldest-first**) or reverse chronological
diff --git a/notmuch-search.c b/notmuch-search.c
index 5ac2a26..74588f8 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -23,11 +23,14 @@
 #include "string-util.h"

 typedef enum {
-OUTPUT_SUMMARY,
-OUTPUT_THREADS,
-OUTPUT_MESSAGES,
-OUTPUT_FILES,
-OUTPUT_TAGS
+OUTPUT_SUMMARY = 1 << 0,
+OUTPUT_THREADS = 1 << 1,
+OUTPUT_MESSAGES= 1 << 2,
+OUTPUT_FILES   = 1 << 3,
+OUTPUT_TAGS= 1 << 4,
+OUTPUT_SENDER  = 1 << 5,
+OUTPUT_RECIPIENTS  = 1 << 6,
+OUTPUT_ADDRESSES   = OUTPUT_SENDER | OUTPUT_RECIPIENTS,
 } output_t;

 typedef struct {
@@ -220,6 +223,67 @@ do_search_threads (search_options_t *o)
 return 0;
 }

+static void
+print_address_list (const search_options_t *o, InternetAddressList *list)
+{
+InternetAddress *address;
+int i;
+
+for (i = 0; i < internet_address_list_length (list); i++) {
+   address = internet_address_list_get_address (list, i);
+   if (INTERNET_ADDRESS_IS_GROUP (address)) {
+   InternetAddressGroup 

[PATCH v3 2/4] cli: Add support for parsing multiple keyword arguments

2014-10-13 Thread Michal Sojka
From: Jani Nikula 

This allows having multiple --foo=bar --foo=baz options on the command
line, with the corresponding values OR'd together.

[Test added by Michal Sojka]
---
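For reference, the difference between the two option types can be
shown with a minimal standalone sketch.  keyword_t, opt_type_t,
process_keyword and flag_keywords are hypothetical stand-ins that
only mirror the shape of the code in this patch; the keyword table
uses the same values as the new test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for notmuch_keyword_t and the option types. */
typedef struct { const char *name; int value; } keyword_t;
typedef enum { OPT_KEYWORD, OPT_KEYWORD_FLAGS } opt_type_t;

/* Look up `arg` in the NULL-terminated keyword table.  For
 * OPT_KEYWORD the value replaces *out; for OPT_KEYWORD_FLAGS it is
 * OR'd into *out.  Returns 1 if the keyword is known, 0 otherwise. */
static int
process_keyword (opt_type_t type, const keyword_t *keywords,
		 const char *arg, int *out)
{
    assert (keywords != NULL && arg != NULL && out != NULL);
    for (; keywords->name; keywords++) {
	if (strcmp (arg, keywords->name) == 0) {
	    if (type == OPT_KEYWORD_FLAGS)
		*out |= keywords->value;
	    else
		*out = keywords->value;
	    return 1;
	}
    }
    return 0;
}

static const keyword_t flag_keywords[] = {
    { "one", 1 << 0 }, { "two", 1 << 1 }, { "three", 1 << 2 },
    { NULL, 0 }
};
```

Processing "one" and then "three" as flags leaves 1 | 4 = 5 in the
output variable, which is the "flags 5" line the updated test
expects; as a plain keyword the second value would simply overwrite
the first.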
 command-line-arguments.c  | 6 +-
 command-line-arguments.h  | 1 +
 test/T410-argument-parsing.sh | 3 ++-
 test/arg-test.c   | 9 +
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/command-line-arguments.c b/command-line-arguments.c
index 844d6c3..c6f7269 100644
--- a/command-line-arguments.c
+++ b/command-line-arguments.c
@@ -23,7 +23,10 @@ _process_keyword_arg (const notmuch_opt_desc_t *arg_desc, 
char next, const char
 while (keywords->name) {
if (strcmp (arg_str, keywords->name) == 0) {
if (arg_desc->output_var) {
-   *((int *)arg_desc->output_var) = keywords->value;
+   if (arg_desc->opt_type == NOTMUCH_OPT_KEYWORD_FLAGS)
+   *((int *)arg_desc->output_var) |= keywords->value;
+   else
+   *((int *)arg_desc->output_var) = keywords->value;
}
return TRUE;
}
@@ -152,6 +155,7 @@ parse_option (const char *arg,

switch (try->opt_type) {
case NOTMUCH_OPT_KEYWORD:
+   case NOTMUCH_OPT_KEYWORD_FLAGS:
return _process_keyword_arg (try, next, value);
case NOTMUCH_OPT_BOOLEAN:
return _process_boolean_arg (try, next, value);
diff --git a/command-line-arguments.h b/command-line-arguments.h
index de1734a..085a492 100644
--- a/command-line-arguments.h
+++ b/command-line-arguments.h
@@ -8,6 +8,7 @@ enum notmuch_opt_type {
 NOTMUCH_OPT_BOOLEAN,   /* --verbose  */
 NOTMUCH_OPT_INT,   /* --frob=8   */
 NOTMUCH_OPT_KEYWORD,   /* --format=raw|json|text */
+NOTMUCH_OPT_KEYWORD_FLAGS,  /* the above with values OR'd together */
 NOTMUCH_OPT_STRING,/* --file=/tmp/gnarf.txt  */
 NOTMUCH_OPT_POSITION   /* notmuch dump pos_arg   */
 };
diff --git a/test/T410-argument-parsing.sh b/test/T410-argument-parsing.sh
index 94e9087..2e5d7ae 100755
--- a/test/T410-argument-parsing.sh
+++ b/test/T410-argument-parsing.sh
@@ -3,9 +3,10 @@ test_description="argument parsing"
 . ./test-lib.sh

 test_begin_subtest "sanity check"
-$TEST_DIRECTORY/arg-test  pos1  --keyword=one --string=foo pos2 --int=7 > 
OUTPUT
+$TEST_DIRECTORY/arg-test  pos1  --keyword=one --string=foo pos2 --int=7 
--flag=one --flag=three > OUTPUT
 cat <<EOF > EXPECTED
 keyword 1
+flags 5
 int 7
 string foo
 positional arg 1 pos1
diff --git a/test/arg-test.c b/test/arg-test.c
index 6c49eac..736686d 100644
--- a/test/arg-test.c
+++ b/test/arg-test.c
@@ -7,6 +7,7 @@ int main(int argc, char **argv){
 int opt_index=1;

 int kw_val=0;
+int fl_val=0;
 int int_val=0;
 char *pos_arg1=NULL;
 char *pos_arg2=NULL;
@@ -17,6 +18,11 @@ int main(int argc, char **argv){
  (notmuch_keyword_t []){ { "one", 1 },
  { "two", 2 },
  { 0, 0 } } },
+   { NOTMUCH_OPT_KEYWORD_FLAGS, &fl_val, "flag", 'f',
+ (notmuch_keyword_t []){ { "one",   1 << 0},
+ { "two",   1 << 1 },
+ { "three", 1 << 2 },
+ { 0, 0 } } },
{ NOTMUCH_OPT_INT, &int_val, "int", 'i', 0},
{ NOTMUCH_OPT_STRING, _val, "string", 's', 0},
{ NOTMUCH_OPT_POSITION, &pos_arg1, 0,0, 0},
@@ -31,6 +37,9 @@ int main(int argc, char **argv){
 if (kw_val)
printf("keyword %d\n", kw_val);

+if (fl_val)
+   printf("flags %d\n", fl_val);
+
 if (int_val)
printf("int %d\n", int_val);

-- 
2.1.1
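
[Editor's sketch, not part of the patch.] The OR-ing behaviour described in the commit message can be tried in isolation. The types and names below (keyword_t, opt_type_t, process_keyword, flag_keywords) are simplified stand-ins invented for illustration; they are not the notmuch definitions, although the table mirrors the "one", "two", "three" flags used in test/arg-test.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a keyword table entry. */
typedef struct {
    const char *name;
    int value;
} keyword_t;

/* Hypothetical option kinds: plain keyword vs. OR-able keyword flags. */
typedef enum { OPT_KEYWORD, OPT_KEYWORD_FLAGS } opt_type_t;

/* Look up arg_str in a table terminated by a NULL name.
 * A plain keyword overwrites *output; a flags keyword ORs into it.
 * Returns 1 on a match and 0 if the keyword is unknown. */
static int
process_keyword (opt_type_t type, const keyword_t *keywords,
                 const char *arg_str, int *output)
{
    for (; keywords->name; keywords++) {
        if (strcmp (arg_str, keywords->name) == 0) {
            if (type == OPT_KEYWORD_FLAGS)
                *output |= keywords->value;
            else
                *output = keywords->value;
            return 1;
        }
    }
    return 0;
}

/* Each flag occupies its own bit, so values can be combined. */
static const keyword_t flag_keywords[] = {
    { "one",   1 << 0 },
    { "two",   1 << 1 },
    { "three", 1 << 2 },
    { NULL, 0 }
};
```

Feeding "one" and then "three" through process_keyword with OPT_KEYWORD_FLAGS leaves 5 in the output variable, which matches the "flags 5" line the updated test expects; with OPT_KEYWORD the last value simply wins.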



[PATCH v3 1/4] cli: Refactor option passing in the search command

2014-10-13 Thread Michal Sojka
Many functions that implement the search command need to access command
line options. Instead of passing each option in a separate variable, put
them in a structure and pass only this structure.

This will become handy in the following patches.
---
 notmuch-search.c | 122 ---
 1 file changed, 62 insertions(+), 60 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index bc9be45..5ac2a26 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -30,6 +30,16 @@ typedef enum {
 OUTPUT_TAGS
 } output_t;

+typedef struct {
+sprinter_t *format;
+notmuch_query_t *query;
+notmuch_sort_t sort;
+output_t output;
+int offset;
+int limit;
+int dupe;
+} search_options_t;
+
 /* Return two stable query strings that identify exactly the matched
  * and unmatched messages currently in thread.  If there are no
  * matched or unmatched messages, the returned buffers will be
@@ -70,46 +80,42 @@ get_thread_query (notmuch_thread_t *thread,
 }

 static int
-do_search_threads (sprinter_t *format,
-  notmuch_query_t *query,
-  notmuch_sort_t sort,
-  output_t output,
-  int offset,
-  int limit)
+do_search_threads (search_options_t *o)
 {
 notmuch_thread_t *thread;
 notmuch_threads_t *threads;
 notmuch_tags_t *tags;
+sprinter_t *format = o->format;
 time_t date;
 int i;

-if (offset < 0) {
-   offset += notmuch_query_count_threads (query);
-   if (offset < 0)
-   offset = 0;
+if (o->offset < 0) {
+   o->offset += notmuch_query_count_threads (o->query);
+   if (o->offset < 0)
+   o->offset = 0;
 }

-threads = notmuch_query_search_threads (query);
+threads = notmuch_query_search_threads (o->query);
 if (threads == NULL)
return 1;

 format->begin_list (format);

 for (i = 0;
-notmuch_threads_valid (threads) && (limit < 0 || i < offset + limit);
+notmuch_threads_valid (threads) && (o->limit < 0 || i < o->offset + 
o->limit);
 notmuch_threads_move_to_next (threads), i++)
 {
thread = notmuch_threads_get (threads);

-   if (i < offset) {
+   if (i < o->offset) {
notmuch_thread_destroy (thread);
continue;
}

-   if (output == OUTPUT_THREADS) {
+   if (o->output == OUTPUT_THREADS) {
format->set_prefix (format, "thread");
format->string (format,
-   notmuch_thread_get_thread_id (thread));
+  notmuch_thread_get_thread_id (thread));
format->separator (format);
} else { /* output == OUTPUT_SUMMARY */
void *ctx_quote = talloc_new (thread);
@@ -123,7 +129,7 @@ do_search_threads (sprinter_t *format,

format->begin_map (format);

-   if (sort == NOTMUCH_SORT_OLDEST_FIRST)
+   if (o->sort == NOTMUCH_SORT_OLDEST_FIRST)
date = notmuch_thread_get_oldest_date (thread);
else
date = notmuch_thread_get_newest_date (thread);
@@ -215,40 +221,36 @@ do_search_threads (sprinter_t *format,
 }

 static int
-do_search_messages (sprinter_t *format,
-   notmuch_query_t *query,
-   output_t output,
-   int offset,
-   int limit,
-   int dupe)
+do_search_messages (search_options_t *o)
 {
 notmuch_message_t *message;
 notmuch_messages_t *messages;
 notmuch_filenames_t *filenames;
+sprinter_t *format = o->format;
 int i;

-if (offset < 0) {
-   offset += notmuch_query_count_messages (query);
-   if (offset < 0)
-   offset = 0;
+if (o->offset < 0) {
+   o->offset += notmuch_query_count_messages (o->query);
+   if (o->offset < 0)
+   o->offset = 0;
 }

-messages = notmuch_query_search_messages (query);
+messages = notmuch_query_search_messages (o->query);
 if (messages == NULL)
return 1;

 format->begin_list (format);

 for (i = 0;
-notmuch_messages_valid (messages) && (limit < 0 || i < offset + limit);
+notmuch_messages_valid (messages) && (o->limit < 0 || i < o->offset + 
o->limit);
 notmuch_messages_move_to_next (messages), i++)
 {
-   if (i < offset)
+   if (i < o->offset)
continue;

message = notmuch_messages_get (messages);

-   if (output == OUTPUT_FILES) {
+   if (o->output == OUTPUT_FILES) {
int j;
filenames = notmuch_message_get_filenames (message);

@@ -256,7 +258,7 @@ do_search_messages (sprinter_t *format,
 notmuch_filenames_valid (filenames);
 notmuch_filenames_move_to_next (filenames), j++)
{
-   if (dupe < 0 || dupe == j) {
+   if (o->dupe < 0 || o->dupe == j) {
format->string (format, 
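
[Editor's sketch, not part of the patch.] The offset and limit handling that both do_search_threads and do_search_messages now read through the options structure can be modelled on its own. window_options_t, normalize_offset and count_printed are hypothetical names, and `total` stands in for the result of the notmuch_query_count_* call; this is only an illustration of the windowing arithmetic visible in the diff.

```c
/* Hypothetical reduced option set: only the two windowing fields. */
typedef struct {
    int offset;   /* negative values count from the end */
    int limit;    /* negative means "no limit" */
} window_options_t;

/* Turn a negative offset into an absolute one, clamping at zero,
 * in the same way the diff does with o->offset. */
static void
normalize_offset (window_options_t *o, int total)
{
    if (o->offset < 0) {
        o->offset += total;
        if (o->offset < 0)
            o->offset = 0;
    }
}

/* Count how many of `total` results the loop condition from the
 * diff would let through for the given options. */
static int
count_printed (window_options_t *o, int total)
{
    int i, printed = 0;

    normalize_offset (o, total);

    for (i = 0;
         i < total && (o->limit < 0 || i < o->offset + o->limit);
         i++) {
        if (i < o->offset)
            continue;   /* skipped, as in the original loops */
        printed++;
    }
    return printed;
}
```

Passing one pointer instead of six scalars also means the callee can normalize the offset in place, which is what the refactored functions do.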

[PATCH v3 0/4] notmuch search --output=sender/recipients

2014-10-13 Thread Michal Sojka
Hi,

this is a third version of my adaptation of Jani's patch series adding
--output=sender/recipients and related arguments to notmuch search.

The 1st patch is the same as in v2 (Marked as OK in
id:m24mvht4c4.fsf at guru.guru-group.fi).

The 2nd patch is not changed as well, but in v2 it was patch 3/4.

The 3rd patch is rewritten to use the "keyword flags" introduced in
patch 2 (requested by Tomi). The code is basically the same as in
id:1410021689-15901-1-git-send-email-jani at nikula.org, but tests are
added and shell completion is updated.

Finally, last patch adds --filter-by option that allows one to filter
out duplicate addresses. This option was called --unique in v2 and the
semantics are slightly different now. This resulted in simpler code.
The documentation was also reworked and is hopefully more
understandable now.

-Michal


Jani Nikula (1):
  cli: Add support for parsing multiple keyword arguments

Michal Sojka (3):
  cli: Refactor option passing in the search command
  cli: Extend the search command for --output={sender,recipients}
  cli: Add an option to filter our duplicate addresses

 command-line-arguments.c   |   6 +-
 command-line-arguments.h   |   1 +
 completion/notmuch-completion.bash |   8 +-
 completion/notmuch-completion.zsh  |   4 +-
 doc/man1/notmuch-search.rst|  54 ++-
 notmuch-search.c   | 309 +
 test/T090-search-output.sh |  64 
 test/T095-search-filter-by.sh  |  55 +++
 test/T410-argument-parsing.sh  |   3 +-
 test/arg-test.c|   9 ++
 10 files changed, 440 insertions(+), 73 deletions(-)
 create mode 100755 test/T095-search-filter-by.sh

-- 
2.1.1



[WIP PATCH 0/4] Add message revision tracking

2014-10-13 Thread Austin Clements
This implements message revision tracking.  This is definitely a
work-in-progress, but I wanted to post it since I don't know when I'll
be able to work on it next (and maybe someone else can run with it in
the mean time).  I think this makes all of the necessary library-side
changes, but doesn't do anything in the CLI to expose current revision
information other than adding support for a lastmod query.

This series applies on top of the ghost message series, but only
because of trivial conflicts in the lists of features in database.cc
and database-private.h.  There's no code dependency.

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[WIP PATCH 3/4] lib: API to retrieve database revision and UUID

2014-10-13 Thread Austin Clements
This exposes the committed database revision to library users along
with a UUID that can be used to detect when revision numbers are no
longer comparable (e.g., because the database has been replaced).
---
 lib/database-private.h |  1 +
 lib/database.cc| 11 +++
 lib/notmuch.h  | 18 ++
 3 files changed, 30 insertions(+)

diff --git a/lib/database-private.h b/lib/database-private.h
index 465065d..0977229 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -157,6 +157,7 @@ struct _notmuch_database {
  * under a higher revision number, which can be generated with
  * notmuch_database_new_revision. */
 unsigned long revision;
+const char *uuid;
 
 Xapian::QueryParser *query_parser;
 Xapian::TermGenerator *term_gen;
diff --git a/lib/database.cc b/lib/database.cc
index 45d32ab..9bec170 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -905,6 +905,8 @@ notmuch_database_open (const char *path,
notmuch->revision = 0;
else
notmuch->revision = Xapian::sortable_unserialise (last_mod);
+   notmuch->uuid = talloc_strdup (
+   notmuch, notmuch->xapian_db->get_uuid ().c_str ());
 
notmuch->query_parser = new Xapian::QueryParser;
notmuch->term_gen = new Xapian::TermGenerator;
@@ -1562,6 +1564,15 @@ DONE:
 return NOTMUCH_STATUS_SUCCESS;
 }
 
+unsigned long
+notmuch_database_get_revisison (notmuch_database_t *notmuch,
+   const char **uuid)
+{
+if (*uuid)
+   *uuid = notmuch->uuid;
+    return notmuch->revision;
+}
+
 /* We allow the user to use arbitrarily long paths for directories. But
  * we have a term-length limit. So if we exceed that, we'll use the
  * SHA-1 of the path for the database term.
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 92594b9..898f7b9 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -433,6 +433,24 @@ notmuch_status_t
 notmuch_database_end_atomic (notmuch_database_t *notmuch);
 
 /**
+ * Return the committed database revision and UUID.
+ *
+ * The database revision number increases monotonically with each
+ * commit to the database.  Hence, all messages and message changes
+ * committed to the database (that is, visible to readers) have a last
+ * modification revision <= the committed database revision.  Any
+ * messages committed in the future will be assigned a modification
+ * revision > the committed database revision.
+ *
+ * The UUID is a NUL-terminated opaque string that uniquely identifies
+ * this database.  Two revision numbers are only comparable if they
+ * have the same database UUID.
+ */
+unsigned long
+notmuch_database_get_revisison (notmuch_database_t *notmuch,
+   const char **uuid);
+
+/**
  * Retrieve a directory object from the database for 'path'.
  *
  * Here, 'path' should be a path relative to the path of 'database'
-- 
2.1.0
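
[Editor's sketch, not part of the patch.] The documentation comment says two revision numbers are only comparable when they come from the same database UUID. A client that stores a (UUID, revision) pair between runs might apply that rule roughly as follows. db_stamp_t, stamp_cmp_t and compare_stamps are hypothetical helper names, not part of libnotmuch, and treating a lower revision under the same UUID as incomparable is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record a client saves after each sync. */
typedef struct {
    const char *uuid;        /* opaque, NUL-terminated database UUID */
    unsigned long revision;  /* committed revision at that time */
} db_stamp_t;

typedef enum {
    STAMP_INCOMPARABLE,  /* different database: do a full resync */
    STAMP_UNCHANGED,     /* nothing committed since the saved stamp */
    STAMP_CHANGED        /* newer commits exist */
} stamp_cmp_t;

/* Compare a saved stamp with the current one.  Revisions are only
 * compared when both UUIDs are present and equal. */
static stamp_cmp_t
compare_stamps (const db_stamp_t *saved, const db_stamp_t *current)
{
    if (saved->uuid == NULL || current->uuid == NULL ||
        strcmp (saved->uuid, current->uuid) != 0)
        return STAMP_INCOMPARABLE;

    /* Revisions increase monotonically, so a lower current value
     * under the same UUID is unexpected; be conservative. */
    if (current->revision < saved->revision)
        return STAMP_INCOMPARABLE;

    return current->revision > saved->revision ? STAMP_CHANGED
                                               : STAMP_UNCHANGED;
}
```

The UUID strings in any caller would come from whatever the library returns; the sketch never inspects their contents beyond equality.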



[WIP PATCH 4/4] lib: Add lastmod: queries for filtering by last modification

2014-10-13 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

XXX Includes reference to notmuch search --db-revision, which doesn't
exist.
---
 doc/man7/notmuch-search-terms.rst | 8 
 lib/database-private.h| 1 +
 lib/database.cc   | 4 
 3 files changed, 13 insertions(+)

diff --git a/doc/man7/notmuch-search-terms.rst 
b/doc/man7/notmuch-search-terms.rst
index 1acdaa0..df76e39 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -52,6 +52,8 @@ indicate user-supplied values):
 
 -  date:since..until
 
+-  lastmod:since..until
+
 The **from:** prefix is used to match the name or address of the sender
 of an email message.
 
@@ -118,6 +120,12 @@ The time range can also be specified using timestamps with 
a syntax of:
 Each timestamp is a number representing the number of seconds since
 1970-01-01 00:00:00 UTC.
 
+The **lastmod:** prefix can be used to restrict the result by the
+database revision number of when messages were last modified (tags
+were added/removed or filenames changed).  This is usually used in
+conjunction with the **--db-revision** argument to **notmuch search**
+to find messages that have changed since an earlier query.
+
 In addition to individual terms, multiple terms can be combined with
 Boolean operators ( **and**, **or**, **not** , etc.). Each term in the
 query will be implicitly connected by a logical AND if no explicit
diff --git a/lib/database-private.h b/lib/database-private.h
index 0977229..cbca1de 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -163,6 +163,7 @@ struct _notmuch_database {
 Xapian::TermGenerator *term_gen;
 Xapian::ValueRangeProcessor *value_range_processor;
 Xapian::ValueRangeProcessor *date_range_processor;
+Xapian::ValueRangeProcessor *last_mod_range_processor;
 };
 
 /* Prior to database version 3, features were implied by the database
diff --git a/lib/database.cc b/lib/database.cc
index 9bec170..f9aa45d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -913,6 +913,7 @@ notmuch_database_open (const char *path,
notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor 
(NOTMUCH_VALUE_TIMESTAMP);
notmuch->date_range_processor = new ParseTimeValueRangeProcessor 
(NOTMUCH_VALUE_TIMESTAMP);
+   notmuch->last_mod_range_processor = new 
Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_LAST_MOD, "lastmod:");
 
notmuch->query_parser->set_default_op (Xapian::Query::OP_AND);
notmuch->query_parser->set_database (*notmuch->xapian_db);
@@ -920,6 +921,7 @@ notmuch_database_open (const char *path,
notmuch->query_parser->set_stemming_strategy 
(Xapian::QueryParser::STEM_SOME);
notmuch->query_parser->add_valuerangeprocessor 
(notmuch->value_range_processor);
notmuch->query_parser->add_valuerangeprocessor 
(notmuch->date_range_processor);
+   notmuch->query_parser->add_valuerangeprocessor 
(notmuch->last_mod_range_processor);
 
for (i = 0; i < ARRAY_SIZE (BOOLEAN_PREFIX_EXTERNAL); i++) {
prefix_t *prefix = &BOOLEAN_PREFIX_EXTERNAL[i];
@@ -991,6 +993,8 @@ notmuch_database_close (notmuch_database_t *notmuch)
 notmuch->value_range_processor = NULL;
 delete notmuch->date_range_processor;
 notmuch->date_range_processor = NULL;
+    delete notmuch->last_mod_range_processor;
+    notmuch->last_mod_range_processor = NULL;
 
 return status;
 }
-- 
2.1.0



[WIP PATCH 2/4] lib: Add per-message last modification tracking

2014-10-13 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This adds a new document value that stores the revision of the last
modification to message metadata, where the revision number increases
monotonically with each database commit.

An alternative would be to store the wall-clock time of the last
modification of each message.  In principle this is simpler and has
the advantage that any process can determine the current timestamp
without support from libnotmuch.  However, even assuming a computer's
clock never goes backward and ignoring clock skew in networked
environments, this has a fatal flaw.  Xapian uses (optimistic)
snapshot isolation, which means reads can be concurrent with writes.
Given this, consider the following time line with a write and two read
transactions:

   write  |-X-A--|
   read 1   |---B---|
   read 2  |---|

The write transaction modifies message X and records the wall-clock
time of the modification at A.  The writer hangs around for a while
and later commits its change.  Read 1 is concurrent with the write, so
it doesn't see the change to X.  It does some query and records the
wall-clock time of its results at B.  Transaction read 2 later starts
after the write commits and queries for changes since wall-clock time
B (say the reads are performing an incremental backup).  Even though
read 1 could not see the change to X, read 2 is told (correctly) that
X has not changed since B, the time of the last read.  In fact, X
changed before wall-clock time A, but the change was not visible until
*after* wall-clock time B, so read 2 misses the change to X.

This is tricky to solve in full-blown snapshot isolation, but because
Xapian serializes writes, we can use a simple, monotonically
increasing database revision number.  Furthermore, maintaining this
revision number requires no more IO than a wall-clock time solution
because Xapian already maintains statistics on the upper (and lower)
bound of each value stream.
---
 lib/database-private.h | 15 ++-
 lib/database.cc| 49 +++--
 lib/message.cc | 22 ++
 lib/notmuch-private.h  | 10 +-
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index 15e03cc..465065d 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -92,6 +92,12 @@ enum _notmuch_features {
  *
  * Introduced: version 3. */
 NOTMUCH_FEATURE_GHOSTS = 1 << 4,
+
+/* If set, messages store the revision number of the last
+ * modification in NOTMUCH_VALUE_LAST_MOD.
+ *
+ * Introduced: version 3. */
+NOTMUCH_FEATURE_LAST_MOD = 1 << 5,
 };
 
 /* In C++, a named enum is its own type, so define bitwise operators
@@ -137,6 +143,8 @@ struct _notmuch_database {
 
 notmuch_database_mode_t mode;
 int atomic_nesting;
+/* TRUE if changes have been made in this atomic section */
+notmuch_bool_t atomic_dirty;
 Xapian::Database *xapian_db;
 
 /* Bit mask of features used by this database.  This is a
@@ -145,6 +153,10 @@ struct _notmuch_database {
 
 unsigned int last_doc_id;
 uint64_t last_thread_id;
+/* Highest committed revision number.  Modifications are recorded
+ * under a higher revision number, which can be generated with
+ * notmuch_database_new_revision. */
+unsigned long revision;
 
 Xapian::QueryParser *query_parser;
 Xapian::TermGenerator *term_gen;
@@ -166,7 +178,8 @@ struct _notmuch_database {
  * databases will have it). */
 #define NOTMUCH_FEATURES_CURRENT \
 (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
- NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)
+ NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS | \
+ NOTMUCH_FEATURE_LAST_MOD)
 
 /* Return the list of terms from the given iterator matching a prefix.
  * The prefix will be stripped from the strings in the returned list.
diff --git a/lib/database.cc b/lib/database.cc
index 6e51a72..45d32ab 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -101,6 +101,9 @@ typedef struct {
  *
  * SUBJECT:The value of the Subject header
  *
+ * LAST_MOD:   The revision number as of the last tag or
+ * filename change.
+ *
  * In addition, terms from the content of the message are added with
  * from, to, attachment, and subject prefixes for use by the
  * user in searching. Similarly, terms from the path of the mail
@@ -304,6 +307,8 @@ static const struct {
   "exact folder:/path: search", "rw" },
 { NOTMUCH_FEATURE_GHOSTS,
   "mail documents for missing messages", "w"},
+{ NOTMUCH_FEATURE_LAST_MOD,
+  "modification tracking", "w"},
 };
 
 const char *
@@ -678,6 +683,23 @@ _notmuch_database_ensure_writable (notmuch_database_t 
*notmuch)
 return NOTMUCH_STATUS_SUCCESS;
 }
 
+/* Allocate a revision number for the next change. */
+unsigned long
+_notmuch_database_new_revision 
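
[Editor's sketch, not part of the patch.] The time line in the commit message can be reduced to a few comparisons. The function names and the concrete numbers a caller would pass are invented for illustration; the only thing taken from the message is the ordering of events (X stamped at A, read 1 records B after A, the write commits after B, read 2 asks for changes since what read 1 recorded).

```c
/* A reader asking "what changed since my last sync?" keeps every
 * message whose stamp is strictly greater than the stamp it
 * recorded last time. */
static int
changed_since (unsigned long stamp, unsigned long since)
{
    return stamp > since;
}

/* Wall-clock scheme: X is stamped at time_a while the write is
 * still open; read 1 records time_b (later than time_a) without
 * seeing X; read 2 asks for changes since time_b. */
static int
wall_clock_sees_change (unsigned long time_a, unsigned long time_b)
{
    return changed_since (time_a, time_b);
}

/* Revision scheme: read 1 records the committed revision; the open
 * write stamps X with the next revision, which only becomes the
 * committed revision once the write commits. */
static int
revision_sees_change (unsigned long committed_at_read_1)
{
    unsigned long stamp_x = committed_at_read_1 + 1;

    return changed_since (stamp_x, committed_at_read_1);
}
```

With any A earlier than B the wall-clock variant reports no change, which is the missed update described above, while the revision variant always reports the change because the stamp is allocated above the last committed revision.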

[WIP PATCH 1/4] lib: Only sync modified message documents

2014-10-13 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

Previously, we updated the database copy of a message on every call to
_notmuch_message_sync, even if nothing had changed.  In particular,
this always happens on a thaw, so a freeze/thaw pair with no
modifications between still caused a database update.

We only modify message documents in a handful of places, so keep track
of whether the document has been modified and only sync it when
necessary.  This will be particularly important when we add message
revision tracking.
---
 lib/message.cc | 12 
 1 file changed, 12 insertions(+)

diff --git a/lib/message.cc b/lib/message.cc
index a7a13cc..cf2fd7c 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -43,6 +43,9 @@ struct visible _notmuch_message {
  * if each flag has been initialized. */
 unsigned long lazy_flags;
 
+/* Message document modified since last sync */
+notmuch_bool_t modified;
+
 Xapian::Document doc;
 Xapian::termcount termpos;
 };
@@ -538,6 +541,7 @@ _notmuch_message_remove_terms (notmuch_message_t *message, 
const char *prefix)
 
try {
message->doc.remove_term ((*i));
+   message->modified = TRUE;
} catch (const Xapian::InvalidArgumentError) {
/* Ignore failure to remove non-existent term. */
}
@@ -791,6 +795,7 @@ void
 _notmuch_message_clear_data (notmuch_message_t *message)
 {
 message->doc.set_data ("");
+message->modified = TRUE;
 }
 
 static void
@@ -988,6 +993,7 @@ _notmuch_message_set_header_values (notmuch_message_t 
*message,
Xapian::sortable_serialise (time_value));
 message->doc.add_value (NOTMUCH_VALUE_FROM, from);
 message->doc.add_value (NOTMUCH_VALUE_SUBJECT, subject);
+message->modified = TRUE;
 }
 
 /* Synchronize changes made to message->doc out into the database. */
@@ -999,8 +1005,12 @@ _notmuch_message_sync (notmuch_message_t *message)
 if (message->notmuch->mode == NOTMUCH_DATABASE_MODE_READ_ONLY)
return;
 
+if (! message->modified)
+   return;
+
 db = static_cast <Xapian::WritableDatabase *> 
(message->notmuch->xapian_db);
 db->replace_document (message->doc_id, message->doc);
+message->modified = FALSE;
 }
 
 /* Delete a message document from the database. */
@@ -1075,6 +1085,7 @@ _notmuch_message_add_term (notmuch_message_t *message,
return NOTMUCH_PRIVATE_STATUS_TERM_TOO_LONG;
 
 message->doc.add_term (term, 0);
+message->modified = TRUE;
 
 talloc_free (term);
 
@@ -1143,6 +1154,7 @@ _notmuch_message_remove_term (notmuch_message_t *message,
 
 try {
message->doc.remove_term (term);
+   message->modified = TRUE;
 } catch (const Xapian::InvalidArgumentError) {
/* We'll let the philosopher's try to wrestle with the
 * question of whether failing to remove that which was not
-- 
2.1.0
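
[Editor's sketch, not part of the patch.] The "only sync when modified" idea is a plain dirty-flag pattern. The model below uses a hypothetical message_t with a counter in place of the real Xapian write, so the effect of a freeze/thaw pair without modifications can be checked; none of these names are the libnotmuch ones.

```c
/* Hypothetical stand-in for a message object. */
typedef struct {
    int modified;    /* set whenever the in-memory document changes */
    int sync_count;  /* number of simulated database writes */
} message_t;

/* Any mutation of the document marks the message dirty. */
static void
message_add_term (message_t *m)
{
    m->modified = 1;
}

/* Write the document back only if something changed, then clear
 * the flag so repeated syncs stay free. */
static void
message_sync (message_t *m)
{
    if (! m->modified)
        return;

    m->sync_count++;   /* stands in for replace_document */
    m->modified = 0;
}
```

Calling message_sync twice after a single message_add_term performs one simulated write, and calling it on an untouched message performs none.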



Re: VIM: search_refresh limits message count to 2 * window.height

2014-10-13 Thread Franz Fellner
The issue is that VIM::Buffer.render yields itself BEFORE it clears
itself.
Two quick solutions:

1) Simply manually fixup the mess in StagedRender::initialize after
@b.render {do_next } by adding
@last_render = @b.count

2) First clear the VIM::Buffer before yielding. This exposes one issue in
Vim's buffer handling: a newly created buffer has count==0, but after
the first line has been added you cannot get count==0 again, so a refresh
currently ends up with an empty line at the beginning.
It is possible to get the empty line at the end instead by implementing
VIM::Buffer.<<() as append(count()-1, arg).
Of course, one then has to add one line directly after creating a new
buffer.

Solution 1) would be a simple oneliner but IMHO looks a little bit hacky
;)
Solution 2) at first looks ugly because of the empty line at the
end/beginning. But it also adds the opportunity to print additional
information, like description of the columns (date, thread participants,
subject, ...) at the beginning, or something like end of search list,
end of thread at the end of the buffers.

Please tell me which one you like most and I can send a patch.


Regards
Franz

On Fri, 10 Oct 2014 17:56:23 +0200, Franz Fellner alpine.art...@gmail.com 
wrote:
> The reason is that StagedRender.is_ready depends on last_render, which
> get's set to VIM::Buffer.count() in StagedRender::do_next.
> I do not (yet) know what exactly happens, but after the first call to
> search refresh last_render never get's less than 2*2*window.height.
> That means once you do search_refresh StagedRender never will be ready -
> is_ready returns false, so s:show_cursor_moved never will advance the
> StagedRender.
> 
> I am trying to understand the code, but it's a hard time for me ;)


Re: [PATCH v3 3/4] cli: Extend the search command for --output={sender, recipients}

2014-10-13 Thread Tomi Ollila
On Mon, Oct 13 2014, Michal Sojka sojk...@fel.cvut.cz wrote:

 The new outputs allow printing senders, recipients or both of matching
 messages. The --output option is converted from "keyword" argument to
 "flags" argument, which means that the user can use --output=sender and
 --output=recipients simultaneously, to print both. Other combinations
 produce an error.

 This code based on a patch from Jani Nikula.
 ---
  completion/notmuch-completion.bash |   2 +-
  completion/notmuch-completion.zsh  |   3 +-
  doc/man1/notmuch-search.rst|  22 +++-
  notmuch-search.c   | 110 
 ++---
  test/T090-search-output.sh |  64 +
  5 files changed, 189 insertions(+), 12 deletions(-)

 diff --git a/completion/notmuch-completion.bash 
 b/completion/notmuch-completion.bash
 index 0571dc9..cfbd389 100644
 --- a/completion/notmuch-completion.bash
 +++ b/completion/notmuch-completion.bash
 @@ -294,7 +294,7 @@ _notmuch_search()
   return
   ;;
   --output)
  - COMPREPLY=( $( compgen -W "summary threads messages files tags" -- 
 "${cur}" ) )
  + COMPREPLY=( $( compgen -W "summary threads messages files tags 
 sender recipients" -- "${cur}" ) )
   return
   ;;
   --sort)
 diff --git a/completion/notmuch-completion.zsh 
 b/completion/notmuch-completion.zsh
 index 67a9aba..3e52a00 100644
 --- a/completion/notmuch-completion.zsh
 +++ b/completion/notmuch-completion.zsh
 @@ -52,7 +52,8 @@ _notmuch_search()
_arguments -s : \
  '--max-threads=[display only the first x threads from the search 
 results]:number of threads to show: ' \
  '--first=[omit the first x threads from the search results]:number of 
 threads to omit: ' \
 -'--sort=[sort results]:sorting:((newest-first\:"reverse chronological 
 order" oldest-first\:"chronological order"))'
 +'--sort=[sort results]:sorting:((newest-first\:"reverse chronological 
 order" oldest-first\:"chronological order"))' \
 +'--output=[select what to output]:output:((summary threads messages 
 files tags sender recipients))'
  }
  
  _notmuch()
 diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
 index 90160f2..c9d38b1 100644
 --- a/doc/man1/notmuch-search.rst
 +++ b/doc/man1/notmuch-search.rst
 @@ -35,7 +35,7 @@ Supported options for **search** include
  intended for programs that invoke **notmuch(1)** internally. If
  omitted, the latest supported version will be used.
  
 -``--output=(summary|threads|messages|files|tags)``
 +``--output=(summary|threads|messages|files|tags|sender|recipients)``
  
  **summary**
  Output a summary of each thread with any message matching
 @@ -78,6 +78,26 @@ Supported options for **search** include
  by null characters (--format=text0), as a JSON array
  (--format=json), or as an S-Expression list (--format=sexp).
  
 + **sender**
 +Output all addresses from the *From* header that appear on
 +any message matching the search terms, either one per line
 +(--format=text), separated by null characters
 +(--format=text0), as a JSON array (--format=json), or as
 +an S-Expression list (--format=sexp).
 +
 + Note: Searching for **sender** should be much faster than
 + searching for **recipients**, because sender addresses are
 + cached directly in the database whereas other addresses
 + need to be fetched from message files.
 +
 + **recipients**
 +Like **sender** but for addresses from *To*, *Cc* and
 + *Bcc* headers.
 +
 + This option can be given multiple times to combine different
 + outputs. Curently, this is only supported for **sender** and
 + **recipients** outputs.
 +
  ``--sort=``\ (**newest-first**\ \|\ **oldest-first**)
  This option can be used to present results in either
  chronological order (**oldest-first**) or reverse chronological
 diff --git a/notmuch-search.c b/notmuch-search.c
 index 5ac2a26..74588f8 100644
 --- a/notmuch-search.c
 +++ b/notmuch-search.c
 @@ -23,11 +23,14 @@
 #include "string-util.h"
  
  typedef enum {
 -OUTPUT_SUMMARY,
 -OUTPUT_THREADS,
 -OUTPUT_MESSAGES,
 -OUTPUT_FILES,
 -OUTPUT_TAGS
 +OUTPUT_SUMMARY   = 1 << 0,
 +OUTPUT_THREADS   = 1 << 1,
 +OUTPUT_MESSAGES  = 1 << 2,
 +OUTPUT_FILES = 1 << 3,
 +OUTPUT_TAGS  = 1 << 4,
 +OUTPUT_SENDER= 1 << 5,
 +OUTPUT_RECIPIENTS= 1 << 6,
 +OUTPUT_ADDRESSES = OUTPUT_SENDER | OUTPUT_RECIPIENTS,

leftover, like mentioned below (this comment added just before sending)

  } output_t;
  
  typedef struct {
 @@ -220,6 +223,67 @@ do_search_threads (search_options_t *o)
  return 0;
  }
  
 +static void
 +print_address_list (const search_options_t *o, InternetAddressList *list)
 +{
 +InternetAddress *address;
 +int i;
 +
 +for (i = 0; i