[PATCH v7 1/3] Add support for structured output formatters.
From: This patch adds a new struct type sprinter_t, which is used for structured formatting, e.g. JSON or S-Expressions. The structure printer is heavily based on code from Austin Clements (id:87d34hsdx8@awakening.csail.mit.edu). It includes the following functions: /* Start a new map/dictionary structure. This should be followed by * a sequence of alternating calls to map_key and one of the * value-printing functions until the map is ended by end. */ void (*begin_map) (struct sprinter *); /* Start a new list/array structure. */ void (*begin_list) (struct sprinter *); /* End the last opened list or map structure. */ void (*end) (struct sprinter *); /* Print one string/integer/boolean/null element (possibly inside a * list or map, followed or preceded by separators). * For string, the char * must be UTF-8 encoded. */ void (*string) (struct sprinter *, const char *); void (*integer) (struct sprinter *, int); void (*boolean) (struct sprinter *, notmuch_bool_t); void (*null) (struct sprinter *); /* Print the key of a map's key/value pair. The char * must be UTF-8 * encoded. */ void (*map_key) (struct sprinter *, const char *); /* Insert a separator (usually extra whitespace) for improved * readability without affecting the abstract syntax of the * structure being printed. * For JSON, this could simply be a line break. */ void (*separator) (struct sprinter *); /* Set the current string prefix. This only affects the text * printer, which will print this string, followed by a colon, * before any string. For other printers, this does nothing. */ void (*set_prefix) (struct sprinter *, const char *); To support the plain text format properly, the following additional function must also be implemented: /* Set the current string prefix. This only affects the text * printer, which will print this string, followed by a colon, * before any string. For other printers, this does nothing. */ void (*set_prefix) (struct sprinter *, const char *); The structure also contains a flag that should be set to FALSE in all custom printers and to TRUE in the plain text formatter. /* True if this is the special-cased plain text printer. */ notmuch_bool_t is_text_printer; The printer can (and should) use internal state to insert delimiters and syntax at the correct places. Example: format->begin_map(format); format->map_key(format, "foo"); format->begin_list(format); format->integer(format, 1); format->integer(format, 2); format->integer(format, 3); format->end(format); format->map_key(format, "bar"); format->begin_map(format); format->map_key(format, "baaz"); format->string(format, "hello world"); format->end(format); format->end(format); would output JSON as follows: {"foo": [1, 2, 3], "bar": { "baaz": "hello world"}} --- sprinter.h | 58 ++ 1 file changed, 58 insertions(+) create mode 100644 sprinter.h diff --git a/sprinter.h b/sprinter.h new file mode 100644 index 000..77dc26f --- /dev/null +++ b/sprinter.h @@ -0,0 +1,58 @@ +#ifndef NOTMUCH_SPRINTER_H +#define NOTMUCH_SPRINTER_H + +/* Necessary for notmuch_bool_t */ +#include "notmuch-client.h" + +/* Structure printer interface. This is used to create output + * structured as maps (with key/value pairs), lists and primitives + * (strings, integers and booleans). + */ +typedef struct sprinter { +/* Start a new map/dictionary structure. This should be followed by + * a sequence of alternating calls to map_key and one of the + * value-printing functions until the map is ended by end. + */ +void (*begin_map) (struct sprinter *); + +/* Start a new list/array structure. + */ +void (*begin_list) (struct sprinter *); + +/* End the last opened list or map structure. + */ +void (*end) (struct sprinter *); + +/* Print one string/integer/boolean/null element (possibly inside a + * list or map, followed or preceded by separators). + * For string, the char * must be UTF-8 encoded. + */ +void (*string) (struct sprinter *, const char *); +void (*integer) (struct sprinter *, int); +void (*boolean) (struct sprinter *, notmuch_bool_t); +void (*null) (struct sprinter *); + +/* Print the key of a map's key/value pair. The char * must be UTF-8 + * encoded. + */ +void (*map_key) (struct sprinter *, const char *); + +/* Insert a separator (usually extra whitespace) for improved + * readability without affecting the abstract syntax of the + * structure being printed. + * For JSON, this could simply be a line break. + */ +void (*separator) (struct sprinter *); + +/* Set the current string prefix. This only affects the text + * printer, which will print this string, followed by a colon, + * before any string. For
[PATCH v7 2/3] Add structured output formatter for JSON and plain text (but don't use them yet).
From: Using the new structured printer support in sprinter.h, implement sprinter_json_create, which returns a new JSON structured output formatter. The formatter prints output similar to the existing JSON, but with differences in whitespace (mostly newlines, --output=summary prints the entire message summary on one line, not split across multiple lines). Also implement a "structured" formatter for plain text that prints prefixed strings, to be used with notmuch-search.c plain text output. --- Makefile.local | 2 + sprinter-json.c | 185 sprinter-text.c | 126 ++ sprinter.h | 10 +++ 4 files changed, 323 insertions(+) create mode 100644 sprinter-json.c create mode 100644 sprinter-text.c diff --git a/Makefile.local b/Makefile.local index a890df2..296995d 100644 --- a/Makefile.local +++ b/Makefile.local @@ -290,6 +290,8 @@ notmuch_client_srcs = \ notmuch-show.c \ notmuch-tag.c \ notmuch-time.c \ + sprinter-json.c \ + sprinter-text.c \ query-string.c \ mime-node.c \ crypto.c\ diff --git a/sprinter-json.c b/sprinter-json.c new file mode 100644 index 000..32daa5a --- /dev/null +++ b/sprinter-json.c @@ -0,0 +1,185 @@ +#include +#include +#include +#include "sprinter.h" + +struct sprinter_json { +struct sprinter vtable; +FILE *stream; +/* Top of the state stack, or NULL if the printer is not currently + * inside any aggregate types. */ +struct json_state *state; + +/* A flag to signify that a separator should be inserted in the + * output as soon as possible. + */ +notmuch_bool_t insert_separator; +}; + +struct json_state { +struct json_state *parent; +/* True if nothing has been printed in this aggregate yet. + * Suppresses the comma before a value. */ +notmuch_bool_t first; +/* The character that closes the current aggregate. */ +char close; +}; + +/* Helper function to set up the stream to print a value. If this + * value follows another value, prints a comma. */ +static struct sprinter_json * +json_begin_value (struct sprinter *sp) +{ +struct sprinter_json *spj = (struct sprinter_json *) sp; + +if (spj->state) { + if (! spj->state->first) { + fputc (',', spj->stream); + if (spj->insert_separator) { + fputc ('\n', spj->stream); + spj->insert_separator = FALSE; + } else + fputc (' ', spj->stream); + } else + spj->state->first = FALSE; +} +return spj; +} + +/* Helper function to begin an aggregate type. Prints the open + * character and pushes a new state frame. */ +static void +json_begin_aggregate (struct sprinter *sp, char open, char close) +{ +struct sprinter_json *spj = json_begin_value (sp); +struct json_state *state = talloc (spj, struct json_state); + +fputc (open, spj->stream); +state->parent = spj->state; +state->first = TRUE; +state->close = close; +spj->state = state; +} + +static void +json_begin_map (struct sprinter *sp) +{ +json_begin_aggregate (sp, '{', '}'); +} + +static void +json_begin_list (struct sprinter *sp) +{ +json_begin_aggregate (sp, '[', ']'); +} + +static void +json_end (struct sprinter *sp) +{ +struct sprinter_json *spj = (struct sprinter_json *) sp; +struct json_state *state = spj->state; + +fputc (spj->state->close, spj->stream); +spj->state = state->parent; +talloc_free (state); +if (spj->state == NULL) + fputc ('\n', spj->stream); +} + +static void +json_string (struct sprinter *sp, const char *val) +{ +static const char *const escapes[] = { + ['\"'] = "\\\"", ['\\'] = "", ['\b'] = "\\b", + ['\f'] = "\\f", ['\n'] = "\\n", ['\t'] = "\\t" +}; +struct sprinter_json *spj = json_begin_value (sp); + +fputc ('"', spj->stream); +for (; *val; ++val) { + unsigned char ch = *val; + if (ch < ARRAY_SIZE (escapes) && escapes[ch]) + fputs (escapes[ch], spj->stream); + else if (ch >= 32) + fputc (ch, spj->stream); + else + fprintf (spj->stream, "\\u%04x", ch); +} +fputc ('"', spj->stream); +} + +static void +json_integer (struct sprinter *sp, int val) +{ +struct sprinter_json *spj = json_begin_value (sp); + +fprintf (spj->stream, "%d", val); +} + +static void +json_boolean (struct sprinter *sp, notmuch_bool_t val) +{ +struct sprinter_json *spj = json_begin_value (sp); + +fputs (val ? "true" : "false", spj->stream); +} + +static void +json_null (struct sprinter *sp) +{ +struct sprinter_json *spj = json_begin_value (sp); + +fputs ("null", spj->stream); +} + +static void +json_map_key (struct sprinter *sp, const char *key) +{ +struct sprinter_json *spj = (struct sprinter_js
[PATCH v7 3/3] Use the structured formatters in notmuch-search.c.
From: This patch switches from the current ad-hoc printer to the structured formatters in sprinter.h, sprinter-text.c and sprinter-json.c. The JSON tests are changed slightly in order to make them PASS for the new structured output formatter. The text tests pass without adaptation. --- notmuch-search.c | 301 --- test/json| 34 --- 2 files changed, 103 insertions(+), 232 deletions(-) diff --git a/notmuch-search.c b/notmuch-search.c index 3be296d..07211e8 100644 --- a/notmuch-search.c +++ b/notmuch-search.c @@ -19,6 +19,7 @@ */ #include "notmuch-client.h" +#include "sprinter.h" typedef enum { OUTPUT_SUMMARY, @@ -28,92 +29,6 @@ typedef enum { OUTPUT_TAGS } output_t; -typedef struct search_format { -const char *results_start; -const char *item_start; -void (*item_id) (const void *ctx, -const char *item_type, -const char *item_id); -void (*thread_summary) (const void *ctx, - const char *thread_id, - const time_t date, - const int matched, - const int total, - const char *authors, - const char *subject); -const char *tag_start; -const char *tag; -const char *tag_sep; -const char *tag_end; -const char *item_sep; -const char *item_end; -const char *results_end; -const char *results_null; -} search_format_t; - -static void -format_item_id_text (const void *ctx, -const char *item_type, -const char *item_id); - -static void -format_thread_text (const void *ctx, - const char *thread_id, - const time_t date, - const int matched, - const int total, - const char *authors, - const char *subject); -static const search_format_t format_text = { -"", - "", - format_item_id_text, - format_thread_text, - " (", - "%s", " ", - ")", "\n", - "", -"\n", -"", -}; - -static void -format_item_id_json (const void *ctx, -const char *item_type, -const char *item_id); - -static void -format_thread_json (const void *ctx, - const char *thread_id, - const time_t date, - const int matched, - const int total, - const char *authors, - const char *subject); - -/* Any changes to the JSON format should be reflected in the file - * devel/schemata. */ -static const search_format_t format_json = { -"[", - "{", - format_item_id_json, - format_thread_json, - "\"tags\": [", - "\"%s\"", ", ", - "]", ",\n", - "}", -"]\n", -"]\n", -}; - -static void -format_item_id_text (unused (const void *ctx), -const char *item_type, -const char *item_id) -{ -printf ("%s%s", item_type, item_id); -} - static char * sanitize_string (const void *ctx, const char *str) { @@ -131,72 +46,8 @@ sanitize_string (const void *ctx, const char *str) return out; } -static void -format_thread_text (const void *ctx, - const char *thread_id, - const time_t date, - const int matched, - const int total, - const char *authors, - const char *subject) -{ -void *ctx_quote = talloc_new (ctx); - -printf ("thread:%s %12s [%d/%d] %s; %s", - thread_id, - notmuch_time_relative_date (ctx, date), - matched, - total, - sanitize_string (ctx_quote, authors), - sanitize_string (ctx_quote, subject)); - -talloc_free (ctx_quote); -} - -static void -format_item_id_json (const void *ctx, -unused (const char *item_type), -const char *item_id) -{ -void *ctx_quote = talloc_new (ctx); - -printf ("%s", json_quote_str (ctx_quote, item_id)); - -talloc_free (ctx_quote); - -} - -static void -format_thread_json (const void *ctx, - const char *thread_id, - const time_t date, - const int matched, - const int total, - const char *authors, - const char *subject) -{ -void *ctx_quote = talloc_new (ctx); - -printf ("\"thread\": %s,\n" - "\"timestamp\": %ld,\n" - "\"date_relative\": \"%s\",\n" - "\"matched\": %d,\n" - "\"total\": %d,\n" - "\"authors\": %s,\n" - "\"subject\": %s,\n", - json_quote_str (ctx_quote, thread_id), - date, - notmuch_time_relative_date (ctx, da
notmuch-reply: Structured Formatters
Currently there is no easy way to add support for different structured formatters (like JSON). For example, adding support for S-Expressions would result in code duplication. This patch series amends the situation by introducing structured formatters, which allow different implementations of structures like lists, maps, strings and numbers. The new code in sprinter.h and sprinter-json.c can be used instead of the current ad-hoc output in all parts of notmuch, a patch for notmuch-search.c is included. In a later patch, all other parts of notmuch should be adapted to the structured formatters, and the creation of formatters should be centralised (to make adding new formatters easier). A "structured" formatter is provided for notmuch-search that prints the current text format. This removes almost all the special-casing from notmuch-search.c. Overall diff --stat: Makefile.local | 2 + notmuch-search.c | 301 +-- sprinter-json.c | 185 + sprinter-text.c | 126 sprinter.h | 68 +++ test/json| 34 +++--- 6 files changed, 484 insertions(+), 232 deletions(-) Changes versus v6 of this patch: - is_text_printer is now a field, not a function. - minor formatting - sprinter_text_search has been renamed to sprinter_text (as it contains no search-specific code). - string sanitization removed from sprinter_text, the caller should sanitize the strings. ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 3/7] go: Allow notmuch objects to be garbage collected
Quoth Adrien Bustany on Jul 19 at 9:25 pm: > Le 18/07/2012 23:40, Austin Clements a ?crit : > >This is subtle enough that I think it deserves a comment in the source > >code explaining that tracking the talloc owner reference, combined > >with the fact that Go finalizers are run in dependency order, ensures > >that the C objects will always be destroyed from the talloc leaves up. > > Definitely, I tend to comment in the commit message and forget about > the code... > > > > >Just one inline comment below. Otherwise, I think this is all > >correct. > > Agree with the comment, the Database should be the parent. I guess I > wasn't sure of the talloc parenting. > > > > >Is reproducing the talloc hierarchy in all of the bindings really the > >right approach? I feel like there has to be a better way (or that the > >way we use talloc in the library is slightly broken). What if the > >bindings created an additional talloc reference to each managed > >object, just to keep the object alive, and used talloc_unlink instead > >of the destroy functions? > > Reproducing the hierarchy is probably error prone, and not that > simple indeed :/ > I haven't checked at all the way you suggest, but if we use > talloc_reference/unlink, we get the same issue no? > - If we do for each new wrapped object talloc_reference(NULL, > wrapped_object), the the object will be kept alive until we > talloc_unlink(NULL, wrapped_object), but what about its parents? For > example will doing that on a notmuch_message_t keep the > notmuch_messages_t alive? Hmm. This is what I was thinking. You have an interesting point; I think it's slightly wrong, but it exposes something deeper. I believe there are two different things going on here: some of the talloc relationships are for convenience, while some are structural. In the former case, I'm pretty sure my suggestion will work, but in the latter case the objects should *never* be freed by the finalizer! For example, notmuch_query_search_messages returns a new notmuch_messages_t with the query as the talloc parent, but that notmuch_messages_t doesn't depend on the query object; this is just so you can conveniently delete everything retrieved from the query by deleting the query. In this case, you can either use parent references like you did---which will prevent a double-free by forcing destruction to happen from the leaves up but at the cost of having to encode these relationships and of extending the parent object lifetimes beyond what's strictly necessary---or you can use my suggestion of creating an additional talloc reference. However, in your example, the notmuch_message_t's are structurally related to the notmuch_messages_t from whence they came. They're all part of one data structure and hence it *never* makes sense for a caller to delete the notmuch_message_t's. For example, even with the code in this patch, I think the following could lead to a crash: 1. Obtain a Messages object, say ms. 2. m1 := ms.Get() 3. m1 = nil 4. m2 := ms.Get() 5. m2.whatever() If a garbage collection happens between steps 3 and 4, the Message in m1 will get finalized and destroyed. But step 4 will return the same, now dangling, pointer, leading to a potential crash in step 5. Maybe the answer in the structural case is to include the parent pointer in the Go struct and not set a finalizer on the child? That way, if there's a Go reference to the parent wrapper, it won't go away and the children won't get destroyed (collecting wrappers of children is fine) and if there's a Go reference to the child wrapper, it will keep the parent alive so it won't get destroyed and neither will the child. > - If we do talloc_reference(parent, wrapped), then we reproduce the > hierarchy again? > > Note that I have 0 experience with talloc, so I might as well be > getting things wrong here. > > > > >Quoth Adrien Bustany on Jul 18 at 9:34 pm: > >>This makes notmuch appropriately free the underlying notmuch C objects > >>when garbage collecting their Go wrappers. To make sure we don't break > >>the underlying links between objects (for example, a notmuch_messages_t > >>being GC'ed before a notmuch_message_t belonging to it), we add for each > >>wraper struct a pointer to the owner object (Go objects with a reference > >>pointing to them don't get garbage collected). > >>--- > >> bindings/go/src/notmuch/notmuch.go | 153 > >> +++- > >> 1 files changed, 134 insertions(+), 19 deletions(-) > >> > >>diff --git a/bindings/go/src/notmuch/notmuch.go > >>b/bindings/go/src/notmuch/notmuch.go > >>index 1d77fd2..3f436a0 100644 > >>--- a/bindings/go/src/notmuch/notmuch.go > >>+++ b/bindings/go/src/notmuch/notmuch.go > >>@@ -11,6 +11,7 @@ package notmuch > >> #include "notmuch.h" > >> */ > >> import "C" > >>+import "runtime" > >> import "unsafe" > >> > >> // Status codes used for the return values of most functions > >>@@ -47,40 +48,152 @@ func (self Status) String() string { > >> /* V
[PATCH 2/2] Add notmuch_database_reopen method
Calling notmuch_database_reopen is needed to refresh the database contents when the database on disk was modified by another notmuch_database_t instance, for example in a different thread. --- lib/database.cc | 17 + lib/notmuch.h |8 2 files changed, 25 insertions(+), 0 deletions(-) diff --git a/lib/database.cc b/lib/database.cc index 55bcd17..3be5a30 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -763,6 +763,23 @@ notmuch_database_flush(notmuch_database_t *notmuch) return status; } +notmuch_status_t +notmuch_database_reopen(notmuch_database_t *notmuch) +{ + notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; + + try { + if (notmuch->xapian_db != NULL) + (notmuch->xapian_db)->reopen (); + } catch (const Xapian::Error &error) { + fprintf(stderr, "A Xapian exception occured reopening the database: %s\n", + error.get_msg().c_str()); + status = NOTMUCH_STATUS_XAPIAN_EXCEPTION; + } + + return status; +} + void notmuch_database_close (notmuch_database_t *notmuch) { diff --git a/lib/notmuch.h b/lib/notmuch.h index aef5c56..51d6a9a 100644 --- a/lib/notmuch.h +++ b/lib/notmuch.h @@ -205,6 +205,14 @@ notmuch_database_open (const char *path, notmuch_status_t notmuch_database_flush (notmuch_database_t *database); +/* Refresh the database contents to the latest version. + * + * This is needed only if another instance of notmuch_database_t has + * modified the database contents on disk. + */ +notmuch_status_t +notmuch_database_reopen (notmuch_database_t *database); + /* Close the given notmuch database. * * After notmuch_database_close has been called, calls to other -- 1.7.7.6
[PATCH 1/2] Add notmuch_database_flush method
This method explicitly flushes the pending modifications to disk. It is useful if your program has various threads, each with a read only DB and one writer thread with a read/write DB. In that case, you most likely want the writer to sync the changes to disk so that the readers can see them, without having to close and reopen the database completely. --- lib/database.cc | 18 ++ lib/notmuch.h |4 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/lib/database.cc b/lib/database.cc index 761dc1a..55bcd17 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -745,6 +745,24 @@ notmuch_database_open (const char *path, return status; } +notmuch_status_t +notmuch_database_flush(notmuch_database_t *notmuch) +{ + notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; + + try { + if (notmuch->xapian_db != NULL && + notmuch->mode == NOTMUCH_DATABASE_MODE_READ_WRITE) + (static_cast (notmuch->xapian_db))->flush (); + } catch (const Xapian::Error &error) { + fprintf(stderr, "A Xapian exception occured flushing the database: %s\n", + error.get_msg().c_str()); + status = NOTMUCH_STATUS_XAPIAN_EXCEPTION; + } + + return status; +} + void notmuch_database_close (notmuch_database_t *notmuch) { diff --git a/lib/notmuch.h b/lib/notmuch.h index 3633bed..aef5c56 100644 --- a/lib/notmuch.h +++ b/lib/notmuch.h @@ -201,6 +201,10 @@ notmuch_database_open (const char *path, notmuch_database_mode_t mode, notmuch_database_t **database); +/* Flushes all the pending modifications to the database to disk. */ +notmuch_status_t +notmuch_database_flush (notmuch_database_t *database); + /* Close the given notmuch database. * * After notmuch_database_close has been called, calls to other -- 1.7.7.6
[PATCH 0/2] Add flush/reopen methods to notmuch_database_t
Xapian (and therefore Notmuch? I would not mind a confirmation here) is thread safe as long as you don't share DB instances across threads. The database backend supports one writer and several concurrent readers. One possible model for multi threaded apps using Notmuch is to have one thread with a writable notmuch_database_t, and various with read-only notmuch_database_t instances. However, for the readers to see the changes done by the writer, the writer needs to flush the changes to disk, and the readers need to reopen the database. Those two patches add two methods to notmuch_database_t for this purpose (the signaling of the writer to the readers that they need to reopen is left to the application). Adrien Bustany (2): Add notmuch_database_flush method Add notmuch_database_reopen method lib/database.cc | 35 +++ lib/notmuch.h | 12 2 files changed, 47 insertions(+), 0 deletions(-) -- 1.7.7.6
[PATCH 0/7] Various fixes for the Go bindings
Well, thanks for the quick review :) I have updated all the patches to take your fixes into account, there is still the discussion left about the GC integration. Once we get that sorted out, I'll send the updated version. Le 18/07/2012 23:51, Austin Clements a ?crit : > This series looks good to me other than the few things I commented on. > It's nice to see the Go bindings get a bit of love! > > Quoth Adrien Bustany on Jul 18 at 9:34 pm: >> The following patches fix some serious memory management issues with >> the Go bindings, and add some missing functions as well. >> >> Adrien Bustany (7): >>go: Use iota in enum bindings >>go: Add missing MESSAGE_FLAG_EXCLUDED in Flag enum >>go: Allow notmuch objects to be garbage collected >>go: Make Destroy functions safe to call several times >>go: Partially bind notmuch_database_upgrade >>go: Bind notmuch_database_find_message_by_filename >>go: Bind notmuch_thread_t functions >> >> bindings/go/src/notmuch/notmuch.go | 471 >> ++-- >> 1 files changed, 447 insertions(+), 24 deletions(-) >>
[PATCH 3/7] go: Allow notmuch objects to be garbage collected
Le 18/07/2012 23:40, Austin Clements a ?crit : > This is subtle enough that I think it deserves a comment in the source > code explaining that tracking the talloc owner reference, combined > with the fact that Go finalizers are run in dependency order, ensures > that the C objects will always be destroyed from the talloc leaves up. Definitely, I tend to comment in the commit message and forget about the code... > > Just one inline comment below. Otherwise, I think this is all > correct. Agree with the comment, the Database should be the parent. I guess I wasn't sure of the talloc parenting. > > Is reproducing the talloc hierarchy in all of the bindings really the > right approach? I feel like there has to be a better way (or that the > way we use talloc in the library is slightly broken). What if the > bindings created an additional talloc reference to each managed > object, just to keep the object alive, and used talloc_unlink instead > of the destroy functions? Reproducing the hierarchy is probably error prone, and not that simple indeed :/ I haven't checked at all the way you suggest, but if we use talloc_reference/unlink, we get the same issue no? - If we do for each new wrapped object talloc_reference(NULL, wrapped_object), the the object will be kept alive until we talloc_unlink(NULL, wrapped_object), but what about its parents? For example will doing that on a notmuch_message_t keep the notmuch_messages_t alive? - If we do talloc_reference(parent, wrapped), then we reproduce the hierarchy again? Note that I have 0 experience with talloc, so I might as well be getting things wrong here. > > Quoth Adrien Bustany on Jul 18 at 9:34 pm: >> This makes notmuch appropriately free the underlying notmuch C objects >> when garbage collecting their Go wrappers. To make sure we don't break >> the underlying links between objects (for example, a notmuch_messages_t >> being GC'ed before a notmuch_message_t belonging to it), we add for each >> wraper struct a pointer to the owner object (Go objects with a reference >> pointing to them don't get garbage collected). >> --- >> bindings/go/src/notmuch/notmuch.go | 153 >> +++- >> 1 files changed, 134 insertions(+), 19 deletions(-) >> >> diff --git a/bindings/go/src/notmuch/notmuch.go >> b/bindings/go/src/notmuch/notmuch.go >> index 1d77fd2..3f436a0 100644 >> --- a/bindings/go/src/notmuch/notmuch.go >> +++ b/bindings/go/src/notmuch/notmuch.go >> @@ -11,6 +11,7 @@ package notmuch >> #include "notmuch.h" >> */ >> import "C" >> +import "runtime" >> import "unsafe" >> >> // Status codes used for the return values of most functions >> @@ -47,40 +48,152 @@ func (self Status) String() string { >> /* Various opaque data types. For each notmuch__t see the various >>* notmuch_ functions below. */ >> >> +type Object interface {} >> + >> type Database struct { >> db *C.notmuch_database_t >> } >> >> +func createDatabase(db *C.notmuch_database_t) *Database { >> +self := &Database{db: db} >> + >> +runtime.SetFinalizer(self, func(x *Database) { >> +if (x.db != nil) { >> +C.notmuch_database_destroy(x.db) >> +} >> +}) >> + >> +return self >> +} >> + >> type Query struct { >> query *C.notmuch_query_t >> +owner Object >> +} >> + >> +func createQuery(query *C.notmuch_query_t, owner Object) *Query { >> +self := &Query{query: query, owner: owner} >> + >> +runtime.SetFinalizer(self, func(x *Query) { >> +if (x.query != nil) { >> +C.notmuch_query_destroy(x.query) >> +} >> +}) >> + >> +return self >> } >> >> type Threads struct { >> threads *C.notmuch_threads_t >> +owner Object >> +} >> + >> +func createThreads(threads *C.notmuch_threads_t, owner Object) *Threads { >> +self := &Threads{threads: threads, owner: owner} >> + >> +runtime.SetFinalizer(self, func(x *Threads) { >> +if (x.threads != nil) { >> +C.notmuch_threads_destroy(x.threads) >> +} >> +}) >> + >> +return self >> } >> >> type Thread struct { >> thread *C.notmuch_thread_t >> +owner Object >> +} >> + >> +func createThread(thread *C.notmuch_thread_t, owner Object) *Thread { >> +self := &Thread{thread: thread, owner: owner} >> + >> +runtime.SetFinalizer(self, func(x *Thread) { >> +if (x.thread != nil) { >> +C.notmuch_thread_destroy(x.thread) >> +} >> +}) >> + >> +return self >> } >> >> type Messages struct { >> messages *C.notmuch_messages_t >> +owner Object >> +} >> + >> +func createMessages(messages *C.notmuch_messages_t, owner Object) *Messages >> { >> +self := &Messages{messages: messages, owner: owner} >> + >> +return self >> } >> >> type Message struct { >> message *C.notmuch_message_t >> +owner Object >> +} >> + >> +func createMessage(message
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Quoth Adrien Bustany on Jul 19 at 9:25 pm: > Le 18/07/2012 23:40, Austin Clements a écrit : > >This is subtle enough that I think it deserves a comment in the source > >code explaining that tracking the talloc owner reference, combined > >with the fact that Go finalizers are run in dependency order, ensures > >that the C objects will always be destroyed from the talloc leaves up. > > Definitely, I tend to comment in the commit message and forget about > the code... > > > > >Just one inline comment below. Otherwise, I think this is all > >correct. > > Agree with the comment, the Database should be the parent. I guess I > wasn't sure of the talloc parenting. > > > > >Is reproducing the talloc hierarchy in all of the bindings really the > >right approach? I feel like there has to be a better way (or that the > >way we use talloc in the library is slightly broken). What if the > >bindings created an additional talloc reference to each managed > >object, just to keep the object alive, and used talloc_unlink instead > >of the destroy functions? > > Reproducing the hierarchy is probably error prone, and not that > simple indeed :/ > I haven't checked at all the way you suggest, but if we use > talloc_reference/unlink, we get the same issue no? > - If we do for each new wrapped object talloc_reference(NULL, > wrapped_object), the the object will be kept alive until we > talloc_unlink(NULL, wrapped_object), but what about its parents? For > example will doing that on a notmuch_message_t keep the > notmuch_messages_t alive? Hmm. This is what I was thinking. You have an interesting point; I think it's slightly wrong, but it exposes something deeper. I believe there are two different things going on here: some of the talloc relationships are for convenience, while some are structural. In the former case, I'm pretty sure my suggestion will work, but in the latter case the objects should *never* be freed by the finalizer! For example, notmuch_query_search_messages returns a new notmuch_messages_t with the query as the talloc parent, but that notmuch_messages_t doesn't depend on the query object; this is just so you can conveniently delete everything retrieved from the query by deleting the query. In this case, you can either use parent references like you did---which will prevent a double-free by forcing destruction to happen from the leaves up but at the cost of having to encode these relationships and of extending the parent object lifetimes beyond what's strictly necessary---or you can use my suggestion of creating an additional talloc reference. However, in your example, the notmuch_message_t's are structurally related to the notmuch_messages_t from whence they came. They're all part of one data structure and hence it *never* makes sense for a caller to delete the notmuch_message_t's. For example, even with the code in this patch, I think the following could lead to a crash: 1. Obtain a Messages object, say ms. 2. m1 := ms.Get() 3. m1 = nil 4. m2 := ms.Get() 5. m2.whatever() If a garbage collection happens between steps 3 and 4, the Message in m1 will get finalized and destroyed. But step 4 will return the same, now dangling, pointer, leading to a potential crash in step 5. Maybe the answer in the structural case is to include the parent pointer in the Go struct and not set a finalizer on the child? That way, if there's a Go reference to the parent wrapper, it won't go away and the children won't get destroyed (collecting wrappers of children is fine) and if there's a Go reference to the child wrapper, it will keep the parent alive so it won't get destroyed and neither will the child. > - If we do talloc_reference(parent, wrapped), then we reproduce the > hierarchy again? > > Note that I have 0 experience with talloc, so I might as well be > getting things wrong here. > > > > >Quoth Adrien Bustany on Jul 18 at 9:34 pm: > >>This makes notmuch appropriately free the underlying notmuch C objects > >>when garbage collecting their Go wrappers. To make sure we don't break > >>the underlying links between objects (for example, a notmuch_messages_t > >>being GC'ed before a notmuch_message_t belonging to it), we add for each > >>wraper struct a pointer to the owner object (Go objects with a reference > >>pointing to them don't get garbage collected). > >>--- > >> bindings/go/src/notmuch/notmuch.go | 153 > >> +++- > >> 1 files changed, 134 insertions(+), 19 deletions(-) > >> > >>diff --git a/bindings/go/src/notmuch/notmuch.go > >>b/bindings/go/src/notmuch/notmuch.go > >>index 1d77fd2..3f436a0 100644 > >>--- a/bindings/go/src/notmuch/notmuch.go > >>+++ b/bindings/go/src/notmuch/notmuch.go > >>@@ -11,6 +11,7 @@ package notmuch > >> #include "notmuch.h" > >> */ > >> import "C" > >>+import "runtime" > >> import "unsafe" > >> > >> // Status codes used for the return values of most functions > >>@@ -47,40 +48,152 @@ func (self Status) String() string { > >> /* V
[PATCH 2/2] Add notmuch_database_reopen method
Calling notmuch_database_reopen is needed to refresh the database contents when the database on disk was modified by another notmuch_database_t instance, for example in a different thread. --- lib/database.cc | 17 + lib/notmuch.h |8 2 files changed, 25 insertions(+), 0 deletions(-) diff --git a/lib/database.cc b/lib/database.cc index 55bcd17..3be5a30 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -763,6 +763,23 @@ notmuch_database_flush(notmuch_database_t *notmuch) return status; } +notmuch_status_t +notmuch_database_reopen(notmuch_database_t *notmuch) +{ + notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; + + try { + if (notmuch->xapian_db != NULL) + (notmuch->xapian_db)->reopen (); + } catch (const Xapian::Error &error) { + fprintf(stderr, "A Xapian exception occured reopening the database: %s\n", + error.get_msg().c_str()); + status = NOTMUCH_STATUS_XAPIAN_EXCEPTION; + } + + return status; +} + void notmuch_database_close (notmuch_database_t *notmuch) { diff --git a/lib/notmuch.h b/lib/notmuch.h index aef5c56..51d6a9a 100644 --- a/lib/notmuch.h +++ b/lib/notmuch.h @@ -205,6 +205,14 @@ notmuch_database_open (const char *path, notmuch_status_t notmuch_database_flush (notmuch_database_t *database); +/* Refresh the database contents to the latest version. + * + * This is needed only if another instance of notmuch_database_t has + * modified the database contents on disk. + */ +notmuch_status_t +notmuch_database_reopen (notmuch_database_t *database); + /* Close the given notmuch database. * * After notmuch_database_close has been called, calls to other -- 1.7.7.6 ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 0/2] Add flush/reopen methods to notmuch_database_t
Xapian (and therefore Notmuch? I would not mind a confirmation here) is thread safe as long as you don't share DB instances across threads. The database backend supports one writer and several concurrent readers. One possible model for multi threaded apps using Notmuch is to have one thread with a writable notmuch_database_t, and various with read-only notmuch_database_t instances. However, for the readers to see the changes done by the writer, the writer needs to flush the changes to disk, and the readers need to reopen the database. Those two patches add two methods to notmuch_database_t for this purpose (the signaling of the writer to the readers that they need to reopen is left to the application). Adrien Bustany (2): Add notmuch_database_flush method Add notmuch_database_reopen method lib/database.cc | 35 +++ lib/notmuch.h | 12 2 files changed, 47 insertions(+), 0 deletions(-) -- 1.7.7.6 ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 1/2] Add notmuch_database_flush method
This method explicitly flushes the pending modifications to disk. It is useful if your program has various threads, each with a read only DB and one writer thread with a read/write DB. In that case, you most likely want the writer to sync the changes to disk so that the readers can see them, without having to close and reopen the database completely. --- lib/database.cc | 18 ++ lib/notmuch.h |4 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/lib/database.cc b/lib/database.cc index 761dc1a..55bcd17 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -745,6 +745,24 @@ notmuch_database_open (const char *path, return status; } +notmuch_status_t +notmuch_database_flush(notmuch_database_t *notmuch) +{ + notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; + + try { + if (notmuch->xapian_db != NULL && + notmuch->mode == NOTMUCH_DATABASE_MODE_READ_WRITE) + (static_cast (notmuch->xapian_db))->flush (); + } catch (const Xapian::Error &error) { + fprintf(stderr, "A Xapian exception occured flushing the database: %s\n", + error.get_msg().c_str()); + status = NOTMUCH_STATUS_XAPIAN_EXCEPTION; + } + + return status; +} + void notmuch_database_close (notmuch_database_t *notmuch) { diff --git a/lib/notmuch.h b/lib/notmuch.h index 3633bed..aef5c56 100644 --- a/lib/notmuch.h +++ b/lib/notmuch.h @@ -201,6 +201,10 @@ notmuch_database_open (const char *path, notmuch_database_mode_t mode, notmuch_database_t **database); +/* Flushes all the pending modifications to the database to disk. */ +notmuch_status_t +notmuch_database_flush (notmuch_database_t *database); + /* Close the given notmuch database. * * After notmuch_database_close has been called, calls to other -- 1.7.7.6 ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH 0/7] Various fixes for the Go bindings
Well, thanks for the quick review :) I have updated all the patches to take your fixes into account, there is still the discussion left about the GC integration. Once we get that sorted out, I'll send the updated version. Le 18/07/2012 23:51, Austin Clements a écrit : This series looks good to me other than the few things I commented on. It's nice to see the Go bindings get a bit of love! Quoth Adrien Bustany on Jul 18 at 9:34 pm: The following patches fix some serious memory management issues with the Go bindings, and add some missing functions as well. Adrien Bustany (7): go: Use iota in enum bindings go: Add missing MESSAGE_FLAG_EXCLUDED in Flag enum go: Allow notmuch objects to be garbage collected go: Make Destroy functions safe to call several times go: Partially bind notmuch_database_upgrade go: Bind notmuch_database_find_message_by_filename go: Bind notmuch_thread_t functions bindings/go/src/notmuch/notmuch.go | 471 ++-- 1 files changed, 447 insertions(+), 24 deletions(-) ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Le 18/07/2012 23:40, Austin Clements a écrit : This is subtle enough that I think it deserves a comment in the source code explaining that tracking the talloc owner reference, combined with the fact that Go finalizers are run in dependency order, ensures that the C objects will always be destroyed from the talloc leaves up. Definitely, I tend to comment in the commit message and forget about the code... Just one inline comment below. Otherwise, I think this is all correct. Agree with the comment, the Database should be the parent. I guess I wasn't sure of the talloc parenting. Is reproducing the talloc hierarchy in all of the bindings really the right approach? I feel like there has to be a better way (or that the way we use talloc in the library is slightly broken). What if the bindings created an additional talloc reference to each managed object, just to keep the object alive, and used talloc_unlink instead of the destroy functions? Reproducing the hierarchy is probably error prone, and not that simple indeed :/ I haven't checked at all the way you suggest, but if we use talloc_reference/unlink, we get the same issue no? - If we do for each new wrapped object talloc_reference(NULL, wrapped_object), the the object will be kept alive until we talloc_unlink(NULL, wrapped_object), but what about its parents? For example will doing that on a notmuch_message_t keep the notmuch_messages_t alive? - If we do talloc_reference(parent, wrapped), then we reproduce the hierarchy again? Note that I have 0 experience with talloc, so I might as well be getting things wrong here. Quoth Adrien Bustany on Jul 18 at 9:34 pm: This makes notmuch appropriately free the underlying notmuch C objects when garbage collecting their Go wrappers. To make sure we don't break the underlying links between objects (for example, a notmuch_messages_t being GC'ed before a notmuch_message_t belonging to it), we add for each wraper struct a pointer to the owner object (Go objects with a reference pointing to them don't get garbage collected). --- bindings/go/src/notmuch/notmuch.go | 153 +++- 1 files changed, 134 insertions(+), 19 deletions(-) diff --git a/bindings/go/src/notmuch/notmuch.go b/bindings/go/src/notmuch/notmuch.go index 1d77fd2..3f436a0 100644 --- a/bindings/go/src/notmuch/notmuch.go +++ b/bindings/go/src/notmuch/notmuch.go @@ -11,6 +11,7 @@ package notmuch #include "notmuch.h" */ import "C" +import "runtime" import "unsafe" // Status codes used for the return values of most functions @@ -47,40 +48,152 @@ func (self Status) String() string { /* Various opaque data types. For each notmuch__t see the various * notmuch_ functions below. */ +type Object interface {} + type Database struct { db *C.notmuch_database_t } +func createDatabase(db *C.notmuch_database_t) *Database { + self := &Database{db: db} + + runtime.SetFinalizer(self, func(x *Database) { + if (x.db != nil) { + C.notmuch_database_destroy(x.db) + } + }) + + return self +} + type Query struct { query *C.notmuch_query_t + owner Object +} + +func createQuery(query *C.notmuch_query_t, owner Object) *Query { + self := &Query{query: query, owner: owner} + + runtime.SetFinalizer(self, func(x *Query) { + if (x.query != nil) { + C.notmuch_query_destroy(x.query) + } + }) + + return self } type Threads struct { threads *C.notmuch_threads_t + owner Object +} + +func createThreads(threads *C.notmuch_threads_t, owner Object) *Threads { + self := &Threads{threads: threads, owner: owner} + + runtime.SetFinalizer(self, func(x *Threads) { + if (x.threads != nil) { + C.notmuch_threads_destroy(x.threads) + } + }) + + return self } type Thread struct { thread *C.notmuch_thread_t + owner Object +} + +func createThread(thread *C.notmuch_thread_t, owner Object) *Thread { + self := &Thread{thread: thread, owner: owner} + + runtime.SetFinalizer(self, func(x *Thread) { + if (x.thread != nil) { + C.notmuch_thread_destroy(x.thread) + } + }) + + return self } type Messages struct { messages *C.notmuch_messages_t + owner Object +} + +func createMessages(messages *C.notmuch_messages_t, owner Object) *Messages { + self := &Messages{messages: messages, owner: owner} + + return self } type Message struct { message *C.notmuch_message_t + owner Object +} + +func createMessage(message *C.notmuch_message_t, owner Object) *Message { + self := &Message{message: message, owner: owner} + + runtime.SetFinalizer(self, func(x *Message) { + if (x.message != nil) { + C.notmuc
query on a subset of messages ?
Sebastien Binet writes: > Austin Clements writes: > >> Quoth Sebastien Binet on Jul 09 at 10:25 am: >>> >>> hi there, >>> >>> I was trying to reduce the I/O stress during my usual email >>> fetching+tagging by writing a little program using the go bindings to >>> notmuch. >>> >>> ie: >>> db, status := notmuch.OpenDatabase(db_path, >>> notmuch.DATABASE_MODE_READ_WRITE) >>> query := db.CreateQuery("(tag:new AND tag:inbox)") >>> msgs := query.SearchMessages() >>> for _,msg := range msgs { >>> tag_msg(msg, tagqueries) >>> } >>> >>> >>> where tagqueries is a subquery of the form: >>> [ >>> { >>> "Cmd": "+to-me", >>> "Query": "(to:sebastien.binet at cern.ch and not tag:to-me)" >>> }, >>> { >>> "Cmd": "+sci-notmuch", >>> "Query": "from:notmuch at notmuchmail.org or to:notmuch at >>> notmuchmail.org or subject:notmuch" >>> } >>> ] >>> >>> >>> the idea being that I only need to crawl through the db only once and >>> then iteratively apply tags on those messages (instead of repeatedly >>> running "notmuch tag ..." for each and every of those many >>> 'tag-queries') >>> >>> I couldn't find any C-API to do such a thing using the notmuch library. >>> did I overlook something ? >>> >>> Is it something useful to add ? >>> >>> -s >> >> Have you tried a more direct translation of the multiple notmuch tag >> commands into Go, where you don't worry about subsetting the queries? >> Unless you're tagging a huge number of messages, the cost of notmuch >> tag is almost certainly the fsync that it does when it closes the >> database (which every call to notmuch tag must do). However, in Go, >> you can keep the database open across all of the tagging operations >> and then close and fsync it just once. > > nope, I haven't tried that, but will do. > >> >> Note that there is an important optimization in notmuch tag that you >> might have to replicate. It manipulates the original query to exclude >> messages that already have the desired tags, so that they get skipped >> very efficiently at the earliest stage possible. > I already have this in my original shell script. > (wouldn't be too hard to automatically do, though.) FYI, I've put this into a new notmuch-mtag go-based binary over here: https://github.com/sbinet/notmuch/blob/dev/go-bindings/bindings/go/src/notmuch-mtag/main.go -s -- next part -- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: not available URL: <http://notmuchmail.org/pipermail/notmuch/attachments/20120719/4a91371d/attachment-0001.pgp>
Notmuch scripts (again), now with more usenet
ccx at webprojekty.cz writes: > > I might not always have time to read the entirety of the ML, so it would > be great if people CC me when posting there. > OK, please add some kind of README file explaining the best way to report bugs. > > You can consider it public domain. I guess I should add wtfpl text to my > repo. :-) Please do. or CC0 if you prefer less swearing ;). d
Re: Notmuch scripts (again), now with more usenet
c...@webprojekty.cz writes: > > I might not always have time to read the entirety of the ML, so it would > be great if people CC me when posting there. > OK, please add some kind of README file explaining the best way to report bugs. > > You can consider it public domain. I guess I should add wtfpl text to my > repo. :-) Please do. or CC0 if you prefer less swearing ;). d ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: query on a subset of messages ?
Sebastien Binet writes: > Austin Clements writes: > >> Quoth Sebastien Binet on Jul 09 at 10:25 am: >>> >>> hi there, >>> >>> I was trying to reduce the I/O stress during my usual email >>> fetching+tagging by writing a little program using the go bindings to >>> notmuch. >>> >>> ie: >>> db, status := notmuch.OpenDatabase(db_path, >>> notmuch.DATABASE_MODE_READ_WRITE) >>> query := db.CreateQuery("(tag:new AND tag:inbox)") >>> msgs := query.SearchMessages() >>> for _,msg := range msgs { >>> tag_msg(msg, tagqueries) >>> } >>> >>> >>> where tagqueries is a subquery of the form: >>> [ >>> { >>> "Cmd": "+to-me", >>> "Query": "(to:sebastien.bi...@cern.ch and not tag:to-me)" >>> }, >>> { >>> "Cmd": "+sci-notmuch", >>> "Query": "from:notmuch@notmuchmail.org or >>> to:notmuch@notmuchmail.org or subject:notmuch" >>> } >>> ] >>> >>> >>> the idea being that I only need to crawl through the db only once and >>> then iteratively apply tags on those messages (instead of repeatedly >>> running "notmuch tag ..." for each and every of those many >>> 'tag-queries') >>> >>> I couldn't find any C-API to do such a thing using the notmuch library. >>> did I overlook something ? >>> >>> Is it something useful to add ? >>> >>> -s >> >> Have you tried a more direct translation of the multiple notmuch tag >> commands into Go, where you don't worry about subsetting the queries? >> Unless you're tagging a huge number of messages, the cost of notmuch >> tag is almost certainly the fsync that it does when it closes the >> database (which every call to notmuch tag must do). However, in Go, >> you can keep the database open across all of the tagging operations >> and then close and fsync it just once. > > nope, I haven't tried that, but will do. > >> >> Note that there is an important optimization in notmuch tag that you >> might have to replicate. It manipulates the original query to exclude >> messages that already have the desired tags, so that they get skipped >> very efficiently at the earliest stage possible. > I already have this in my original shell script. > (wouldn't be too hard to automatically do, though.) FYI, I've put this into a new notmuch-mtag go-based binary over here: https://github.com/sbinet/notmuch/blob/dev/go-bindings/bindings/go/src/notmuch-mtag/main.go -s pgplTbd26fv2U.pgp Description: PGP signature ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch