[PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-22 Thread Peter Feigl
On Sat, 21 Jan 2012 15:12:57 -0800, Jameson Graef Rollins  wrote:
> On Sat, 21 Jan 2012 22:16:08 +0100, Peter Feigl  wrote:
> > The output routines have been rewritten so that logical structure
> > (objects with key/value pairs, arrays, strings and numbers) are
> > written instead of ad-hoc printfs. This allows for easier adaptation
> > of other output formats, as only the routines that start/end an object
> > etc. have to be rewritten. The logic is the same for all formats.
> > The default text output is handled differently, special cases are
> > inserted at the proper places, as it differs too much from the
> > structured output.
> 
> Hi, Peter.  Thanks for the contribution.
> 
> There are a lot of changes in this patch so I think you need to think
> about how you can break this up into multiple smaller and more atomic
> patches.  In particular, the addition of the sexp output format needs to
> definitely be in a separate patch from the restructuring of the output
> formatting.  You also don't mention anywhere in the commit log that
> you've added this new output format.  You'll also need to include
> documentation and test suite updates.

I'm sorry I forgot to mention that, it was mainly meant as a way to show
that this is easily possible (i.e. that the formatting is decoupled from
the logic, so that additional and different formats can be added without
influencing the printing logic). It'd be easy to split this up. What
kind of documentation should I include? 
The test suite should work fine, *if* it compares EXPECTED and OUTPUT
not character-by-character, but rather by pretty-printing both the
expected and the actual outputs by some JSON pretty-printer (like python
-mjson.tool). I can of course provide additional test-cases for
--format=sexp.

How should I proceed on this? Re-submit the patch with the sexp-support
removed and only JSON updated?

Thanks!

Peter


[PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-22 Thread Peter Feigl
On Sat, 21 Jan 2012 17:04:07 -0500, Austin Clements  wrote:
> Quoth Peter Feigl on Jan 21 at 10:16 pm:
> I think this is a great idea and I'm a fan of having an S-expression
> format, but I also think there's a much easier and more general way to
> structure this.
> 
> In particular, I don't think you should hijack search_format, since
> you'll wind up repeating most of this work for anything else that
> needs to output structured data (notmuch show, at least).  Rather, I'd
> suggest creating a general "structure printer" struct that isn't tied
> to search.  You've essentially already done this, you just called it
> search_format.  Then, leave the existing format callbacks in place,
> just use this new API instead of lots of printf calls.

I'm sorry I haven't been more clear about this, the intention was all
along to check whether this would be ok in notmuch-search, and if it got
accepted there, to factor it out into a separate file and then use it in
notmuch-show and notmuch-reply. There's nothing in the printer (except
for the name of the struct) that ties it to search.

> What about all of those annoying {tag,item}_{start,sep,end} fields?  I
> think you can simultaneously get rid of those *and* simplify the
> structure printer API.  If the structure printer is allowed to keep a
> little state, it can automatically insert separators.  With a little
> nesting state, it could even insert terminators by just saying "pop me
> to this level".  This could probably be better, but I'm imagining
> something like

I agree, however this is complicated by the fact that there are
additional restrictions on the actually printed code: newlines should be
placed at strategic locations to improve parsability, which could be
hard to decide in the printer alone without support from the logic that
drives it.

> struct sprinter *
> new_json_sprinter (const void *ctx, FILE *stream);
> struct sprinter *
> new_sexp_sprinter (const void *ctx, FILE *stream);
> 
> /* Start a map (a JSON object or a S-expression alist/plist/whatever)
>  * and return the nesting level of the map. */
> int
> sprinter_map (struct sprinter *sp);
> /* Start a list (aka array) and return the nesting level of the list. */
> int
> sprinter_list (struct sprinter *sp);
> 
> /* Close maps and lists until reaching level. */
> void
> sprinter_pop (struct sprinter *sp, int level);
> 
> void
> sprinter_map_key (struct sprinter *sp, const char *key);
> void
> sprinter_number (struct sprinter *sp, int val);
> void
> sprinter_string (struct sprinter *sp, const char *val);
> void
> sprinter_bool (struct sprinter *sp, notmuch_bool_t val);
> 
> and that's it.  This would also subsume your format_attribute_*
> helpers.
> 
> Unfortunately, it's a pain to pass things like a structure printer
> object around formatters (too bad notmuch isn't written in C++, eh?).
> I think it's better to address this than to structure around it.
> Probably the simplest thing to do is to make a struct for formatter
> state and pass that in to the callbacks.  You could also more
> completely emulate classes, but that would probably be overkill for
> this.

I believe this approach is similar to the one I've implemented (though
probably higher level, not so many details explicitly written into the
formatting code). I would suggest trying to get any sort of structured
formatters (whether more like your suggestions or like the thing I
implemented doesn't matter so much) into the main codebase, and then
refactoring the other parts to use it. I've thought about providing a
single patch to all of notmuch that accomplishes this, but I've deemed
it too large and complicated to be accepted, I thought limiting it to
notmuch-search would be a way to get it set up, so that it could be
expanded to the other parts later.

Thanks for the comments, I'll keep thinking about your design, a very
interesting idea!

Peter
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 274 bytes
Desc: not available
URL: 



[PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Austin Clements
Quoth Peter Feigl on Jan 22 at 12:17 am:
> On Sat, 21 Jan 2012 17:04:07 -0500, Austin Clements  
> wrote:
> > Quoth Peter Feigl on Jan 21 at 10:16 pm:
> > I think this is a great idea and I'm a fan of having an S-expression
> > format, but I also think there's a much easier and more general way to
> > structure this.
> > 
> > In particular, I don't think you should hijack search_format, since
> > you'll wind up repeating most of this work for anything else that
> > needs to output structured data (notmuch show, at least).  Rather, I'd
> > suggest creating a general "structure printer" struct that isn't tied
> > to search.  You've essentially already done this, you just called it
> > search_format.  Then, leave the existing format callbacks in place,
> > just use this new API instead of lots of printf calls.
> 
> I'm sorry I haven't been more clear about this, the intention was all
> along to check whether this would be ok in notmuch-search, and if it got
> accepted there, to factor it out into a separate file and then use it in
> notmuch-show and notmuch-reply. There's nothing in the printer (except
> for the name of the struct) that ties it to search.

Great!  However, it seems counterproductive to put all of these
structures and functions into search only to then move them somewhere
else.  Wouldn't it make more sense to introduce the generic structure
printer in a first patch, then refactor the existing code to use it in
a second (or maybe more) patch in a series?

> > What about all of those annoying {tag,item}_{start,sep,end} fields?  I
> > think you can simultaneously get rid of those *and* simplify the
> > structure printer API.  If the structure printer is allowed to keep a
> > little state, it can automatically insert separators.  With a little
> > nesting state, it could even insert terminators by just saying "pop me
> > to this level".  This could probably be better, but I'm imagining
> > something like
> 
> I agree, however this is complicated by the fact that there are
> additional restrictions on the actually printed code: newlines should be
> placed at strategic locations to improve parsability, which could be
> hard to decide in the printer alone without support from the logic that
> drives it.

True, but that support is easy to add and, I would argue, the exact
implementation of framing *should* be up to the printer (though
obviously where to place the framing should be up to the caller).
Following the general structure of the API I was proposing, you could
add a function like

/* Print a framing break that is easy to detect in a parser. */
void
sprinter_break (struct sprinter *sp);

that for JSON and S-expressions could just print a newline.  Or, for
JSON, if we want separators to appear before the newline (which they
don't have to; we can choose our framing however we want), it could
simply record that a newline should be printed after the next
separator or terminator.

> > struct sprinter *
> > new_json_sprinter (const void *ctx, FILE *stream);
> > struct sprinter *
> > new_sexp_sprinter (const void *ctx, FILE *stream);
> > 
> > /* Start a map (a JSON object or a S-expression alist/plist/whatever)
> >  * and return the nesting level of the map. */
> > int
> > sprinter_map (struct sprinter *sp);
> > /* Start a list (aka array) and return the nesting level of the list. */
> > int
> > sprinter_list (struct sprinter *sp);
> > 
> > /* Close maps and lists until reaching level. */
> > void
> > sprinter_pop (struct sprinter *sp, int level);
> > 
> > void
> > sprinter_map_key (struct sprinter *sp, const char *key);
> > void
> > sprinter_number (struct sprinter *sp, int val);
> > void
> > sprinter_string (struct sprinter *sp, const char *val);
> > void
> > sprinter_bool (struct sprinter *sp, notmuch_bool_t val);
> > 
> > and that's it.  This would also subsume your format_attribute_*
> > helpers.
> > 
> > Unfortunately, it's a pain to pass things like a structure printer
> > object around formatters (too bad notmuch isn't written in C++, eh?).
> > I think it's better to address this than to structure around it.
> > Probably the simplest thing to do is to make a struct for formatter
> > state and pass that in to the callbacks.  You could also more
> > completely emulate classes, but that would probably be overkill for
> > this.
> 
> I believe this approach is similar to the one I've implemented (though
> probably higher level, not so many details explicitly written into the
> formatting code). I would suggest trying to get any sort of structured

*nods* It was definitely inspired by yours.  The complexity has to go
somewhere, and in the long run it seems it would be better to isolate
it in the printer than to deal with it in every caller.  My ulterior
motive was to maintain roughly the existing and extensible callback
structure while eliminating the need for the various start/sep/end
fields, which would have to differ between the formats and would
otherwise duplicate knowledge that should be 

[PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Austin Clements
Quoth Peter Feigl on Jan 21 at 10:16 pm:
> The output routines have been rewritten so that logical structure
> (objects with key/value pairs, arrays, strings and numbers) are
> written instead of ad-hoc printfs. This allows for easier adaptation
> of other output formats, as only the routines that start/end an object
> etc. have to be rewritten. The logic is the same for all formats.
> The default text output is handled differently, special cases are
> inserted at the proper places, as it differs too much from the
> structured output.

I think this is a great idea and I'm a fan of having an S-expression
format, but I also think there's a much easier and more general way to
structure this.

In particular, I don't think you should hijack search_format, since
you'll wind up repeating most of this work for anything else that
needs to output structured data (notmuch show, at least).  Rather, I'd
suggest creating a general "structure printer" struct that isn't tied
to search.  You've essentially already done this, you just called it
search_format.  Then, leave the existing format callbacks in place,
just use this new API instead of lots of printf calls.

What about all of those annoying {tag,item}_{start,sep,end} fields?  I
think you can simultaneously get rid of those *and* simplify the
structure printer API.  If the structure printer is allowed to keep a
little state, it can automatically insert separators.  With a little
nesting state, it could even insert terminators by just saying "pop me
to this level".  This could probably be better, but I'm imagining
something like

struct sprinter *
new_json_sprinter (const void *ctx, FILE *stream);
struct sprinter *
new_sexp_sprinter (const void *ctx, FILE *stream);

/* Start a map (a JSON object or a S-expression alist/plist/whatever)
 * and return the nesting level of the map. */
int
sprinter_map (struct sprinter *sp);
/* Start a list (aka array) and return the nesting level of the list. */
int
sprinter_list (struct sprinter *sp);

/* Close maps and lists until reaching level. */
void
sprinter_pop (struct sprinter *sp, int level);

void
sprinter_map_key (struct sprinter *sp, const char *key);
void
sprinter_number (struct sprinter *sp, int val);
void
sprinter_string (struct sprinter *sp, const char *val);
void
sprinter_bool (struct sprinter *sp, notmuch_bool_t val);

and that's it.  This would also subsume your format_attribute_*
helpers.

Unfortunately, it's a pain to pass things like a structure printer
object around formatters (too bad notmuch isn't written in C++, eh?).
I think it's better to address this than to structure around it.
Probably the simplest thing to do is to make a struct for formatter
state and pass that in to the callbacks.  You could also more
completely emulate classes, but that would probably be overkill for
this.


[PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Jameson Graef Rollins
On Sun, 22 Jan 2012 00:21:37 +0100, "Peter Feigl"  wrote:
> What kind of documentation should I include?

Update the man page to describe the new format and command line options.

> The test suite should work fine, *if* it compares EXPECTED and OUTPUT
> not character-by-character, but rather by pretty-printing both the
> expected and the actual outputs by some JSON pretty-printer (like python
> -mjson.tool). I can of course provide additional test-cases for
> --format=sexp.

I was referring specifically to new tests for the new output format.
The test suite changes should include only additions, since as you point
out, the internal restructuring shouldn't affect any existing tests.

> How should I proceed on this? Re-submit the patch with the sexp-support
> removed and only JSON updated?

I think you should primarily work on addressing Austin's issues
regarding the output formatters first, being careful to try to make more
small atomic patches.  Then once that's done make a new series of
patches, depending on the new formatter patches, that adds the new
output format.

As Austin points out, more smaller patches that are narrowly focused are
much easier to review, even if there ends up being more changes in the
end.

jamie.
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL: 



[PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Jameson Graef Rollins
On Sat, 21 Jan 2012 22:16:08 +0100, Peter Feigl  wrote:
> The output routines have been rewritten so that logical structure
> (objects with key/value pairs, arrays, strings and numbers) are
> written instead of ad-hoc printfs. This allows for easier adaptation
> of other output formats, as only the routines that start/end an object
> etc. have to be rewritten. The logic is the same for all formats.
> The default text output is handled differently, special cases are
> inserted at the proper places, as it differs too much from the
> structured output.

Hi, Peter.  Thanks for the contribution.

There are a lot of changes in this patch so I think you need to think
about how you can break this up into multiple smaller and more atomic
patches.  In particular, the addition of the sexp output format needs to
definitely be in a separate patch from the restructuring of the output
formatting.  You also don't mention anywhere in the commit log that
you've added this new output format.  You'll also need to include
documentation and test suite updates.

Thanks.

jamie.
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL: 



[PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Peter Feigl
The output routines have been rewritten so that logical structure
(objects with key/value pairs, arrays, strings and numbers) are
written instead of ad-hoc printfs. This allows for easier adaptation
of other output formats, as only the routines that start/end an object
etc. have to be rewritten. The logic is the same for all formats.
The default text output is handled differently, special cases are
inserted at the proper places, as it differs too much from the
structured output.
---
 notmuch-search.c |  493 ++
 1 files changed, 309 insertions(+), 184 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index 8867aab..bce44c2 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -29,88 +29,221 @@ typedef enum {
 } output_t;
 
 typedef struct search_format {
-const char *results_start;
-const char *item_start;
-void (*item_id) (const void *ctx,
-const char *item_type,
-const char *item_id);
-void (*thread_summary) (const void *ctx,
-   const char *thread_id,
-   const time_t date,
-   const int matched,
-   const int total,
-   const char *authors,
-   const char *subject);
-const char *tag_start;
-const char *tag;
-const char *tag_sep;
-const char *tag_end;
-const char *item_sep;
-const char *item_end;
-const char *results_end;
-const char *results_null;
+void (*start_object) (const void *ctx, FILE *stream);
+void (*end_object) (const void *ctx, FILE *stream);
+void (*start_attribute) (const void *ctx, FILE *stream);
+void (*attribute_key) (const void *ctx, FILE *stream, const char *key);
+void (*attribute_key_value_separator) (const void *ctx, FILE *stream);
+void (*end_attribute) (const void *ctx, FILE *stream);
+void (*attribute_separator) (const void *ctx, FILE *stream);
+void (*start_array) (const void *ctx, FILE *stream);
+void (*array_item_separator) (const void *ctx, FILE *stream);
+void (*end_array) (const void *ctx, FILE *stream);
+void (*number) (const void *ctx, FILE *stream, int number);
+void (*string) (const void *ctx, FILE *stream, const char *string);
+void (*boolean) (const void *ctx, FILE *stream, int boolean);
 } search_format_t;
 
-static void
-format_item_id_text (const void *ctx,
-const char *item_type,
-const char *item_id);
-
-static void
-format_thread_text (const void *ctx,
-   const char *thread_id,
-   const time_t date,
-   const int matched,
-   const int total,
-   const char *authors,
-   const char *subject);
+/* dummy format */
 static const search_format_t format_text = {
-,
-   ,
-   format_item_id_text,
-   format_thread_text,
-(,
-   %s,  ,
-   ), \n,
-   ,
-\n,
-,
-};
+0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
+
+void json_start_object(const void *ctx, FILE *stream);
+void json_end_object(const void *ctx, FILE *stream);
+void json_start_attribute(const void *ctx, FILE *stream);
+void json_attribute_key(const void *ctx, FILE *stream, const char *key);
+void json_attribute_key_value_separator(const void *ctx, FILE *stream);
+void json_end_attribute(const void *ctx, FILE *stream);
+void json_attribute_separator(const void *ctx, FILE *stream);
+void json_start_array(const void *ctx, FILE *stream);
+void json_array_item_separator(const void *ctx, FILE *stream);
+void json_end_array(const void *ctx, FILE *stream);
+void json_number(const void *ctx, FILE *stream, int number);
+void json_string(const void *ctx, FILE *stream, const char *string);
+void json_boolean(const void *ctx, FILE *stream, int boolean);
+
+
+void json_start_object(const void *ctx, FILE *stream) {
+(void)ctx;
+fputs({, stream);
+}
+void json_end_object(const void *ctx, FILE *stream) {
+(void)ctx;
+fputs(}\n, stream);
+}
+void json_start_attribute(const void *ctx, FILE *stream) {
+(void)ctx;
+(void)stream;
+}
+void json_attribute_key(const void *ctx, FILE *stream, const char *key) {
+(void)ctx;
+fprintf(stream, \%s\, key);
+}
+void json_attribute_key_value_separator(const void *ctx, FILE *stream) {
+(void)ctx;
+fputs(: , stream);
+}
+void json_end_attribute(const void *ctx, FILE *stream) {
+(void)ctx;
+(void)stream;
+}
+void json_attribute_separator(const void *ctx, FILE *stream) {
+(void)ctx;
+fputs(,\n, stream);
+}
+void json_start_array(const void *ctx, FILE *stream) {
+(void)ctx;
+fputs([, stream);
+}
+void json_array_item_separator(const void *ctx, FILE *stream) {
+(void)ctx;
+fputs(, , stream);
+}
+void json_end_array(const void *ctx, FILE *stream) {
+(void)ctx;
+fputs(], 

Re: [PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Austin Clements
Quoth Peter Feigl on Jan 21 at 10:16 pm:
 The output routines have been rewritten so that logical structure
 (objects with key/value pairs, arrays, strings and numbers) are
 written instead of ad-hoc printfs. This allows for easier adaptation
 of other output formats, as only the routines that start/end an object
 etc. have to be rewritten. The logic is the same for all formats.
 The default text output is handled differently, special cases are
 inserted at the proper places, as it differs too much from the
 structured output.

I think this is a great idea and I'm a fan of having an S-expression
format, but I also think there's a much easier and more general way to
structure this.

In particular, I don't think you should hijack search_format, since
you'll wind up repeating most of this work for anything else that
needs to output structured data (notmuch show, at least).  Rather, I'd
suggest creating a general structure printer struct that isn't tied
to search.  You've essentially already done this, you just called it
search_format.  Then, leave the existing format callbacks in place,
just use this new API instead of lots of printf calls.

What about all of those annoying {tag,item}_{start,sep,end} fields?  I
think you can simultaneously get rid of those *and* simplify the
structure printer API.  If the structure printer is allowed to keep a
little state, it can automatically insert separators.  With a little
nesting state, it could even insert terminators by just saying pop me
to this level.  This could probably be better, but I'm imagining
something like

struct sprinter *
new_json_sprinter (const void *ctx, FILE *stream);
struct sprinter *
new_sexp_sprinter (const void *ctx, FILE *stream);

/* Start a map (a JSON object or a S-expression alist/plist/whatever)
 * and return the nesting level of the map. */
int
sprinter_map (struct sprinter *sp);
/* Start a list (aka array) and return the nesting level of the list. */
int
sprinter_list (struct sprinter *sp);

/* Close maps and lists until reaching level. */
void
sprinter_pop (struct sprinter *sp, int level);

void
sprinter_map_key (struct sprinter *sp, const char *key);
void
sprinter_number (struct sprinter *sp, int val);
void
sprinter_string (struct sprinter *sp, const char *val);
void
sprinter_bool (struct sprinter *sp, notmuch_bool_t val);

and that's it.  This would also subsume your format_attribute_*
helpers.

Unfortunately, it's a pain to pass things like a structure printer
object around formatters (too bad notmuch isn't written in C++, eh?).
I think it's better to address this than to structure around it.
Probably the simplest thing to do is to make a struct for formatter
state and pass that in to the callbacks.  You could also more
completely emulate classes, but that would probably be overkill for
this.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Jameson Graef Rollins
On Sat, 21 Jan 2012 22:16:08 +0100, Peter Feigl cra...@gmx.net wrote:
 The output routines have been rewritten so that logical structure
 (objects with key/value pairs, arrays, strings and numbers) are
 written instead of ad-hoc printfs. This allows for easier adaptation
 of other output formats, as only the routines that start/end an object
 etc. have to be rewritten. The logic is the same for all formats.
 The default text output is handled differently, special cases are
 inserted at the proper places, as it differs too much from the
 structured output.

Hi, Peter.  Thanks for the contribution.

There are a lot of changes in this patch so I think you need to think
about how you can break this up into multiple smaller and more atomic
patches.  In particular, the addition of the sexp output format needs to
definitely be in a separate patch from the restructuring of the output
formatting.  You also don't mention anywhere in the commit log that
you've added this new output format.  You'll also need to include
documentation and test suite updates.

Thanks.

jamie.


pgpTlthtAuwO7.pgp
Description: PGP signature
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Peter Feigl
On Sat, 21 Jan 2012 17:04:07 -0500, Austin Clements amdra...@mit.edu wrote:
 Quoth Peter Feigl on Jan 21 at 10:16 pm:
 I think this is a great idea and I'm a fan of having an S-expression
 format, but I also think there's a much easier and more general way to
 structure this.
 
 In particular, I don't think you should hijack search_format, since
 you'll wind up repeating most of this work for anything else that
 needs to output structured data (notmuch show, at least).  Rather, I'd
 suggest creating a general structure printer struct that isn't tied
 to search.  You've essentially already done this, you just called it
 search_format.  Then, leave the existing format callbacks in place,
 just use this new API instead of lots of printf calls.

I'm sorry I haven't been more clear about this, the intention was all
along to check whether this would be ok in notmuch-search, and if it got
accepted there, to factor it out into a separate file and then use it in
notmuch-show and notmuch-reply. There's nothing in the printer (except
for the name of the struct) that ties it to search.

 What about all of those annoying {tag,item}_{start,sep,end} fields?  I
 think you can simultaneously get rid of those *and* simplify the
 structure printer API.  If the structure printer is allowed to keep a
 little state, it can automatically insert separators.  With a little
 nesting state, it could even insert terminators by just saying pop me
 to this level.  This could probably be better, but I'm imagining
 something like

I agree, however this is complicated by the fact that there are
additional restrictions on the actually printed code: newlines should be
placed at strategic locations to improve parsability, which could be
hard to decide in the printer alone without support from the logic that
drives it.

 struct sprinter *
 new_json_sprinter (const void *ctx, FILE *stream);
 struct sprinter *
 new_sexp_sprinter (const void *ctx, FILE *stream);
 
 /* Start a map (a JSON object or a S-expression alist/plist/whatever)
  * and return the nesting level of the map. */
 int
 sprinter_map (struct sprinter *sp);
 /* Start a list (aka array) and return the nesting level of the list. */
 int
 sprinter_list (struct sprinter *sp);
 
 /* Close maps and lists until reaching level. */
 void
 sprinter_pop (struct sprinter *sp, int level);
 
 void
 sprinter_map_key (struct sprinter *sp, const char *key);
 void
 sprinter_number (struct sprinter *sp, int val);
 void
 sprinter_string (struct sprinter *sp, const char *val);
 void
 sprinter_bool (struct sprinter *sp, notmuch_bool_t val);
 
 and that's it.  This would also subsume your format_attribute_*
 helpers.
 
 Unfortunately, it's a pain to pass things like a structure printer
 object around formatters (too bad notmuch isn't written in C++, eh?).
 I think it's better to address this than to structure around it.
 Probably the simplest thing to do is to make a struct for formatter
 state and pass that in to the callbacks.  You could also more
 completely emulate classes, but that would probably be overkill for
 this.

I believe this approach is similar to the one I've implemented (though
probably higher level, not so many details explicitly written into the
formatting code). I would suggest trying to get any sort of structured
formatters (whether more like your suggestions or like the thing I
implemented doesn't matter so much) into the main codebase, and then
refactoring the other parts to use it. I've thought about providing a
single patch to all of notmuch that accomplishes this, but I've deemed
it too large and complicated to be accepted, I thought limiting it to
notmuch-search would be a way to get it set up, so that it could be
expanded to the other parts later.

Thanks for the comments, I'll keep thinking about your design, a very
interesting idea!

Peter


pgpx5RmL8R6Lq.pgp
Description: PGP signature
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Peter Feigl
On Sat, 21 Jan 2012 15:12:57 -0800, Jameson Graef Rollins 
jroll...@finestructure.net wrote:
 On Sat, 21 Jan 2012 22:16:08 +0100, Peter Feigl cra...@gmx.net wrote:
  The output routines have been rewritten so that logical structure
  (objects with key/value pairs, arrays, strings and numbers) are
  written instead of ad-hoc printfs. This allows for easier adaptation
  of other output formats, as only the routines that start/end an object
  etc. have to be rewritten. The logic is the same for all formats.
  The default text output is handled differently, special cases are
  inserted at the proper places, as it differs too much from the
  structured output.
 
 Hi, Peter.  Thanks for the contribution.
 
 There are a lot of changes in this patch so I think you need to think
 about how you can break this up into multiple smaller and more atomic
 patches.  In particular, the addition of the sexp output format needs to
 definitely be in a separate patch from the restructuring of the output
 formatting.  You also don't mention anywhere in the commit log that
 you've added this new output format.  You'll also need to include
 documentation and test suite updates.

I'm sorry I forgot to mention that, it was mainly meant as a way to show
that this is easily possible (i.e. that the formatting is decoupled from
the logic, so that additional and different formats can be added without
influencing the printing logic). It'd be easy to split this up. What
kind of documentation should I include? 
The test suite should work fine, *if* it compares EXPECTED and OUTPUT
not character-by-character, but rather by pretty-printing both the
expected and the actual outputs by some JSON pretty-printer (like python
-mjson.tool). I can of course provide additional test-cases for
--format=sexp.

How should I proceed on this? Re-submit the patch with the sexp-support
removed and only JSON updated?

Thanks!

Peter
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Austin Clements
Quoth Peter Feigl on Jan 22 at 12:17 am:
 On Sat, 21 Jan 2012 17:04:07 -0500, Austin Clements amdra...@mit.edu wrote:
  Quoth Peter Feigl on Jan 21 at 10:16 pm:
  I think this is a great idea and I'm a fan of having an S-expression
  format, but I also think there's a much easier and more general way to
  structure this.
  
  In particular, I don't think you should hijack search_format, since
  you'll wind up repeating most of this work for anything else that
  needs to output structured data (notmuch show, at least).  Rather, I'd
  suggest creating a general structure printer struct that isn't tied
  to search.  You've essentially already done this, you just called it
  search_format.  Then, leave the existing format callbacks in place,
  just use this new API instead of lots of printf calls.
 
 I'm sorry I haven't been more clear about this, the intention was all
 along to check whether this would be ok in notmuch-search, and if it got
 accepted there, to factor it out into a separate file and then use it in
 notmuch-show and notmuch-reply. There's nothing in the printer (except
 for the name of the struct) that ties it to search.

Great!  However, it seems counterproductive to put all of these
structures and functions into search only to then move them somewhere
else.  Wouldn't it make more sense to introduce the generic structure
printer in a first patch, then refactor the existing code to use it in
a second (or maybe more) patch in a series?

  What about all of those annoying {tag,item}_{start,sep,end} fields?  I
  think you can simultaneously get rid of those *and* simplify the
  structure printer API.  If the structure printer is allowed to keep a
  little state, it can automatically insert separators.  With a little
  nesting state, it could even insert terminators by just saying pop me
  to this level.  This could probably be better, but I'm imagining
  something like
 
 I agree, however this is complicated by the fact that there are
 additional restrictions on the actually printed code: newlines should be
 placed at strategic locations to improve parsability, which could be
 hard to decide in the printer alone without support from the logic that
 drives it.

True, but that support is easy to add and, I would argue, the exact
implementation of framing *should* be up to the printer (though
obviously where to place the framing should be up to the caller).
Following the general structure of the API I was proposing, you could
add a function like

/* Print a framing break that is easy to detect in a parser. */
void
sprinter_break (struct sprinter *sp);

that for JSON and S-expressions could just print a newline.  Or, for
JSON, if we want separators to appear before the newline (which they
don't have to; we can choose our framing however we want), it could
simply record that a newline should be printed after the next
separator or terminator.

  struct sprinter *
  new_json_sprinter (const void *ctx, FILE *stream);
  struct sprinter *
  new_sexp_sprinter (const void *ctx, FILE *stream);
  
  /* Start a map (a JSON object or a S-expression alist/plist/whatever)
   * and return the nesting level of the map. */
  int
  sprinter_map (struct sprinter *sp);
  /* Start a list (aka array) and return the nesting level of the list. */
  int
  sprinter_list (struct sprinter *sp);
  
  /* Close maps and lists until reaching level. */
  void
  sprinter_pop (struct sprinter *sp, int level);
  
  void
  sprinter_map_key (struct sprinter *sp, const char *key);
  void
  sprinter_number (struct sprinter *sp, int val);
  void
  sprinter_string (struct sprinter *sp, const char *val);
  void
  sprinter_bool (struct sprinter *sp, notmuch_bool_t val);
  
  and that's it.  This would also subsume your format_attribute_*
  helpers.
  
  Unfortunately, it's a pain to pass things like a structure printer
  object around formatters (too bad notmuch isn't written in C++, eh?).
  I think it's better to address this than to structure around it.
  Probably the simplest thing to do is to make a struct for formatter
  state and pass that in to the callbacks.  You could also more
  completely emulate classes, but that would probably be overkill for
  this.
 
 I believe this approach is similar to the one I've implemented (though
 probably higher level, not so many details explicitly written into the
 formatting code). I would suggest trying to get any sort of structured

*nods* It was definitely inspired by yours.  The complexity has to go
somewhere, and in the long run it seems it would be better to isolate
it in the printer than to deal with it in every caller.  My ulterior
motive was to maintain roughly the existing and extensible callback
structure while eliminating the need for the various start/sep/end
fields, which would have to differ between the formats and would
otherwise duplicate knowledge that should be isolated to a structure
printer.

 formatters (whether more like your suggestions or like the thing I
 implemented doesn't 

Re: [PATCH] rewriting notmuch-search for structured output to make other output formats easier

2012-01-21 Thread Jameson Graef Rollins
On Sun, 22 Jan 2012 00:21:37 +0100, Peter Feigl cra...@gmx.net wrote:
 What kind of documentation should I include?

Update the man page to describe the new format and command line options.

 The test suite should work fine, *if* it compares EXPECTED and OUTPUT
 not character-by-character, but rather by pretty-printing both the
 expected and the actual outputs by some JSON pretty-printer (like python
 -mjson.tool). I can of course provide additional test-cases for
 --format=sexp.

I was referring specifically to new tests for the new output format.
The test suite changes should include only additions, since as you point
out, the internal restructuring shouldn't affect any existing tests.

 How should I proceed on this? Re-submit the patch with the sexp-support
 removed and only JSON updated?

I think you should primarily work on addressing Austin's issues
regarding the output formatters first, being careful to try to make more
small atomic patches.  Then once that's done make a new series of
patches, depending on the new formatter patches, that adds the new
output format.

As Austin points out, more smaller patches that are narrowly focused are
much easier to review, even if there ends up being more changes in the
end.

jamie.


pgpZxp92jyVNV.pgp
Description: PGP signature
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch