[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
On Mon, 02 Jul 2012, Austin Clements wrote:
> Quoth myself on Jul 01 at 8:12 pm:
>> Quoth Tomi Ollila on Jul 02 at 1:13 am:
>> > On Sat, Jun 30 2012, Mark Walters wrote:
>> >
>> > > Add newlines between complete threads to make asynchronous parsing
>> > > of the JSON easier.
>> > > ---
>> > >
>> > > notmuch-pick uses the JSON output of notmuch show but, in many cases,
>> > > for many threads. This can take quite a long time when displaying a
>> > > large number of messages (say 20 seconds for the 10,000 messages in
>> > > the notmuch archive). Thus it is desirable to display results
>> > > incrementally in the same way that search currently does.
>> > >
>> > > To make this easier this patch adds newlines between each toplevel
>> > > thread. So the output becomes
>> > >
>> > > [
>> > > thread1
>> > > , thread2
>> > > , thread3
>> > > ...
>> > > , last_thread
>> > > ]
>> > >
>> > > Thus the parser can easily tell if it has enough data to do some more
>> > > parsing.
>> > >
>> > > Obviously, this changes the JSON output. This should not break any
>> > > consumer as the JSON parsers should not mind. However, it does break
>> > > several tests. Obviously, I will fix these but I wanted to check if
>> > > people were basically happy with the change first.
>> >
>> > To provide this feature rather than relying on newlines the parser should
>> > use its state to notice when one thread ends.
>> >
>> > Such a change could be used (privately) for human consumption -- allowing
>> > free change of whitespace during inspection (in a debugging session or so).
>> > Computer software should not rely (or suffer) from any additional
>> > (or lack thereof) whitespace there is...
>> >
>> > ... or at least a really convincing argument for the change needs to
>> > be presented (before "restricting" the json output notmuch spits out).
>>
>> Given a JSON parser that only knows how to parse complete JSON
>> expressions, it's potentially very inefficient to keep attempting to
>> parse something when you don't know if it's complete. The newlines
>> provide an in-band framing so the consumer knows when there's a
>> complete object to be parsed.
>>
>> In effect, this defines a super-protocol of JSON that's compatible
>> with standard JSON, but easy to incrementally parse.
>>
>> That said, just this weekend I implemented JSON-based search with
>> incremental JSON parsing and I took a slightly different approach. I
>> still put framing into the newlines of the search results, but rather
>> than rely on it for correctness, the consumer uses it as an
>> optimization that only hints that a complete JSON expression is
>> probably available. If the expression turns out to be incomplete,
>> that's okay.
>>
>> I considered building a fully-incremental JSON parser that never
>> backtracks by more than a token, which would eliminate even the cost
>> of reparsing, but if we do move to S-expressions (which I think we
>> should), we want to let Emacs' C implementation do as much of the
>> parsing as possible, and the only thing we can do with that is read a
>> complete expression.
>
> Actually, I take that back. While we can't do fast incremental
> S-expression parsing, `parse-partial-sexp' can tell us incrementally
> (and probably very quickly) *if* there's a complete expression ready
> to parse, so we could avoid calling into the parser at all unless it
> would succeed.
>
> I'll try this out in my incremental JSON parser and see how well it
> works.

I have converted pick to use Austin's incremental parser and all works
well, so this seems the way to go. Hence I have marked my original patch
obsolete.

Best wishes

Mark
[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
On Mon, Jul 02 2012, Austin Clements wrote:
> Quoth myself on Jul 01 at 8:12 pm:
>> Quoth Tomi Ollila on Jul 02 at 1:13 am:
>> > On Sat, Jun 30 2012, Mark Walters wrote:
>> >
>> > > Add newlines between complete threads to make asynchronous parsing
>> > > of the JSON easier.
>> > > ---
>> > >
>> > > notmuch-pick uses the JSON output of notmuch show but, in many cases,
>> > > for many threads. This can take quite a long time when displaying a
>> > > large number of messages (say 20 seconds for the 10,000 messages in
>> > > the notmuch archive). Thus it is desirable to display results
>> > > incrementally in the same way that search currently does.
>> > >
>> > > To make this easier this patch adds newlines between each toplevel
>> > > thread. So the output becomes
>> > >
>> > > [
>> > > thread1
>> > > , thread2
>> > > , thread3
>> > > ...
>> > > , last_thread
>> > > ]
>> > >
>> > > Thus the parser can easily tell if it has enough data to do some more
>> > > parsing.
>> > >
>> > > Obviously, this changes the JSON output. This should not break any
>> > > consumer as the JSON parsers should not mind. However, it does break
>> > > several tests. Obviously, I will fix these but I wanted to check if
>> > > people were basically happy with the change first.
>> >
>> > To provide this feature rather than relying on newlines the parser should
>> > use its state to notice when one thread ends.
>> >
>> > Such a change could be used (privately) for human consumption -- allowing
>> > free change of whitespace during inspection (in a debugging session or so).
>> > Computer software should not rely (or suffer) from any additional
>> > (or lack thereof) whitespace there is...
>> >
>> > ... or at least a really convincing argument for the change needs to
>> > be presented (before "restricting" the json output notmuch spits out).
>>
>> Given a JSON parser that only knows how to parse complete JSON
>> expressions, it's potentially very inefficient to keep attempting to
>> parse something when you don't know if it's complete. The newlines
>> provide an in-band framing so the consumer knows when there's a
>> complete object to be parsed.
>>
>> In effect, this defines a super-protocol of JSON that's compatible
>> with standard JSON, but easy to incrementally parse.
>>
>> That said, just this weekend I implemented JSON-based search with
>> incremental JSON parsing and I took a slightly different approach. I
>> still put framing into the newlines of the search results, but rather
>> than rely on it for correctness, the consumer uses it as an
>> optimization that only hints that a complete JSON expression is
>> probably available. If the expression turns out to be incomplete,
>> that's okay.
>>
>> I considered building a fully-incremental JSON parser that never
>> backtracks by more than a token, which would eliminate even the cost
>> of reparsing, but if we do move to S-expressions (which I think we
>> should), we want to let Emacs' C implementation do as much of the
>> parsing as possible, and the only thing we can do with that is read a
>> complete expression.
>
> Actually, I take that back. While we can't do fast incremental
> S-expression parsing, `parse-partial-sexp' can tell us incrementally
> (and probably very quickly) *if* there's a complete expression ready
> to parse, so we could avoid calling into the parser at all unless it
> would succeed.
>
> I'll try this out in my incremental JSON parser and see how well it
> works.

I played a bit with parse-partial-sexp (and sexp-at-point) and it looks
like this really could work.

IMO the things to be done could be

1) add --format=sexp support to notmuch cli

2) add tests for that (I can do some tests)

3) review those patches (I will definitely be one reviewer)

4) convert emacs to use --format=sexp everywhere it now uses
   --format=json, (is it then basically s/(json-read)/(sexp-at-point)/ ?)
   and adjust tests.

5) review those patches (I will definitely be one reviewer)

6) add support to command lines like
   'notmuch search --output=sexp --sort=oldest-first tag:unread' ...
   (even I can do that) (of course --output=json would work too as expected)

7) review...

8) convert emacs notmuch search to use that syntax and incrementally
   display progress.

9) review...

This means that we would stop adding new features using json output in
emacs mua and concentrate on using sexps wherever applicable.

(during this we should be able to determine whether those framing newlines
are needed or not)

Tomi

PS: A week ago I also did some experiments on how notmuch cli could spit
sexp output using the current json formatters -- just naively copying those
and modifying them would result in something like 99% copy-paste and 1%
changes. I got some initial thoughts on how things could be done but luckily
Peter & Austin are ahead of me -- I now just eagerly wait for patches to be
reviewed :D
[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
On Sat, Jun 30 2012, Mark Walters wrote:
> Add newlines between complete threads to make asynchronous parsing
> of the JSON easier.
> ---
>
> notmuch-pick uses the JSON output of notmuch show but, in many cases,
> for many threads. This can take quite a long time when displaying a
> large number of messages (say 20 seconds for the 10,000 messages in
> the notmuch archive). Thus it is desirable to display results
> incrementally in the same way that search currently does.
>
> To make this easier this patch adds newlines between each toplevel
> thread. So the output becomes
>
> [
> thread1
> , thread2
> , thread3
> ...
> , last_thread
> ]
>
> Thus the parser can easily tell if it has enough data to do some more
> parsing.
>
> Obviously, this changes the JSON output. This should not break any
> consumer as the JSON parsers should not mind. However, it does break
> several tests. Obviously, I will fix these but I wanted to check if
> people were basically happy with the change first.

To provide this feature rather than relying on newlines the parser should
use its state to notice when one thread ends.

Such a change could be used (privately) for human consumption -- allowing
free change of whitespace during inspection (in a debugging session or so).
Computer software should not rely (or suffer) from any additional
(or lack thereof) whitespace there is...

... or at least a really convincing argument for the change needs to
be presented (before "restricting" the json output notmuch spits out).

Btw: AFAIC (json-read) parses the whole json object (ignoring whitespace,
including newlines outside strings). So I guess notmuch-pick uses something
slightly different (probably using json.el subroutines)..

Btw2: I'm very interested to see notmuch-pick in action -- I just don't
see this as a way to do this particular support properly.

Btw3: if search is ever going to use json we'll face the same problem --
unless writing each line as a separate json object (and starting to use
s-expressions for speed)

> Also, should devel/schemata be updated? It seems a little unclear as
> this is not really a "JSON" change as the JSON does not care about the
> newlines.
>
> Best wishes

and best luck with your notmuch-pick work.

>
> Mark

Tomi

>
>
>  notmuch-show.c | 5 +
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/notmuch-show.c b/notmuch-show.c
> index 195e318..4a1d699 100644
> --- a/notmuch-show.c
> +++ b/notmuch-show.c
> @@ -942,6 +942,8 @@ do_show (void *ctx,
>
>      if (format->message_set_start)
>          fputs (format->message_set_start, stdout);
> +    if (format == &format_json)
> +        fputs ("\n", stdout);
>
>      for (threads = notmuch_query_search_threads (query);
>           notmuch_threads_valid (threads);
> @@ -963,6 +965,9 @@ do_show (void *ctx,
>          if (status && !res)
>              res = status;
>
> +        if (format == &format_json)
> +            fputs ("\n", stdout);
> +
>          notmuch_thread_destroy (thread);
>
>      }
> --
> 1.7.9.1
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
Quoth myself on Jul 01 at 8:12 pm:
> Quoth Tomi Ollila on Jul 02 at 1:13 am:
> > On Sat, Jun 30 2012, Mark Walters wrote:
> >
> > > Add newlines between complete threads to make asynchronous parsing
> > > of the JSON easier.
> > > ---
> > >
> > > notmuch-pick uses the JSON output of notmuch show but, in many cases,
> > > for many threads. This can take quite a long time when displaying a
> > > large number of messages (say 20 seconds for the 10,000 messages in
> > > the notmuch archive). Thus it is desirable to display results
> > > incrementally in the same way that search currently does.
> > >
> > > To make this easier this patch adds newlines between each toplevel
> > > thread. So the output becomes
> > >
> > > [
> > > thread1
> > > , thread2
> > > , thread3
> > > ...
> > > , last_thread
> > > ]
> > >
> > > Thus the parser can easily tell if it has enough data to do some more
> > > parsing.
> > >
> > > Obviously, this changes the JSON output. This should not break any
> > > consumer as the JSON parsers should not mind. However, it does break
> > > several tests. Obviously, I will fix these but I wanted to check if
> > > people were basically happy with the change first.
> >
> > To provide this feature rather than relying on newlines the parser should
> > use its state to notice when one thread ends.
> >
> > Such a change could be used (privately) for human consumption -- allowing
> > free change of whitespace during inspection (in a debugging session or so).
> > Computer software should not rely (or suffer) from any additional
> > (or lack thereof) whitespace there is...
> >
> > ... or at least a really convincing argument for the change needs to
> > be presented (before "restricting" the json output notmuch spits out).
>
> Given a JSON parser that only knows how to parse complete JSON
> expressions, it's potentially very inefficient to keep attempting to
> parse something when you don't know if it's complete. The newlines
> provide an in-band framing so the consumer knows when there's a
> complete object to be parsed.
>
> In effect, this defines a super-protocol of JSON that's compatible
> with standard JSON, but easy to incrementally parse.
>
> That said, just this weekend I implemented JSON-based search with
> incremental JSON parsing and I took a slightly different approach. I
> still put framing into the newlines of the search results, but rather
> than rely on it for correctness, the consumer uses it as an
> optimization that only hints that a complete JSON expression is
> probably available. If the expression turns out to be incomplete,
> that's okay.
>
> I considered building a fully-incremental JSON parser that never
> backtracks by more than a token, which would eliminate even the cost
> of reparsing, but if we do move to S-expressions (which I think we
> should), we want to let Emacs' C implementation do as much of the
> parsing as possible, and the only thing we can do with that is read a
> complete expression.

Actually, I take that back. While we can't do fast incremental
S-expression parsing, `parse-partial-sexp' can tell us incrementally
(and probably very quickly) *if* there's a complete expression ready
to parse, so we could avoid calling into the parser at all unless it
would succeed.

I'll try this out in my incremental JSON parser and see how well it
works.
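The idea in the message above (ask cheaply whether a complete expression has
arrived before calling the real parser) can be sketched outside Emacs as a
resumable scanner. This is only an analogy to `parse-partial-sexp', not its
implementation and not code from notmuch: the names scan_state and scan_chunk
are hypothetical, and the sketch assumes that only brackets and double-quoted
strings with backslash escapes matter (no Lisp comments or character
literals).

```c
#include <stddef.h>

/* Hypothetical scanner state that survives between chunks of input. */
struct scan_state {
    int depth;      /* current bracket nesting depth */
    int in_string;  /* nonzero while inside a double-quoted string */
    int escaped;    /* nonzero if the previous string byte was a backslash */
};

/* Scan newly arrived bytes exactly once.  Returns the number of bytes of
 * this chunk consumed up to and including the bracket that closes a
 * top-level expression, or 0 if no expression completed in this chunk
 * (the state is kept so the next chunk resumes where this one stopped). */
size_t scan_chunk(struct scan_state *st, const char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        char c = data[i];
        if (st->in_string) {
            if (st->escaped)
                st->escaped = 0;
            else if (c == '\\')
                st->escaped = 1;
            else if (c == '"')
                st->in_string = 0;
            continue;
        }
        switch (c) {
        case '"':
            st->in_string = 1;
            break;
        case '[': case '{': case '(':
            st->depth++;
            break;
        case ']': case '}': case ')':
            if (st->depth > 0 && --st->depth == 0)
                return i + 1;
            break;
        default:
            break;
        }
    }
    return 0;
}
```

Because the state is carried across calls, no byte is examined twice, so the
full parser only needs to run once scan_chunk reports a complete expression.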
[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
On Sun, 01 Jul 2012, Tomi Ollila wrote:
> On Sat, Jun 30 2012, Mark Walters wrote:
>
>> Add newlines between complete threads to make asynchronous parsing
>> of the JSON easier.
>> ---
>>
>> notmuch-pick uses the JSON output of notmuch show but, in many cases,
>> for many threads. This can take quite a long time when displaying a
>> large number of messages (say 20 seconds for the 10,000 messages in
>> the notmuch archive). Thus it is desirable to display results
>> incrementally in the same way that search currently does.
>>
>> To make this easier this patch adds newlines between each toplevel
>> thread. So the output becomes
>>
>> [
>> thread1
>> , thread2
>> , thread3
>> ...
>> , last_thread
>> ]
>>
>> Thus the parser can easily tell if it has enough data to do some more
>> parsing.
>>
>> Obviously, this changes the JSON output. This should not break any
>> consumer as the JSON parsers should not mind. However, it does break
>> several tests. Obviously, I will fix these but I wanted to check if
>> people were basically happy with the change first.
>
> To provide this feature rather than relying on newlines the parser should
> use its state to notice when one thread ends.
>
> Such a change could be used (privately) for human consumption -- allowing
> free change of whitespace during inspection (in a debugging session or so).
> Computer software should not rely (or suffer) from any additional
> (or lack thereof) whitespace there is...
>
> ... or at least a really convincing argument for the change needs to
> be presented (before "restricting" the json output notmuch spits out).
>
> Btw: AFAIC (json-read) parses the whole json object (ignoring whitespace,
> including newlines outside strings). So I guess notmuch-pick uses something
> slightly different (probably using json.el subroutines)..

I was following Austin's suggestion (on irc and
id:"20120214152114.GQ27039 at mit.edu"). The idea is that each thread in
the JSON output is an entire JSON object. Thus pick skips the first [ and
then waits until there are two \n's in the incoming stream. Then it knows
that the complete first thread has been received and it parses that with
json-read as normal.

The important thing is that it is trivial to tell when a complete (and so
parsable) JSON object has arrived.

It seems to work, but I am definitely open to other approaches.

> Btw2: I'm very interested to see notmuch-pick in action -- I just don't
> see this as a way to do this particular support properly.
>
> Btw3: if search is ever going to use json we'll face the same problem --
> unless writing each line as a separate json object (and starting to use
> s-expressions for speed)
>
>> Also, should devel/schemata be updated? It seems a little unclear as
>> this is not really a "JSON" change as the JSON does not care about the
>> newlines.
>>
>> Best wishes
> and best luck with your notmuch-pick work.

Thanks!

Mark

>>
>>  notmuch-show.c | 5 +
>>  1 files changed, 5 insertions(+), 0 deletions(-)
>>
>> diff --git a/notmuch-show.c b/notmuch-show.c
>> index 195e318..4a1d699 100644
>> --- a/notmuch-show.c
>> +++ b/notmuch-show.c
>> @@ -942,6 +942,8 @@ do_show (void *ctx,
>>
>>      if (format->message_set_start)
>>          fputs (format->message_set_start, stdout);
>> +    if (format == &format_json)
>> +        fputs ("\n", stdout);
>>
>>      for (threads = notmuch_query_search_threads (query);
>>           notmuch_threads_valid (threads);
>> @@ -963,6 +965,9 @@ do_show (void *ctx,
>>          if (status && !res)
>>              res = status;
>>
>> +        if (format == &format_json)
>> +            fputs ("\n", stdout);
>> +
>>          notmuch_thread_destroy (thread);
>>
>>      }
>> --
>> 1.7.9.1
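The consumer side described above runs in Emacs Lisp with json-read; the C
sketch below only illustrates the framing logic itself, under the assumption
that a thread's JSON never contains a raw newline (so every thread occupies
exactly one line of the framed output). The function name next_thread and its
return codes are hypothetical, not part of notmuch or notmuch-pick.

```c
#include <stddef.h>
#include <string.h>

/* Look for the next complete, newline-terminated thread in buf[*pos..len).
 * Returns  1 and sets *thread / *thread_len when a full thread line is found,
 *          0 when more input is needed (*pos is left at the partial line),
 *         -1 when the closing "]" of the outer array is reached.
 * *pos is advanced past every line that was consumed. */
int next_thread(const char *buf, size_t len, size_t *pos,
                const char **thread, size_t *thread_len)
{
    for (;;) {
        const char *s = buf + *pos;
        const char *nl = memchr(s, '\n', len - *pos);
        if (nl == NULL)
            return 0;                     /* line not complete yet */

        size_t n = (size_t)(nl - s);
        *pos = (size_t)(nl - buf) + 1;    /* consume the line and its newline */

        /* Drop the ", " separator that precedes every thread but the first. */
        while (n > 0 && (*s == ',' || *s == ' ')) {
            s++;
            n--;
        }
        if (n == 0 || (n == 1 && *s == '['))
            continue;                     /* blank line or the opening bracket */
        if (n == 1 && *s == ']')
            return -1;                    /* end of the outer array */

        *thread = s;
        *thread_len = n;
        return 1;
    }
}
```

Each slice returned this way could then be handed to a parser that only
accepts complete expressions, which is the property the framing is meant to
guarantee.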
[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
Quoth Tomi Ollila on Jul 02 at 1:13 am:
> On Sat, Jun 30 2012, Mark Walters wrote:
>
> > Add newlines between complete threads to make asynchronous parsing
> > of the JSON easier.
> > ---
> >
> > notmuch-pick uses the JSON output of notmuch show but, in many cases,
> > for many threads. This can take quite a long time when displaying a
> > large number of messages (say 20 seconds for the 10,000 messages in
> > the notmuch archive). Thus it is desirable to display results
> > incrementally in the same way that search currently does.
> >
> > To make this easier this patch adds newlines between each toplevel
> > thread. So the output becomes
> >
> > [
> > thread1
> > , thread2
> > , thread3
> > ...
> > , last_thread
> > ]
> >
> > Thus the parser can easily tell if it has enough data to do some more
> > parsing.
> >
> > Obviously, this changes the JSON output. This should not break any
> > consumer as the JSON parsers should not mind. However, it does break
> > several tests. Obviously, I will fix these but I wanted to check if
> > people were basically happy with the change first.
>
> To provide this feature rather than relying on newlines the parser should
> use its state to notice when one thread ends.
>
> Such a change could be used (privately) for human consumption -- allowing
> free change of whitespace during inspection (in a debugging session or so).
> Computer software should not rely (or suffer) from any additional
> (or lack thereof) whitespace there is...
>
> ... or at least a really convincing argument for the change needs to
> be presented (before "restricting" the json output notmuch spits out).

Given a JSON parser that only knows how to parse complete JSON
expressions, it's potentially very inefficient to keep attempting to
parse something when you don't know if it's complete. The newlines
provide an in-band framing so the consumer knows when there's a
complete object to be parsed.

In effect, this defines a super-protocol of JSON that's compatible
with standard JSON, but easy to incrementally parse.

That said, just this weekend I implemented JSON-based search with
incremental JSON parsing and I took a slightly different approach. I
still put framing into the newlines of the search results, but rather
than rely on it for correctness, the consumer uses it as an
optimization that only hints that a complete JSON expression is
probably available. If the expression turns out to be incomplete,
that's okay.

I considered building a fully-incremental JSON parser that never
backtracks by more than a token, which would eliminate even the cost
of reparsing, but if we do move to S-expressions (which I think we
should), we want to let Emacs' C implementation do as much of the
parsing as possible, and the only thing we can do with that is read a
complete expression.

> Btw: AFAIC (json-read) parses the whole json object (ignoring whitespace,
> including newlines outside strings). So I guess notmuch-pick uses something
> slightly different (probably using json.el subroutines)..
>
> Btw2: I'm very interested to see notmuch-pick in action -- I just don't
> see this as a way to do this particular support properly.
>
> Btw3: if search is ever going to use json we'll face the same problem --
> unless writing each line as a separate json object (and starting to use
> s-expressions for speed)

Done. I'll post the patches after a little more cleanup.

> > Also, should devel/schemata be updated? It seems a little unclear as
> > this is not really a "JSON" change as the JSON does not care about the
> > newlines.
> >
> > Best wishes
>
> and best luck with your notmuch-pick work.
>
> >
> > Mark
>
> Tomi
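For readers who want to see the "newline as a hint" idea in code, here is a minimal sketch. It is written in Python, not the Emacs Lisp this thread is about, and it is not Austin's parser: the class name `HintedReader` and its `feed` method are made up for illustration. It assumes every top-level element is an array or object, so a truncated prefix can never parse as a complete value.

```python
import json


class HintedReader:
    """Hypothetical consumer: a newline only *suggests* that parsing is worthwhile."""

    def __init__(self):
        self._buf = ""
        self._started = False  # True once the outer '[' has been consumed
        self._decoder = json.JSONDecoder()

    def feed(self, chunk):
        """Add a chunk of output; return any top-level elements completed so far."""
        self._buf += chunk
        if "\n" not in chunk:
            return []  # no hint arrived, so do not spend time on a parse attempt
        results = []
        while True:
            # Skip the whitespace and separators that sit between elements.
            text = self._buf.lstrip(" \n,")
            if not self._started:
                if not text.startswith("["):
                    break
                text = text[1:].lstrip(" \n,")
                self._started = True
            if not text or text.startswith("]"):
                self._buf = text
                break
            try:
                value, end = self._decoder.raw_decode(text)
            except json.JSONDecodeError:
                # The hint was wrong (expression incomplete). That is fine:
                # keep the data and try again at the next newline.
                self._buf = text
                break
            results.append(value)
            self._buf = text[end:]
        return results
```

Because the newline is never trusted for correctness, a stray newline in the middle of a thread costs one failed parse attempt and nothing else.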
Re: [PATCH] cli: notmuch-show with framing newlines between threads in JSON.
On Sun, 01 Jul 2012, Tomi Ollila wrote:
> On Sat, Jun 30 2012, Mark Walters wrote:
>
>> Add newlines between complete threads to make asynchronous parsing
>> of the JSON easier.
>> ---
>>
>> notmuch-pick uses the JSON output of notmuch show but, in many cases,
>> for many threads. This can take quite a long time when displaying a
>> large number of messages (say 20 seconds for the 10,000 messages in
>> the notmuch archive). Thus it is desirable to display results
>> incrementally in the same way that search currently does.
>>
>> To make this easier this patch adds newlines between each toplevel
>> thread. So the output becomes
>>
>> [
>> thread1
>> , thread2
>> , thread3
>> ...
>> , last_thread
>> ]
>>
>> Thus the parser can easily tell if it has enough data to do some more
>> parsing.
>>
>> Obviously, this changes the JSON output. This should not break any
>> consumer as the JSON parsers should not mind. However, it does break
>> several tests. Obviously, I will fix these but I wanted to check if
>> people were basically happy with the change first.
>
> To provide this feature rather than relying on newlines the parser should
> use its state to notice when one thread ends.
>
> Such a change could be used (privately) for human consumption -- allowing
> free change of whitespace during inspection (in a debugging session or so).
> Computer software should not rely (or suffer) from any additional
> (or lack thereof) whitespace there is...
>
> ... or at least a really convincing argument for the change needs to
> be presented (before "restricting" the json output notmuch spits out).
>
> Btw: AFAIC (json-read) parses the whole json object (ignoring whitespace,
> including newlines outside strings). So I guess notmuch-pick uses something
> slightly different (probably using json.el subroutines)..

I was following Austin's suggestion (on irc and
id:"20120214152114.gq27...@mit.edu"). The idea is that each thread in
the JSON output is an entire JSON object. Thus pick skips the first [
and then waits until there are two \n's in the incoming stream. Then it
knows that the complete first thread has been received and it parses
that with json-read as normal.

The important thing is that it is trivial to tell when a complete (and
so parsable) JSON object has arrived.

It seems to work, but I am definitely open to other approaches.

> Btw2: I'm very interested to see notmuch-pick in action -- I just don't
> see this as a way to do this particular support properly.
>
> Btw3: if search is ever going to use json we'll face the same problem --
> unless writing each line as a separate json object (and starting to use
> s-expressions for speed)
>
>> Also, should devel/schemata be updated? It seems a little unclear as
>> this is not really a "JSON" change as the JSON does not care about the
>> newlines.
>>
>> Best wishes
> and best luck with your notmuch-pick work.

Thanks!

Mark

>>
>>  notmuch-show.c |    5 +++++
>>  1 files changed, 5 insertions(+), 0 deletions(-)
>>
>> diff --git a/notmuch-show.c b/notmuch-show.c
>> index 195e318..4a1d699 100644
>> --- a/notmuch-show.c
>> +++ b/notmuch-show.c
>> @@ -942,6 +942,8 @@ do_show (void *ctx,
>>
>>      if (format->message_set_start)
>>          fputs (format->message_set_start, stdout);
>> +    if (format == &format_json)
>> +        fputs ("\n", stdout);
>>
>>      for (threads = notmuch_query_search_threads (query);
>>           notmuch_threads_valid (threads);
>> @@ -963,6 +965,9 @@ do_show (void *ctx,
>>          if (status && !res)
>>              res = status;
>>
>> +        if (format == &format_json)
>> +            fputs ("\n", stdout);
>> +
>>          notmuch_thread_destroy (thread);
>>
>>      }
>> --
>> 1.7.9.1
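The strict variant Mark describes (skip the opening bracket, then treat every newline as the end of one complete thread) might look roughly like this. It is a Python sketch rather than the notmuch-pick Emacs Lisp, and `FramedThreadReader` is an invented name. It relies on the assumption that a serialized thread never contains a raw newline, which is exactly the dependence on whitespace that Tomi objects to.

```python
import json


class FramedThreadReader:
    """Hypothetical consumer that relies on one newline after '[' and after each thread."""

    def __init__(self):
        self._buf = ""
        self._seen_open = False  # True once the line holding '[' has gone by

    def feed(self, chunk):
        """Add a chunk of output; return the threads completed by this chunk."""
        self._buf += chunk
        threads = []
        while "\n" in self._buf:
            line, self._buf = self._buf.split("\n", 1)
            if not self._seen_open:
                # The first newline only terminates the opening bracket.
                self._seen_open = True
                continue
            # Drop the ", " separator that precedes every thread but the first.
            text = line.lstrip(", ").strip()
            if text:
                threads.append(json.loads(text))
        return threads
```

The closing `]` is never followed by a newline in this layout, so it simply stays in the buffer and is ignored.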
Re: [PATCH] cli: notmuch-show with framing newlines between threads in JSON.
On Sat, Jun 30 2012, Mark Walters wrote:

> Add newlines between complete threads to make asynchronous parsing
> of the JSON easier.
> ---
>
> notmuch-pick uses the JSON output of notmuch show but, in many cases,
> for many threads. This can take quite a long time when displaying a
> large number of messages (say 20 seconds for the 10,000 messages in
> the notmuch archive). Thus it is desirable to display results
> incrementally in the same way that search currently does.
>
> To make this easier this patch adds newlines between each toplevel
> thread. So the output becomes
>
> [
> thread1
> , thread2
> , thread3
> ...
> , last_thread
> ]
>
> Thus the parser can easily tell if it has enough data to do some more
> parsing.
>
> Obviously, this changes the JSON output. This should not break any
> consumer as the JSON parsers should not mind. However, it does break
> several tests. Obviously, I will fix these but I wanted to check if
> people were basically happy with the change first.

To provide this feature rather than relying on newlines the parser should
use its state to notice when one thread ends.

Such a change could be used (privately) for human consumption -- allowing
free change of whitespace during inspection (in a debugging session or so).
Computer software should not rely (or suffer) from any additional
(or lack thereof) whitespace there is...

... or at least a really convincing argument for the change needs to
be presented (before "restricting" the json output notmuch spits out).

Btw: AFAIC (json-read) parses the whole json object (ignoring whitespace,
including newlines outside strings). So I guess notmuch-pick uses something
slightly different (probably using json.el subroutines)..

Btw2: I'm very interested to see notmuch-pick in action -- I just don't
see this as a way to do this particular support properly.

Btw3: if search is ever going to use json we'll face the same problem --
unless writing each line as a separate json object (and starting to use
s-expressions for speed)

> Also, should devel/schemata be updated? It seems a little unclear as
> this is not really a "JSON" change as the JSON does not care about the
> newlines.
>
> Best wishes

and best luck with your notmuch-pick work.

>
> Mark

Tomi

>
>
>  notmuch-show.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/notmuch-show.c b/notmuch-show.c
> index 195e318..4a1d699 100644
> --- a/notmuch-show.c
> +++ b/notmuch-show.c
> @@ -942,6 +942,8 @@ do_show (void *ctx,
>
>      if (format->message_set_start)
>          fputs (format->message_set_start, stdout);
> +    if (format == &format_json)
> +        fputs ("\n", stdout);
>
>      for (threads = notmuch_query_search_threads (query);
>           notmuch_threads_valid (threads);
> @@ -963,6 +965,9 @@ do_show (void *ctx,
>          if (status && !res)
>              res = status;
>
> +        if (format == &format_json)
> +            fputs ("\n", stdout);
> +
>          notmuch_thread_destroy (thread);
>
>      }
> --
> 1.7.9.1
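One way to read the suggestion that the parser "use its state to notice when one thread ends" is a scanner that tracks bracket depth and string state, and so needs no help from whitespace. The sketch below is in Python with an invented class name, `ThreadBoundaryScanner`; it is an illustration of the idea, not anything json.el provides, and it assumes each top-level element is an array or object.

```python
import json


class ThreadBoundaryScanner:
    """Hypothetical scanner: find top-level element boundaries by nesting depth alone."""

    def __init__(self):
        self.depth = 0          # 1 means "inside the outer array"
        self.in_string = False  # inside a JSON string, where brackets do not count
        self.escaped = False    # previous character was a backslash in a string
        self._cur = []          # characters of the element being collected

    def feed(self, chunk):
        """Add a chunk; return the text of every element completed by it."""
        done = []
        for ch in chunk:
            if self.in_string:
                self._cur.append(ch)
                if self.escaped:
                    self.escaped = False
                elif ch == "\\":
                    self.escaped = True
                elif ch == '"':
                    self.in_string = False
                continue
            if ch in "[{":
                self.depth += 1
                if self.depth >= 2:
                    self._cur.append(ch)
            elif ch in "]}":
                if self.depth >= 2:
                    self._cur.append(ch)
                self.depth -= 1
                if self.depth == 1:
                    # Back at the outer array: one element is complete.
                    done.append("".join(self._cur))
                    self._cur = []
            elif self.depth >= 2:
                if ch == '"':
                    self.in_string = True
                self._cur.append(ch)
        return done


# Each returned string is a complete expression, ready for a whole-value parser.
scanner = ThreadBoundaryScanner()
threads = [json.loads(t) for t in scanner.feed('[[{"id": "a"}],[{"id": "b"}]]')]
```

The cost is that every character is inspected once in the consumer, which is the kind of work the thread hopes to leave to a C-level reader.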
[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
Hi Mark.

Mark Walters writes:

> Add newlines between complete threads to make asynchronous parsing
> of the JSON easier.
> ---
>
> notmuch-pick uses the JSON output of notmuch show but, in many cases,
> for many threads. This can take quite a long time when displaying a
> large number of messages (say 20 seconds for the 10,000 messages in
> the notmuch archive). Thus it is desirable to display results
> incrementally in the same way that search currently does.
>
> To make this easier this patch adds newlines between each toplevel
> thread. So the output becomes
>
> [
> thread1
> , thread2
> , thread3
> ...
> , last_thread
> ]
>
> Thus the parser can easily tell if it has enough data to do some more
> parsing.
>
> Obviously, this changes the JSON output. This should not break any
> consumer as the JSON parsers should not mind. However, it does break
> several tests.

I think this should be part of the commit message since it explains the
rationale behind the change.

Regards,
  Dmitry

> Obviously, I will fix these but I wanted to check if
> people were basically happy with the change first.
>
> Also, should devel/schemata be updated? It seems a little unclear as
> this is not really a "JSON" change as the JSON does not care about the
> newlines.
>
> Best wishes
>
> Mark
>
>
>  notmuch-show.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/notmuch-show.c b/notmuch-show.c
> index 195e318..4a1d699 100644
> --- a/notmuch-show.c
> +++ b/notmuch-show.c
> @@ -942,6 +942,8 @@ do_show (void *ctx,
>
>      if (format->message_set_start)
>          fputs (format->message_set_start, stdout);
> +    if (format == &format_json)
> +        fputs ("\n", stdout);
>
>      for (threads = notmuch_query_search_threads (query);
>           notmuch_threads_valid (threads);
> @@ -963,6 +965,9 @@ do_show (void *ctx,
>          if (status && !res)
>              res = status;
>
> +        if (format == &format_json)
> +            fputs ("\n", stdout);
> +
>          notmuch_thread_destroy (thread);
>
>      }
> --
> 1.7.9.1
[PATCH] cli: notmuch-show with framing newlines between threads in JSON.
Add newlines between complete threads to make asynchronous parsing
of the JSON easier.
---

notmuch-pick uses the JSON output of notmuch show but, in many cases,
for many threads. This can take quite a long time when displaying a
large number of messages (say 20 seconds for the 10,000 messages in
the notmuch archive). Thus it is desirable to display results
incrementally in the same way that search currently does.

To make this easier this patch adds newlines between each toplevel
thread. So the output becomes

[
thread1
, thread2
, thread3
...
, last_thread
]

Thus the parser can easily tell if it has enough data to do some more
parsing.

Obviously, this changes the JSON output. This should not break any
consumer as the JSON parsers should not mind. However, it does break
several tests. Obviously, I will fix these but I wanted to check if
people were basically happy with the change first.

Also, should devel/schemata be updated? It seems a little unclear as
this is not really a "JSON" change as the JSON does not care about the
newlines.

Best wishes

Mark


 notmuch-show.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/notmuch-show.c b/notmuch-show.c
index 195e318..4a1d699 100644
--- a/notmuch-show.c
+++ b/notmuch-show.c
@@ -942,6 +942,8 @@ do_show (void *ctx,

     if (format->message_set_start)
         fputs (format->message_set_start, stdout);
+    if (format == &format_json)
+        fputs ("\n", stdout);

     for (threads = notmuch_query_search_threads (query);
          notmuch_threads_valid (threads);
@@ -963,6 +965,9 @@ do_show (void *ctx,
         if (status && !res)
             res = status;

+        if (format == &format_json)
+            fputs ("\n", stdout);
+
         notmuch_thread_destroy (thread);

     }
--
1.7.9.1
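To make the resulting layout concrete, here is a small Python model of what the patched loop would emit, together with a check that an ordinary JSON parser still accepts it. `render_threads` is a made-up helper, not part of notmuch, and the ", " separator is taken from the example layout in the message rather than from the notmuch source.

```python
import json


def render_threads(threads):
    """Hypothetical model of the patched output: '[', newline, then one thread per line."""
    out = ["[", "\n"]                 # message_set_start followed by the first added newline
    for index, thread in enumerate(threads):
        if index:
            out.append(", ")          # separator assumed from the layout shown above
        out.append(json.dumps(thread))  # json.dumps emits no raw newlines by default
        out.append("\n")              # the second added newline, after each thread
    out.append("]")
    return "".join(out)


text = render_threads([[{"id": "a"}], [{"id": "b"}]])
parsed = json.loads(text)  # the extra newlines are insignificant whitespace to a JSON parser
```

This is the sense in which the change is not really a "JSON" change: the parsed value is identical, and only consumers that look at the raw bytes can see the framing.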