On Thu, May 24, 2012 at 12:05:13PM -0500, Tim Caswell wrote:
> On Thu, May 24, 2012 at 11:52 AM, Isaac Schlueter <[email protected]> wrote:
>
> > On Wed, May 23, 2012 at 3:00 PM, Mark Hahn <[email protected]> wrote:
> > >> it seems that the work on the client side to do streaming parsing gets
> > >> much harder
> > >
> > > I don't understand? Parsing commas is hard? However you planned on
> > > parsing newlines could parse commas instead.
> >
> > It goes from trivial (because you don't have to inspect the JSON at
> > all) to not-trivial (because you do).
> >
> > JSON can contain commas, but unless it's created using a pretty-indent
> > argument, it can never contain newlines. This means that your parser
> > only has to be aware of a single byte, and can dumbly skip over
> > everything else.
> >
> > I think what we need is a new standard for \n delimited JSON streams.
> > It addresses a slightly different need (since you won't ever parse the
> > whole thing all at once, and it may not even ever end), and requires
> > the sender to not send pretty-formatted JSON, so that each object is
> > guaranteed to be a single line.
> >
> > Actually, I think that's basically the spec:
> >
> > 1. Lines are delimited by \n (0x0A)
> > 2. Each line must be a valid JSON string in UTF-8 encoding.
> >
> > The only thing we're lacking is a mime-type.
>
> application/x-json-stream -> JSON messages (without extra whitespace)
> newline separated.
>
> Several streaming json parsers support this already out of the box. They
> ignore the newlines and know when a json body ends because of the parser
> state. People who don't have a streaming parser, can search for the
> newline instead and do their own de-framing. Either way, it's very trivial
> to parse.
>
> var parser = new StreamingParser({ multivalue: true });
> req.pipe(parser);
> parser.on("message", function (message) {
> // ...
> });
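
For anyone without a streaming parser, the "search for the newline" approach
can be sketched roughly as below. createLineSplitter is a made-up name, not
part of any library, and the sketch assumes the chunks arrive as strings (for
example after req.setEncoding("utf8")), so a multi-byte character is never
split across two chunks.

```javascript
// Hypothetical de-framer: feed it string chunks, get one callback per line.
function createLineSplitter(onMessage) {
  var remainder = "";
  return function write(chunk) {
    var lines = (remainder + chunk).split("\n");
    // The last piece has no terminating \n yet, so hold it for the next chunk.
    remainder = lines.pop();
    lines.forEach(function (line) {
      // Skip empty lines; everything else must be one complete JSON value.
      if (line.length) onMessage(JSON.parse(line));
    });
  };
}
```

Something like req.on("data", createLineSplitter(handler)) would wire it up.
Note that a corrupt line makes JSON.parse throw here; nothing is recovered.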

I'm using one JSON string per line as a somewhat durable file format for a
b-tree:

http://bigeasy.github.com/strata/

The notion here is that the parser does not ignore the newline. A verification
program can break the file into lines, so it can detect a corrupt line and
continue with the next one. A streaming parser that tracks only its own state
would find it non-trivial to continue past a corrupt line, unless it used the
newline as a marker, at which point you might as well use the newline as the
delimiter.
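
A minimal sketch of that kind of verification pass, assuming the whole file
fits in memory as a string. verifyLines is a hypothetical helper for
illustration, not the actual Strata verifier.

```javascript
// Hypothetical verifier: split on \n first, then parse each line on its own,
// so one corrupt line cannot derail the lines that follow it.
function verifyLines(text) {
  var good = [];
  var corrupt = [];
  var lines = text.split("\n");
  // A trailing newline leaves one empty string at the end; drop it.
  if (lines[lines.length - 1] === "") lines.pop();
  lines.forEach(function (line, index) {
    try {
      good.push(JSON.parse(line));
    } catch (error) {
      corrupt.push(index + 1); // record the 1-based line number and move on
    }
  });
  return { good: good, corrupt: corrupt };
}
```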

I've gone and added a checksum field to the end of each line. The field is
either a hexadecimal number or a hyphen for no checksum. The checksum algorithm
can be any algorithm that emits a number.

Basically, checksummed frames of newline-delimited JSON.
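
Roughly, a frame could be written and read like this. Two things here are
assumptions for the sake of the sketch: a single space separates the JSON from
the checksum field, and 32-bit FNV-1a stands in for "any algorithm that emits a
number". The function names are made up as well.

```javascript
// Stand-in checksum: 32-bit FNV-1a over the UTF-8 bytes of the JSON text.
function fnv1a(text) {
  var bytes = Buffer.from(text, "utf8");
  var hash = 0x811c9dc5;
  for (var i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Emit one frame: JSON, a space (assumed separator), hex checksum or "-", \n.
function encodeFrame(value, useChecksum) {
  var json = JSON.stringify(value);
  var checksum = useChecksum ? fnv1a(json).toString(16) : "-";
  return json + " " + checksum + "\n";
}

// Read one frame (a line without its trailing \n). Throws on a bad frame.
function decodeFrame(line) {
  // The checksum field never contains a space, so the last space is the split.
  var split = line.lastIndexOf(" ");
  if (split === -1) throw new Error("missing checksum field");
  var json = line.slice(0, split);
  var checksum = line.slice(split + 1);
  if (checksum !== "-" && checksum !== fnv1a(json).toString(16)) {
    throw new Error("checksum mismatch");
  }
  return JSON.parse(json);
}
```

A verifier would split the file on \n, call decodeFrame on each line, and log
and skip any line that throws.
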
--
Alan Gutierrez - @bigeasy