Robert Jacques Wrote:

> On Mon, 13 Sep 2010 14:30:10 -0400, Sean Kelly wrote:
> 
> >
> > Could all this sit atop a SAX-style API?  I'm not likely to ever use an  
> > API that requires memory allocation for parsing or writing data.
> 
> Well, writing data could be done using output ranges easily enough, so no  
> extra memory writing troubles there. As for parsing, the biggest cost with  
> JSON is the fact that all strings can include escape chars, so things  
> have to be copied instead of sliced.

What I've always done is to not unescape string data automatically, but rather 
provide a function for the user to do it so they can supply the buffer.  
Alternatively, this behavior could be configurable.  Escaping output should 
definitely be configurable though.  In fact, I often don't even want numbers to 
be automatically converted from their string form to a real/int representation, 
since it's common for me to want to operate on the value as a string.  So even 
here I prefer being given the original representation and calling to!int or 
whatever on my own.
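
To make the idea concrete, here is a minimal sketch of a caller-buffered 
unescape routine.  The name unescapeInto and its signature are made up for 
illustration, it only handles the single-character escapes (no \uXXXX), and it 
assumes the buffer is at least as long as the input:

```d
import std.conv : to;

/// Hypothetical helper (not from any existing library): copies `src` into
/// `buf`, decoding the single-character JSON escapes along the way.
/// \uXXXX sequences are deliberately left out to keep the sketch short.
/// Assumes buf.length >= src.length.  Returns the written part of `buf`.
char[] unescapeInto(const(char)[] src, char[] buf)
{
    size_t n = 0;
    for (size_t i = 0; i < src.length; ++i)
    {
        char c = src[i];
        if (c == '\\' && i + 1 < src.length)
        {
            ++i; // look at the character after the backslash
            switch (src[i])
            {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
                case 'r': c = '\r'; break;
                case 'b': c = '\b'; break;
                case 'f': c = '\f'; break;
                default:  c = src[i]; break; // covers \" \\ and \/
            }
        }
        buf[n++] = c;
    }
    return buf[0 .. n];
}

unittest
{
    char[32] storage; // caller-owned stack buffer, no heap allocation
    auto text = unescapeInto(`a\tb\"c`, storage[]);
    assert(text == "a\tb\"c");

    // Numbers stay as text until the caller decides to convert them.
    const(char)[] raw = "42";
    assert(to!int(raw) == 42);
}
```

Since the write position never overtakes the read position, passing the same 
slice as both arguments should also work for unescaping in place.  The unittest 
block can be run with something like rdmd -main -unittest.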

> However, there's nothing preventing a  
> SAX style implementation in the format itself. Except that JSON has less  
> extra meta-data than XML so SAX becomes less informative. Instead of:
> 
> object start vector
> member x
> number 42.8
> object end vector
> 
> you have something like
> 
> object start
> member vector
> object start
> member x
> number 42.8
> object end
> object end
> 
> For myself, the files are under a mb and random access makes everything  
> much faster to program and debug.

Random access is definitely nice; it's more the performance cost of all those 
allocations that's an issue for me.  What I'm basically looking for is a set of 
events like this:

// Every event receives a mutable slice of the input, so a handler can
// unescape or convert the text in place.
alias void delegate(char[]) ParseEvent;
ParseEvent onObjectEnter, onObjectKey, onObjectLeave;
ParseEvent onArrayEnter, onArrayLeave;
ParseEvent onStringValue, onFloatValue, onIntValue;
ParseEvent onTrueValue, onFalseValue, onNullValue;

With corresponding write events on the output side so if I hooked the parser to 
the writer the data would all flow through and generate output identical to the 
input, formatting notwithstanding (though I'd add the option to write numbers 
as either a string, real, or int).  I like the event parameter being a char[] 
because it allows me to unescape string data in place, etc.
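
As a rough illustration of how such events could be driven, here is a small 
sketch of a parser that slices the input and never copies it.  The EventParser 
struct and all of its private helpers are hypothetical, strings are handed over 
still escaped, number validation is loose, and trailing input after the first 
value is not checked:

```d
import std.ascii : isDigit, isWhite;
import std.exception : enforce;

alias ParseEvent = void delegate(char[]);

/// Hypothetical event-driven parser; a sketch, not a validating JSON parser.
struct EventParser
{
    ParseEvent onObjectEnter, onObjectKey, onObjectLeave;
    ParseEvent onArrayEnter, onArrayLeave;
    ParseEvent onStringValue, onFloatValue, onIntValue;
    ParseEvent onTrueValue, onFalseValue, onNullValue;

    private char[] src;
    private size_t pos;

    void parse(char[] input) { src = input; pos = 0; parseValue(); }

    // Unset handlers are simply skipped.
    private void fire(ParseEvent ev, char[] text) { if (ev !is null) ev(text); }

    private void skipWhite()
    {
        while (pos < src.length && isWhite(src[pos])) ++pos;
    }

    // Returns the raw, still-escaped text between the quotes as a slice.
    private char[] scanString()
    {
        immutable start = ++pos;               // step over the opening quote
        while (pos < src.length && src[pos] != '"')
            pos += (src[pos] == '\\') ? 2 : 1; // skip escaped characters
        enforce(pos < src.length, "unterminated string");
        auto text = src[start .. pos];
        ++pos;                                 // step over the closing quote
        return text;
    }

    private void literal(string word, ParseEvent ev)
    {
        immutable end = pos + word.length;
        enforce(end <= src.length && src[pos .. end] == word, "bad literal");
        fire(ev, src[pos .. end]);
        pos = end;
    }

    private void parseNumber()
    {
        enforce(isDigit(src[pos]) || src[pos] == '-', "unexpected character");
        immutable start = pos;
        bool isFloat = false;
        while (pos < src.length)
        {
            immutable c = src[pos];
            if (c == '.' || c == 'e' || c == 'E') isFloat = true;
            else if (!isDigit(c) && c != '-' && c != '+') break;
            ++pos;
        }
        fire(isFloat ? onFloatValue : onIntValue, src[start .. pos]);
    }

    private void parseContainer(char close, ParseEvent enter, ParseEvent leave,
                                bool keyed)
    {
        fire(enter, src[pos .. pos + 1]);
        ++pos;
        skipWhite();
        while (pos < src.length && src[pos] != close)
        {
            if (keyed)
            {
                enforce(src[pos] == '"', "expected key");
                fire(onObjectKey, scanString());
                skipWhite();
                enforce(pos < src.length && src[pos] == ':', "expected ':'");
                ++pos;
            }
            parseValue();
            skipWhite();
            if (pos < src.length && src[pos] == ',') { ++pos; skipWhite(); }
        }
        enforce(pos < src.length, "unterminated container");
        fire(leave, src[pos .. pos + 1]);
        ++pos;
    }

    private void parseValue()
    {
        skipWhite();
        enforce(pos < src.length, "unexpected end of input");
        switch (src[pos])
        {
            case '{': parseContainer('}', onObjectEnter, onObjectLeave, true); break;
            case '[': parseContainer(']', onArrayEnter, onArrayLeave, false); break;
            case '"': fire(onStringValue, scanString()); break;
            case 't': literal("true", onTrueValue); break;
            case 'f': literal("false", onFalseValue); break;
            case 'n': literal("null", onNullValue); break;
            default:  parseNumber(); break;
        }
    }
}

unittest
{
    string[] log; // the log allocates, but only in this test, not the parser
    EventParser p;
    p.onObjectEnter = (char[] s) { log ~= "object start"; };
    p.onObjectKey   = (char[] s) { log ~= "member " ~ s.idup; };
    p.onFloatValue  = (char[] s) { log ~= "number " ~ s.idup; };
    p.onObjectLeave = (char[] s) { log ~= "object end"; };

    p.parse(`{"vector": {"x": 42.8}}`.dup);
    assert(log == ["object start", "member vector", "object start",
                   "member x", "number 42.8", "object end", "object end"]);
}
```

The event sequence in the test is the same one quoted above for the vector 
example.  The alias is written in the newer "alias Name = Type" form here; it 
declares the same delegate type.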

There are some advanced-mode writer options I'd like as well, like the ability 
to dump a char[] blob directly into the destination string without translation, 
saving and restoring writer state, etc.
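
For the output side, something along these lines is what a writer with a raw 
"blob" option might look like.  This is only a sketch under assumptions: the 
EventWriter name, its methods, and the sink delegate are all invented here, 
strings are written without escaping, and saving/restoring state is left out:

```d
/// Hypothetical event-style writer; every name here is an assumption.
struct EventWriter
{
    /// Receives each output chunk; the writer itself never allocates.
    void delegate(const(char)[]) sink;
    private bool needComma;

    // Emits a comma when the previous item in the container needs one.
    private void separator()
    {
        if (needComma) sink(",");
        needComma = true;
    }

    void objectEnter() { separator(); sink("{"); needComma = false; }
    void objectLeave() { sink("}"); needComma = true; }
    void arrayEnter()  { separator(); sink("["); needComma = false; }
    void arrayLeave()  { sink("]"); needComma = true; }

    void key(const(char)[] k)
    {
        separator();
        sink(`"`); sink(k); sink(`":`);
        needComma = false; // the value that follows takes no comma
    }

    /// Writes the text between quotes as given (no escaping in this sketch).
    void stringValue(const(char)[] v) { separator(); sink(`"`); sink(v); sink(`"`); }

    /// Numbers arrive as text, so int and real forms pass through unchanged.
    void numberValue(const(char)[] v) { separator(); sink(v); }

    /// Dumps pre-formatted JSON straight into the output, untranslated.
    void raw(const(char)[] blob) { separator(); sink(blob); }
}

unittest
{
    char[] result;
    EventWriter w;
    w.sink = (const(char)[] s) { result ~= s; };

    w.objectEnter();
    w.key("vector");
    w.raw(`{"x":42.8}`);
    w.key("n");
    w.numberValue("1");
    w.objectLeave();
    assert(result == `{"vector":{"x":42.8},"n":1}`);
}
```

The sink in the test appends to a GC array for convenience; a fixed buffer or 
an output range could be plugged in instead.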

I don't know if anyone besides myself would find all this useful though.  These 
are just some things I've found necessary for the work I do.
