On Tue, 14 Sep 2010 11:18:14 -0400, Sean Kelly <[email protected]>
wrote:
Robert Jacques Wrote:
On Mon, 13 Sep 2010 14:30:10 -0400, Sean Kelly <[email protected]>
wrote:
>
> Could all this sit atop a SAX-style API? I'm not likely to ever use an
> API that requires memory allocation for parsing or writing data.
Well, writing data could be done using output ranges easily enough, so
there are no extra memory troubles on the writing side. As for parsing,
the biggest cost with JSON is the fact that all strings can include
escape chars, so things have to be copied instead of sliced.
What I've always done is to not automatically unescape string data but
rather provide a function for the user to do it so they can provide the
buffer. Alternately, this behavior could be configurable. Escaping
output should definitely be configurable though. In fact, I often don't
even want numbers to be automatically converted from their string to
real/int representation, since it's common for me to want to operate on
the value as a string. So even here I prefer being given the original
representation and calling to!int or whatever on my own.
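A minimal D sketch of that approach, assuming the parser hands back raw slices of the input. `unescapeInto` is a hypothetical helper, not part of Phobos or any existing JSON library, and it leaves \uXXXX sequences untouched to stay short:

```d
import std.conv : to;

/// Hypothetical helper: copies `src` into `buf`, resolving the simple
/// two-character JSON escapes, and returns the filled part of `buf`.
/// \uXXXX sequences are copied through unchanged in this sketch.
char[] unescapeInto(const(char)[] src, char[] buf)
{
    assert(buf.length >= src.length, "output is never longer than input");
    size_t w = 0;
    for (size_t r = 0; r < src.length; ++r)
    {
        char c = src[r];
        if (c == '\\' && r + 1 < src.length && src[r + 1] != 'u')
        {
            ++r; // skip the backslash and translate the next character
            switch (src[r])
            {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
                case 'r': c = '\r'; break;
                case 'b': c = '\b'; break;
                case 'f': c = '\f'; break;
                default:  c = src[r]; break; // covers \" \\ and \/
            }
        }
        buf[w++] = c;
    }
    return buf[0 .. w];
}

void main()
{
    // Raw slices as a non-allocating parser might hand them back.
    const(char)[] rawString = `line one\nline \"two\"`;
    const(char)[] rawNumber = "42";

    char[64] storage; // caller-owned buffer, no GC allocation
    auto text = unescapeInto(rawString, storage[]);
    assert(text == "line one\nline \"two\"");

    // Conversion happens only when the caller asks for it.
    assert(rawNumber.to!int == 42);
}
```

Since the write index never passes the read index, the same function should also work when `buf` is the (mutable) source array itself.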
However, there's nothing preventing a SAX-style implementation in the
format itself. Except that JSON has less extra metadata than XML, so SAX
becomes less informative. Instead of:
    object start vector
        member x
            number 42.8
    object end vector
you have something like
    object start
        member vector
            object start
                member x
                    number 42.8
            object end
    object end
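To make the difference concrete: with the JSON event stream the consumer has to rebuild the context itself, for instance by keeping a stack of keys. A rough D sketch follows; `PathTracker` and its method names are made up for illustration, and it allocates freely for brevity:

```d
import std.array : join;

/// Hypothetical consumer that rebuilds the context the flat event
/// stream omits by keeping a stack of the keys seen so far.
struct PathTracker
{
    string[] keys; // one entry per open object
    string[] seen; // "path=value" records, kept for demonstration

    void objectStart()       { keys ~= ""; }
    void member(string name) { keys[$ - 1] = name; }
    void number(string text) { seen ~= keys.join(".") ~ "=" ~ text; }
    void objectEnd()         { keys = keys[0 .. $ - 1]; }
}

void main()
{
    // Replay the event sequence listed above by hand.
    PathTracker t;
    t.objectStart();
    t.member("vector");
    t.objectStart();
    t.member("x");
    t.number("42.8");
    t.objectEnd();
    t.objectEnd();

    assert(t.seen == ["vector.x=42.8"]);
    assert(t.keys.length == 0);
}
```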
For myself, the files are under 1 MB, and random access makes everything
much faster to program and debug.
Random access is definitely nice; it's more the performance cost of all
those allocations that's an issue for me. What I'm basically looking
for is a set of events like this:
alias void delegate(char[]) ParseEvent;
ParseEvent onObjectEnter, onObjectKey, onObjectLeave;
ParseEvent onArrayEnter, onArrayLeave;
ParseEvent onStringValue, onFloatValue, onIntValue;
ParseEvent onTrueValue, onFalseValue, onNullValue;
With corresponding write events on the output side so if I hooked the
parser to the writer the data would all flow through and generate output
identical to the input, formatting notwithstanding (though I'd add the
option to write numbers as either a string, real, or int). I like the
event parameter being a char[] because it allows me to unescape string
data in place, etc.
There are some advanced-mode writer options I'd like as well, like the
ability to dump a char[] blob directly into the destination string
without translation, saving and restoring writer state, etc.
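As a rough idea of what the write side could look like, here is a small D sketch of an event-driven writer. `JsonWriter` and its methods are hypothetical names that mirror the parse events above; they are not an existing API. It assumes keys and strings arrive already escaped, and `raw` covers both untranslated numbers and the "dump a blob" case:

```d
import std.array : Appender;

/// Hypothetical event-driven writer, driven by one call per parse event.
struct JsonWriter
{
    Appender!(char[]) sink;
    bool[] needComma; // one flag per open container

    private void separate()
    {
        if (needComma.length == 0) return; // top level: nothing to separate
        if (needComma[$ - 1]) sink.put(',');
        needComma[$ - 1] = true;
    }

    private void open(char c)  { separate(); sink.put(c); needComma ~= false; }
    private void close(char c) { sink.put(c); needComma = needComma[0 .. $ - 1]; }

    void objectEnter() { open('{'); }
    void objectLeave() { close('}'); }
    void arrayEnter()  { open('['); }
    void arrayLeave()  { close(']'); }

    /// Keys are written verbatim, so the caller passes them already escaped.
    void objectKey(const(char)[] key)
    {
        separate();
        sink.put('"'); sink.put(key); sink.put("\":");
        needComma[$ - 1] = false; // the value that follows needs no comma
    }

    /// Numbers, true/false/null, or a pre-formatted blob: copied untranslated.
    void raw(const(char)[] text) { separate(); sink.put(text); }

    void stringValue(const(char)[] escaped)
    {
        separate();
        sink.put('"'); sink.put(escaped); sink.put('"');
    }
}

void main()
{
    JsonWriter w;
    w.objectEnter();
    w.objectKey("vector");
    w.objectEnter();
    w.objectKey("x");
    w.raw("42.8");        // number kept in its original text form
    w.objectLeave();
    w.objectKey("tags");
    w.arrayEnter();
    w.raw("1");
    w.raw("2");
    w.arrayLeave();
    w.objectLeave();

    assert(w.sink.data == `{"vector":{"x":42.8},"tags":[1,2]}`);
}
```

Hooking each parser delegate to the matching writer method would then pass the data straight through, with only whitespace lost.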
I don't know if anyone besides myself would find all this useful
though. These are just some things I've found necessary for the work I
do.
This seems pretty straightforward. Could you list the advanced-mode
features you'd need?