For XML, template the parser on the character type so that no transcoding is needed. Since JSON is UTF-8, I'd just use char there, and at least for the event parser, don't proactively decode strings; let the user do that. In fact, don't proactively decode anything. Give me the option of getting a number as its string representation, sliced directly from the input buffer. Roughly, the JSON events should be:

Enter object
Object key
Int value (as string)
Float value (as string)
Null
True
False
Etc.

On Feb 8, 2012, at 6:49 PM, "Robert Jacques" <[email protected]> wrote:

> On Wed, 08 Feb 2012 02:12:57 -0600, Johannes Pfau <[email protected]> wrote:
>> On Tue, 07 Feb 2012 20:44:08 -0500,
>> "Jonathan M Davis" <[email protected]> wrote:
>>> On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
>>> > On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis
> [snip]
>>
>> Using ranges of dchar directly can be horribly inefficient in some
>> cases, you'll need at least some kind of buffered dchar range. Some
>> std.json replacement code tried to use only dchar ranges and had to
>> reassemble strings character by character using Appender. That sucks
>> especially if you're only interested in a small part of the data and
>> don't care about the rest.
>> So for pull/sax parsers: Use buffering, return strings(better:
>> w/d/char[]) as slices to that buffer. If the user needs to keep a
>> string, he can still copy it. (String decoding should also be done
>> on-demand only).
>
> Speaking as the one proposing said Json replacement, I'd like to point out
> that JSON strings != UTF strings: manual conversion is required some of the
> time. And I use appender as a dynamic buffer in exactly the manner you
> suggest. There's even an option to use a string cache to minimize total
> memory usage. (Hmm... that functionality should probably be re-factored out
> and made into its own utility) That said, I do end up doing a bunch of
> useless encodes and decodes, so I'm going to special case those away and add
> slicing support for strings. wstrings and dstring will still need to be
> converted as currently Json values only accept strings and therefore also
> Json tokens only support strings. As a potential user of the sax/pull
> interface would you prefer the extra clutter of special side channels for
> zero-copy wstrings and dstrings?
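
To make the event list at the top of this message concrete, here is a rough sketch of such an event parser. It is written in Python purely for illustration; it is not D and not the proposed std.json API. The names `Ev` and `events` are made up for this example. It assumes well-formed UTF-8 input and does no validation. Every event carries an undecoded `memoryview` slice of the input buffer, so string escapes stay unprocessed and numbers stay as text until the caller asks for more.

```python
from enum import Enum, auto


class Ev(Enum):
    """Hypothetical event kinds, following the list in the message."""
    ENTER_OBJECT = auto()
    LEAVE_OBJECT = auto()
    ENTER_ARRAY = auto()
    LEAVE_ARRAY = auto()
    KEY = auto()
    STRING = auto()
    INT = auto()
    FLOAT = auto()
    NULL = auto()
    TRUE = auto()
    FALSE = auto()


_LITERALS = {b"null": Ev.NULL, b"true": Ev.TRUE, b"false": Ev.FALSE}
_NUM_CHARS = b"+-0123456789.eE"


def events(buf: bytes):
    """Yield (event, slice) pairs; each slice is a zero-copy view into buf."""
    view = memoryview(buf)
    stack = []          # b"{" or b"[" for every container that is still open
    expect_key = False  # True when the next string inside an object is a key
    i, n = 0, len(buf)
    while i < n:
        c = buf[i:i + 1]
        if c in b" \t\r\n" or c == b":":
            i += 1
        elif c == b",":
            # After a comma, a key follows only if we are inside an object.
            expect_key = bool(stack) and stack[-1] == b"{"
            i += 1
        elif c == b"{" or c == b"[":
            stack.append(c)
            expect_key = c == b"{"
            kind = Ev.ENTER_OBJECT if c == b"{" else Ev.ENTER_ARRAY
            yield kind, view[i:i + 1]
            i += 1
        elif c == b"}" or c == b"]":
            stack.pop()
            expect_key = False
            kind = Ev.LEAVE_OBJECT if c == b"}" else Ev.LEAVE_ARRAY
            yield kind, view[i:i + 1]
            i += 1
        elif c == b'"':
            j = i + 1
            while buf[j] != 0x22:                  # scan to the closing quote
                j += 2 if buf[j] == 0x5C else 1    # step over escaped bytes
            kind = Ev.KEY if expect_key else Ev.STRING
            expect_key = False
            yield kind, view[i + 1:j]              # raw bytes, escapes intact
            i = j + 1
        elif c in b"-0123456789":
            j = i + 1
            while j < n and buf[j] in _NUM_CHARS:
                j += 1
            text = view[i:j]                       # the number, still as text
            is_float = any(ch in b".eE" for ch in text)
            yield (Ev.FLOAT if is_float else Ev.INT), text
            i = j
        else:
            for lit, kind in _LITERALS.items():
                if buf.startswith(lit, i):
                    yield kind, view[i:i + len(lit)]
                    i += len(lit)
                    break
            else:
                raise ValueError(f"unexpected byte at offset {i}")
```

A caller that only wants one field can skip everything else without paying for decoding, and convert on demand, for example `int(bytes(text))` for an Int event or `bytes(text).decode("utf-8")` for a key. Unescaping a string value is likewise left to the caller.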
