For XML, template the parser on the character type so that no transcoding is 
necessary. JSON is UTF-8, so I'd use char there, and at least for the event 
parser, don't proactively decode strings--let the user do that. In fact, don't 
proactively decode anything. Give me the option of getting a number as its 
string representation, sliced directly from the input buffer. Roughly, the JSON 
events should be:

Enter object
Object key
Int value (as string)
Float value (as string)
Null
True
False
Etc. 
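
To make the event list concrete, here is a minimal sketch of such an event scanner. It is written in C++17 rather than D, with std::string_view standing in for a D slice of the input buffer. The names Kind, Event and scanJson are hypothetical, and the exit/array events are my assumption about what "Etc." covers. Nothing is decoded: numbers and strings come back as raw views into the caller's buffer, and the sketch does not validate nesting or number syntax.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <vector>

// Hypothetical event kinds, modeled on the list in the message above.
enum class Kind {
    EnterObject, ExitObject, EnterArray, ExitArray,
    Key, String, Int, Float, Null, True, False
};

// An event carries only a view into the caller's buffer; nothing is decoded.
struct Event {
    Kind kind;
    std::string_view raw;
};

// Tokenizes `in` into events. This is a sketch: it does not validate nesting
// or number syntax, and string escapes are left undecoded in `raw`.
std::vector<Event> scanJson(std::string_view in) {
    std::vector<Event> out;
    std::size_t i = 0;
    auto skipWs = [&] {
        while (i < in.size() && std::isspace(static_cast<unsigned char>(in[i]))) ++i;
    };
    auto emit = [&](Kind k, std::size_t len) {
        out.push_back({k, in.substr(i, len)});
        i += len;
    };
    for (skipWs(); i < in.size(); skipWs()) {
        const char c = in[i];
        if (c == '{') emit(Kind::EnterObject, 1);
        else if (c == '}') emit(Kind::ExitObject, 1);
        else if (c == '[') emit(Kind::EnterArray, 1);
        else if (c == ']') emit(Kind::ExitArray, 1);
        else if (c == ',' || c == ':') ++i;  // separators produce no event
        else if (in.substr(i, 4) == "null") emit(Kind::Null, 4);
        else if (in.substr(i, 4) == "true") emit(Kind::True, 4);
        else if (in.substr(i, 5) == "false") emit(Kind::False, 5);
        else if (c == '"') {
            const std::size_t start = ++i;
            while (i < in.size() && in[i] != '"')
                i += (in[i] == '\\') ? 2 : 1;  // step over escaped characters
            if (i >= in.size()) throw std::runtime_error("unterminated string");
            const std::string_view body = in.substr(start, i - start);
            ++i;  // closing quote
            skipWs();
            // A string followed by ':' is an object key.
            const bool isKey = i < in.size() && in[i] == ':';
            out.push_back({isKey ? Kind::Key : Kind::String, body});
        } else if (c == '-' || std::isdigit(static_cast<unsigned char>(c))) {
            const std::size_t start = i;
            while (i < in.size() &&
                   std::string_view("+-0123456789.eE").find(in[i]) != std::string_view::npos)
                ++i;
            const std::string_view num = in.substr(start, i - start);
            // The number stays a string; only its kind is classified.
            const bool isFloat = num.find_first_of(".eE") != std::string_view::npos;
            out.push_back({isFloat ? Kind::Float : Kind::Int, num});
        } else {
            throw std::runtime_error("unexpected character");
        }
    }
    return out;
}
```

A caller who actually needs a numeric value can convert the slice on demand (for example with std::from_chars), and a caller who needs an unescaped string can decode only the String events that contain a backslash. Everything else is skipped at the cost of a scan.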

On Feb 8, 2012, at 6:49 PM, "Robert Jacques" wrote:

> On Wed, 08 Feb 2012 02:12:57 -0600, Johannes Pfau wrote:
>> On Tue, 07 Feb 2012 20:44:08 -0500, "Jonathan M Davis" wrote:
>>> On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
>>> > On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis
> [snip]
>> 
>> Using ranges of dchar directly can be horribly inefficient in some
>> cases; you'll need at least some kind of buffered dchar range. Some
>> std.json replacement code tried to use only dchar ranges and had to
>> reassemble strings character by character using Appender. That sucks
>> especially if you're only interested in a small part of the data and
>> don't care about the rest.
>> So for pull/sax parsers: Use buffering, return strings (better:
>> w/d/char[]) as slices to that buffer. If the user needs to keep a
>> string, he can still copy it. (String decoding should also be done
>> on-demand only).
> 
> Speaking as the one proposing said Json replacement, I'd like to point out 
> that JSON strings != UTF strings: manual conversion is required some of the 
> time. And I use appender as a dynamic buffer in exactly the manner you 
> suggest. There's even an option to use a string cache to minimize total 
> memory usage. (Hmm... that functionality should probably be refactored out 
> and made into its own utility.) That said, I do end up doing a bunch of 
> useless encodes and decodes, so I'm going to special-case those away and add 
> slicing support for strings. wstrings and dstrings will still need to be 
> converted, as currently Json values only accept strings and therefore 
> Json tokens also only support strings. As a potential user of the sax/pull 
> interface, would you prefer the extra clutter of special side channels for 
> zero-copy wstrings and dstrings?
