On 8/23/14, 3:24 PM, Walter Bright wrote:
> On 8/23/2014 2:36 PM, Andrei Alexandrescu wrote:
>> I think accepting ubyte is a good idea. It means "got this stream of
>> bytes off of the wire and it hasn't been validated as a UTF string".
>> It also means (which is true) that the lexer does enough validation to
>> constrain arbitrary bytes into text, and saves the caller from either a
>> check (expensive) or a cast (unpleasant).
>
> Reality is the JSON lexer takes ubytes and produces tokens.
>
> Using an adapter still makes sense, because:
>
> 1. The adapter should be just as fast as wiring it in internally
> 2. The adapter then becomes a general purpose tool that can be used
> elsewhere where the encoding is unknown or suspect
> 3. The scope of the adapter is small, so it is easier to get it right,
> and being reusable means every user benefits from it
> 4. If we can't make adapters efficient, we've failed at the
> ranges+algorithms model, and I'm very unwilling to fail at that

An adapter would solve the wrong problem here. There's nothing to adapt
from and to.
An adapter would be good if e.g. the stream uses UTF-16 or some Windows
encoding. Bytes are the natural input for a JSON parser.
Andrei
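
For readers following the thread, here is a rough sketch of the kind of adapter under discussion: a range that wraps raw ubytes and yields validated code points, so that a lexer written against dchar never sees malformed input. The name ValidatedUtf8 and the choice to substitute U+FFFD for bad sequences are assumptions made for illustration; neither comes from the thread or from Phobos, and a real adapter might throw instead or work on arbitrary input ranges rather than slices.

```d
/// Hypothetical adapter (illustration only): presents a slice of raw bytes
/// as an input range of dchar. Malformed UTF-8 is replaced with U+FFFD
/// rather than reported, which is one possible policy among several.
struct ValidatedUtf8
{
    enum dchar replacement = '\uFFFD';

    private const(ubyte)[] src;
    private dchar cur;
    private bool hasFront; // false for .init, so a default instance is empty

    this(const(ubyte)[] bytes)
    {
        src = bytes;
        popFront(); // decode the first code point, if any
    }

    @property bool empty() const { return !hasFront; }
    @property dchar front() const { return cur; }

    void popFront()
    {
        if (src.length == 0)
        {
            hasFront = false;
            return;
        }
        hasFront = true;

        immutable uint b0 = src[0];
        size_t extra; // continuation bytes expected after the lead byte
        uint cp;      // code point being assembled
        uint minCp;   // smallest legal value, used to reject overlong forms

        if (b0 < 0x80)                { extra = 0; cp = b0;        minCp = 0; }
        else if ((b0 & 0xE0) == 0xC0) { extra = 1; cp = b0 & 0x1F; minCp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; minCp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; minCp = 0x1_0000; }
        else
        {
            // Stray continuation byte, or a lead byte UTF-8 never uses.
            cur = replacement;
            src = src[1 .. $];
            return;
        }

        // Consume continuation bytes (10xxxxxx) while they are present.
        size_t used = 1;
        while (used <= extra && used < src.length && (src[used] & 0xC0) == 0x80)
        {
            cp = (cp << 6) | (src[used] & 0x3F);
            ++used;
        }

        immutable bool complete = used == extra + 1;
        immutable bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        immutable bool ok = complete && cp >= minCp && cp <= 0x10_FFFF && !surrogate;

        cur = ok ? cast(dchar) cp : replacement;
        src = src[used .. $];
    }
}

void main()
{
    import std.algorithm.comparison : equal;

    ubyte[] good = [0x7B, 0x22, 0xC3, 0xA9, 0x22, 0x7D]; // {"é"} encoded as UTF-8
    assert(equal(ValidatedUtf8(good), "{\"\u00E9\"}"d));

    ubyte[] bad = [0x22, 0xFF, 0x22]; // 0xFF never appears in valid UTF-8
    assert(equal(ValidatedUtf8(bad), "\"\uFFFD\""d));
}
```

Whether this layer belongs in front of the lexer or inside it is exactly the disagreement above: the sketch only shows that the validation step is small enough to isolate, not that isolating it is the right design for a JSON lexer.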