Here's my destruction of std.data.json.

* lexer.d:

** Beautifully done. From what I understand, if the input is string or immutable(ubyte)[] then the strings are carved out as slices of the input, as opposed to newly allocated. Awesome.

** The string after lexing is correctly scanned and stored in raw format (escapes are not rewritten) and decoded on demand. Problem with decoding is that it may allocate memory, and it would be great (and not difficult) to make the lexer 100% lazy/non-allocating. To achieve that, lexer.d should define TWO "Kind"s of strings at the lexer level: regular string and undecoded string. The former is lexer.d's way of saying "I got lucky" in the sense that it didn't detect any '\\' so the raw and decoded strings are identical. No need for anyone to do any further processing in the majority of cases => win. The latter means the lexer lexed the string, saw at least one '\\', and leaves it to the caller to do the actual decoding.

** After moving the decoding business out of lexer.d, a way to take this further would be to qualify lexer methods as @nogc if the input is string/immutable(ubyte)[]. I wonder how to implement a conditional attribute. We'll probably need a language enhancement for that.

** The implementation uses manually-defined tagged unions for work. Could we use Algebraic instead - dogfooding and all that? I recall there was a comment in Sönke's original work that Algebraic has a specific issue (was it false pointers?) - so the question arises, should we fix Algebraic and use it thus helping other uses as well?

** I see the "boolean" kind, should we instead have the "true_" and "false_" kinds?

** Long story short I couldn't find any major issue with this module, and I looked! I do think the decoding logic should be moved outside of lexer.d or at least the JSONLexerRange.

* generator.d: looking good, no special comments. Like the consistent use of structs filled with options as template parameters.

* foundation.d:

** At four words per token, Location seems pretty bulky. How about reducing line and column to uint?

** Could JSONException create the message string in toString (i.e. when/if used) as opposed to in the constructor?

* parser.d:

** How about using .init instead of .defaults for options?

** I'm a bit surprised by JSONParserNode.Kind. E.g. the objectStart/End markers shouldn't appear as nodes. There should be an "object" node only. I guess that's needed for laziness.

** It's unclear where memory is being allocated in the parser. @nogc annotations wherever appropriate would be great.

* value.d:

** Looks like this is/may be the only place where memory is being managed, at least if the input is string/immutable(ubyte)[]. Right?

** Algebraic ftw.

============================

Overall: This is very close to everything I hoped! A bit more care to @nogc would be awesome, especially with the upcoming focus on memory management going forward.

After one more pass it would be great to move forward for review.


Andrei

Reply via email to