I've been poking at the partial token logic. The json_errdetail() bug mentioned upthread (e.g. for an invalid input `[12zz]` and small chunk size) seems to be due to the disconnection between the "main" lex instance and the dummy_lex that's created from it. The dummy_lex contains all the information about the failed token, which is discarded upon an error return:
>     partial_result = json_lex(&dummy_lex);
>     if (partial_result != JSON_SUCCESS)
>         return partial_result;

In these situations, there's an additional logical error: lex->token_start points to a spot in the string after lex->token_terminator, which breaks an invariant that will mess up later pointer math. Nothing appears to set lex->token_start to point into the partial token buffer until _after_ the partial token is successfully lexed, which doesn't seem right -- in addition to the pointer math problems, if a previous chunk was freed (or is on a stale stack frame), lex->token_start will still be pointing off into space. Similarly, wherever we set token_terminator, we need to know that token_start points into the same buffer.

Determining the end of a token is now done in two separate places between the partial- and full-lexer code paths, which is giving me a little heartburn. I'm concerned that those could drift apart, and if the two disagree on where to end a token, we could lose data into the partial token buffer in a way that would be really hard to debug. Is there a way to combine them?

Thanks,
--Jacob