On Sunday, 12 October 2014 at 18:17:29 UTC, Andrei Alexandrescu wrote:

** The string after lexing is correctly scanned and stored in raw format (escapes are not rewritten) and decoded on demand. Problem with decoding is that it may allocate memory, and it would be great (and not difficult) to make the lexer 100% lazy/non-allocating. To achieve that, lexer.d should define TWO "Kind"s of strings at the lexer level: regular string and undecoded string. The former is lexer.d's way of saying "I got lucky" in the sense that it didn't detect any '\\' so the raw and decoded strings are identical. No need for anyone to do any further processing in the majority of cases => win. The latter means the lexer lexed the string, saw at least one '\\', and leaves it to the caller to do the actual decoding.
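If I read the proposal correctly, it amounts to something like the sketch below. All names here (`StringKind`, `StringToken`, `scanString`) are made up for illustration and are not the actual lexer.d API; the scanner only handles plain double-quoted literals.

```d
// Hypothetical names; NOT the real lexer.d API.
enum StringKind { plain, undecoded }

struct StringToken
{
    StringKind kind;
    const(char)[] raw; // slice of the input between the quotes, escapes untouched
}

// Scans a double-quoted literal that starts at src[0].
// Never allocates: the token only holds a slice of src.
StringToken scanString(const(char)[] src) pure nothrow @nogc @safe
{
    assert(src.length > 0 && src[0] == '"');
    auto kind = StringKind.plain;
    size_t i = 1;
    while (i < src.length && src[i] != '"')
    {
        if (src[i] == '\\')
        {
            kind = StringKind.undecoded; // caller must decode on demand
            ++i; // skip the escaped character so that \" does not end the literal
        }
        ++i;
    }
    // Clamp in case the input ends in the middle of the literal.
    const end = i < src.length ? i : src.length;
    return StringToken(kind, src[1 .. end]);
}
```

A `plain` token can be used as-is; only an `undecoded` one ever needs to go through the unescaping step.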

I'd like to see unescapeStringLiteral() made public. Then I could unescape multiple strings into the same preallocated destination, or even unescape in place (guaranteed to work since the result is never longer than the input).
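To make the use case concrete, here is a rough sketch of the kind of interface I have in mind. `unescapeInto` is a hypothetical stand-in, not the existing unescapeStringLiteral(), and it only handles the simple one-character escapes (no \x, \u, \U or named entities).

```d
// Hypothetical signature, for illustration only.
// Writes the decoded form of raw into dest and returns the used part of dest.
// dest may alias raw: the write index never gets ahead of the read index.
char[] unescapeInto(const(char)[] raw, char[] dest) pure nothrow @nogc
{
    assert(dest.length >= raw.length); // output is never longer than input
    size_t w = 0;
    for (size_t r = 0; r < raw.length; ++r)
    {
        char c = raw[r];
        if (c == '\\' && r + 1 < raw.length)
        {
            ++r;
            switch (raw[r])
            {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
                case 'r': c = '\r'; break;
                case '0': c = '\0'; break;
                default:  c = raw[r]; break; // \\, \", \' and anything not handled here
            }
        }
        dest[w++] = c;
    }
    return dest[0 .. w];
}

unittest
{
    // Several strings decoded into one preallocated destination.
    char[64] pool;
    size_t used = 0;
    auto a = unescapeInto(`a\tb`, pool[used .. $]);
    used += a.length;
    auto b = unescapeInto(`x\\y`, pool[used .. $]);
    used += b.length;
    assert(a == "a\tb" && b == `x\y`);

    // In place: source and destination are the same buffer.
    char[] buf = `say \"hi\"`.dup;
    auto c = unescapeInto(buf, buf);
    assert(c == `say "hi"` && c.ptr is buf.ptr);
}
```

The in-place case works because every escape sequence consumes at least two input characters and produces fewer output characters than it consumed.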
