On Thu, May 02, 2024 at 11:23:13AM +0900, Michael Paquier wrote:
> About the fact that we may finish by printing unfinished UTF-8
> sequences, I'd be curious to hear your thoughts.  Now, the information
> provided about the partial byte sequences can be also useful for
> debugging on top of having the error code, no?

By the way, as long as I have that in mind..  I am not sure that it is
worth spending cycles on detecting the unfinished sequences and making
these printable.  Wouldn't it be enough in most cases to adjust
token_error() to truncate the byte sequences we cannot print?

Another thing that I think would be nice would be to calculate the
location of what we're parsing on a given line, and provide that in
the error context.  That would not be backpatchable as it requires a
change in JsonLexContext, unfortunately, but it would help in making
more sense of an error if the incomplete byte sequence is at the
beginning of a token or after an expected character.
--
Michael
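
To make the truncation idea above concrete, here is a minimal sketch of
what "truncate the byte sequences we cannot print" could look like.  The
helper name utf8_complete_prefix_len() is hypothetical and not part of
PostgreSQL, and how token_error() would actually call it is an
assumption.  It only inspects the tail of the buffer and does not
validate the rest of the string.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an existing PostgreSQL function): return the
 * length of the longest prefix of s[0..len) that does not end in a
 * truncated UTF-8 sequence.  Only the tail is checked; earlier bytes are
 * assumed to be fine.
 */
static size_t
utf8_complete_prefix_len(const char *s, size_t len)
{
	size_t		i = len;
	size_t		need;
	size_t		have;
	unsigned char lead;

	/* Step back over at most three trailing continuation bytes (10xxxxxx). */
	while (i > 0 && len - i < 3 &&
		   ((unsigned char) s[i - 1] & 0xC0) == 0x80)
		i--;

	if (i == 0)
		return len;				/* empty, or no lead byte found: keep as is */

	/* The byte before the continuation run should be the lead byte. */
	lead = (unsigned char) s[i - 1];
	if (lead < 0x80)
		need = 1;				/* ASCII */
	else if ((lead & 0xE0) == 0xC0)
		need = 2;
	else if ((lead & 0xF0) == 0xE0)
		need = 3;
	else if ((lead & 0xF8) == 0xF0)
		need = 4;
	else
		return len;				/* not a valid lead byte: not handled here */

	/* Bytes available from the lead byte to the end of the buffer. */
	have = len - (i - 1);

	/* Drop the last sequence only if it is shorter than its lead byte says. */
	return (have < need) ? i - 1 : len;
}
```

A caller could then print only the first n bytes of the token, for
example with a "%.*s" style format, where n is the value returned above.
Sequences that are invalid for other reasons are left untouched, so this
is not a substitute for real encoding verification.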