Re: First Impressions!

Steven Schveighoffer via Digitalmars-d Thu, 30 Nov 2017 11:43:25 -0800

On 11/30/17 1:20 PM, Patrick Schluter wrote:

On Thursday, 30 November 2017 at 17:40:08 UTC, Jonathan M Davis wrote:
English and thus don't as easily hit the cases where their code is wrong. For better or worse, UTF-16 hides it better than UTF-8, but the problem exists in both.
To give just an example of what can go wrong with UTF-16: reading a file in UTF-16 and converting it to something else like UTF-8 or UTF-32. Reading block by block and hitting exactly an SMP code point at the buffer limit, high surrogate at the end of the first buffer, low surrogate at the start of the next. If you don't think about it => 2 invalid characters instead of your nice poop 💩 emoji character (emojis are in the SMP and they are more and more frequent).

iopipe handles this: http://schveiguy.github.io/iopipe/iopipe/textpipe/ensureDecodeable.html
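
To make the buffer-boundary case concrete, here is a minimal sketch in Python (not D, and not taken from iopipe; the function name decode_utf16le_chunks is made up for illustration). It assumes little-endian UTF-16 without a BOM. The idea is to hold back a trailing high surrogate, or a dangling odd byte, and prepend it to the next block before decoding:

```python
def decode_utf16le_chunks(chunks):
    """Decode an iterable of UTF-16LE byte blocks without splitting surrogate pairs.

    Hypothetical helper for illustration; assumes little-endian input, no BOM.
    """
    carry = b""          # bytes held back from the previous block
    pieces = []
    for chunk in chunks:
        data = carry + chunk
        # Only whole 16-bit code units can be decoded.
        usable = len(data) - (len(data) % 2)
        if usable >= 2:
            last_unit = int.from_bytes(data[usable - 2:usable], "little")
            # A high surrogate (0xD800..0xDBFF) needs the low surrogate
            # that follows it, so defer it to the next block.
            if 0xD800 <= last_unit <= 0xDBFF:
                usable -= 2
        pieces.append(data[:usable].decode("utf-16-le"))
        carry = data[usable:]
    if carry:
        raise ValueError("input ended inside a code unit or surrogate pair")
    return "".join(pieces)


raw = "a💩b".encode("utf-16-le")   # 8 bytes: 'a', high surrogate, low surrogate, 'b'
blocks = [raw[:4], raw[4:]]        # the split lands between the two surrogates
text = decode_utf16le_chunks(blocks)
```

Decoding each block on its own would see a lone high surrogate at the end of the first block and a lone low surrogate at the start of the second, which is exactly the failure described above. Python's standard library offers the same carry-over behaviour through codecs.getincrementaldecoder("utf-16-le"), so in real Python code that is likely the better choice than a hand-written loop.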


-Steve
