On 11/27/2017 7:01 PM, A Guy With an Opinion wrote:
> +- Unicode support is good. Although I think D's string type should have probably been utf16 by default. Especially considering the utf module states:
>
> "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'."
>
> Seems like the natural fit for me. Plus for the vast majority of use cases I am pretty guaranteed a char = codepoint. Not the biggest issue in the world and maybe I'm just being overly critical here.

Sooner or later your code will exhibit bugs if it assumes that char==codepoint with UTF16, because of surrogate pairs.

https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
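
To make the surrogate pair problem concrete, here is a minimal sketch that counts how many code units one code point needs in each encoding. It is written in Python only because the encoding arithmetic is language-neutral. The helper name `code_units` and the sample characters are illustrative assumptions, not anything from D's library.

```python
# Hypothetical helper: number of code units needed to store `text`
# in the given encoding, where each code unit is `unit_size` bytes.
def code_units(text: str, encoding: str, unit_size: int) -> int:
    return len(text.encode(encoding)) // unit_size


# Sample characters chosen for illustration (one code point each).
samples = {
    "A": "A",                 # U+0041, ASCII
    "e-acute": "\u00e9",      # U+00E9, inside the Basic Multilingual Plane
    "G clef": "\U0001d11e",   # U+1D11E, outside the Basic Multilingual Plane
}

for name, ch in samples.items():
    utf8 = code_units(ch, "utf-8", 1)
    utf16 = code_units(ch, "utf-16-le", 2)
    utf32 = code_units(ch, "utf-32-le", 4)
    print(f"{name}: utf8={utf8} utf16={utf16} utf32={utf32}")

# The G clef is stored in UTF16 as the surrogate pair D834 DD1E.
print("\U0001d11e".encode("utf-16-be").hex())
```

Any code point above U+FFFF takes two UTF16 code units, so code that indexes or slices by code unit can split a character in half. Only the 32-bit encoding gives one code unit per code point.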

As far as I can tell, pretty much the only users of UTF16 are Windows programs. Everyone else uses UTF8 or UTF32.

I recommend using UTF8.
