On 11/27/2017 7:01 PM, A Guy With an Opinion wrote:
> +- Unicode support is good. Although I think D's string type should have
> probably been utf16 by default. Especially considering the utf module states:
> "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'."
> Seems like the natural fit for me. Plus for the vast majority of use cases I am
> pretty guaranteed a char = codepoint. Not the biggest issue in the world and
> maybe I'm just being overly critical here.
Sooner or later your code will exhibit bugs if it assumes that char==codepoint
with UTF16, because of surrogate pairs.
https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
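
To make the surrogate pair problem concrete, here is a minimal sketch. It assumes a current Phobos in which narrow strings are still auto-decoded by range functions; `codePoints` is a hypothetical helper named for illustration, not a library function.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper: counts code points by decoding the string.
size_t codePoints(S)(S s)
{
    return s.walkLength;
}

void main()
{
    // U+1F600 lies outside the Basic Multilingual Plane.
    wstring w = "\U0001F600"w;
    string  s = "\U0001F600";

    writeln(w.length);       // 2: UTF-16 code units (a surrogate pair)
    writeln(s.length);       // 4: UTF-8 code units
    writeln(codePoints(w));  // 1: code points

    // Indexing a wstring yields code units, here the two surrogate halves.
    assert(w[0] == 0xD83D && w[1] == 0xDE00);
}
```

Any code that treats `w[i]` as a whole character works until the first such character shows up, and then it splits the pair.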
As far as I can tell, pretty much the only users of UTF16 are Windows programs.
Everyone else uses UTF8 or UTF32.
I recommend using UTF8.