On Thursday, 30 November 2017 at 10:19:18 UTC, Walter Bright wrote:
> On 11/27/2017 7:01 PM, A Guy With an Opinion wrote:
>> +- Unicode support is good. Although I think D's string type
>> should have probably been utf16 by default. Especially
>> considering the utf module states:
>> "UTF character support is restricted to '\u0000' <= character
>> <= '\U0010FFFF'."
>> Seems like the natural fit for me. Plus for the vast majority
>> of use cases I am pretty guaranteed a char = codepoint. Not
>> the biggest issue in the world and maybe I'm just being overly
>> critical here.
> Sooner or later your code will exhibit bugs if it assumes that
> char==codepoint with UTF16, because of surrogate pairs.
>
> https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
>
> As far as I can tell, pretty much the only users of UTF16 are
> Windows programs. Everyone else uses UTF8 or UTF32.
>
> I recommend using UTF8.
As long as you understand its limitations, I think most bugs
can be avoided. Where UTF16 breaks down is pretty well defined,
and it's also super rare; a quick sketch of the surrogate-pair
case is below.
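
To make the surrogate-pair point concrete, here's a minimal
sketch (my own example, using std.range.walkLength; D's ranges
auto-decode narrow strings to dchar):

import std.stdio;
import std.range : walkLength;

void main()
{
    // U+1F600 is outside the BMP, so UTF-16 encodes it as a
    // surrogate pair: two wchar code units, one code point.
    wstring s = "\U0001F600";
    writeln(s.length);      // 2 -- wchar code units
    writeln(s.walkLength);  // 1 -- decoded code points
}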

I think UTF32 would be great too, but it seems like just a
waste of space 99% of the time. UTF8 isn't horrible; I'm not
going to refuse to use D just because it uses UTF8 (that would
be silly), especially when wstring also seems baked into the
language. However, it can complicate code, because outside of
ASCII you pretty much always have to assume character != code
point (see the second sketch below). I can see a reasonable
person arguing that forcing you to assume character != code
point is actually a good thing. And that is a valid opinion.
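
For comparison, the same mismatch shows up in UTF8; again just
an illustrative sketch with walkLength:

import std.stdio;
import std.range : walkLength;

void main()
{
    string s = "héllo";     // 'é' (U+00E9) is two UTF-8 code units
    writeln(s.length);      // 6 -- char (code unit) count
    writeln(s.walkLength);  // 5 -- code points after decoding
    // s[1] is not 'é'; it's the first byte of its encoding.
}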