On Sunday, 24 August 2014 at 18:43:36 UTC, Dmitry Olshansky wrote:
> 24-Aug-2014 22:19, Andrew Godfrey wrote:
>> The OP and the question of auto-decoding share the same root problem:
>> even though D does a lot better with UTF than other languages I've
>> used, it still confuses characters with code points somewhat.
>> "Element type is some character" is an example from the OP. So
>> clarify for me: if a programmer makes an array of either 'char' or
>> 'wchar', does that always, unambiguously, mean a UTF-8 or UTF-16
>> code point?

> Yes, pedantically - UTF-8 and UTF-16 code _units_. dchar is a code point.
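
Right - and to spell that distinction out, here's a minimal sketch
using nothing beyond Phobos's std.range:

import std.range : walkLength;

void main()
{
    string  s = "é";           // char[]:  UTF-8 code units
    wstring w = "é"w;          // wchar[]: UTF-16 code units
    assert(s.length == 2);     // 'é' is two UTF-8 code units
    assert(w.length == 1);     // ...but a single UTF-16 code unit
    assert(s.walkLength == 1); // and exactly one code point (dchar)
}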

>> E.g. if interoperating with C code, they will never make the mistake
>> of using these types for a non-string byte/word array?


> char != byte: the compiler rejects pointer and array assignments
> (byte* to char*, ubyte[] to char[], etc.). Individual values are
> convertible, though, so those work with implicit conversion.
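
A quick sketch of where that boundary sits - the commented-out lines
are the ones the compiler rejects:

void main()
{
    ubyte[] raw = [72, 105];
    // char[] text = raw;     // error: cannot convert ubyte[] to char[]
    // char*  p    = raw.ptr; // error: cannot convert ubyte* to char*

    ubyte b = 65;
    char  c = b;              // fine: single values convert implicitly
    assert(c == 'A');
}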

>> If and only if this is true, then D has done well and I'm unafraid
>> of duck-typing here.

Both your answers are at the level of the compiler/language spec.
Relevant, yes, but not complete. E.g. how often will people manually
converting a .h file translate C's "const char *" correctly to either
something char-based or something ubyte-based, depending on whether it
actually carries UTF-8 code units? How often will they even know?
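
To make that concrete - a hypothetical prototype (parse, buf and len
are made up for illustration):

// C header says: void parse(const char *buf, size_t len);
// If buf is really a raw byte buffer, the honest D translation is:
extern (C) void parse(const(ubyte)* buf, size_t len);

// The mechanical type-for-type translation compiles just as happily,
// but promises UTF-8 to every caller where none is guaranteed:
// extern (C) void parse(const(char)* buf, size_t len);
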
With wchar it's probably even worse, because of APIs that use one
type but where, depending on other parameters, the string elements
can be UTF-16 code units or glyph indices.
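
Win32 text output is one real instance: ExtTextOutW's string parameter
is LPCWSTR (i.e. const(wchar)*), but with the ETO_GLYPH_INDEX flag set
its elements are glyph indices, not UTF-16 code units. A sketch,
assuming the core.sys.windows bindings expose that API:

version (Windows)
{
    import core.sys.windows.windows;

    void draw(HDC dc, const(wchar)[] text, const(wchar)[] glyphs)
    {
        // Same wchar-based parameter, two incompatible meanings:
        ExtTextOutW(dc, 0, 0, 0, null, text.ptr,
                    cast(uint) text.length, null);    // UTF-16 code units
        ExtTextOutW(dc, 0, 0, ETO_GLYPH_INDEX, null, glyphs.ptr,
                    cast(uint) glyphs.length, null);  // glyph indices
    }
}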
