Leif Ekblad wrote:
IMO, I wouldn't support wide-character (UnicodeString) strings for anything new. In the beginning the wide-character string had the advantage of being able to represent all characters with 2 bytes, but this is no longer the case. I would switch to UTF-8 instead and keep characters 1 byte long. A switch to UTF-8 only affects a small amount of the code-base, and doesn't break string references.

UTF-8 is fine for external representation, and for code that's near it. After all, it's merely a form of compression in the same way that HTTP etc. uses compression for content.

I think your point about two bytes now being insufficient to represent all possible Unicode codepoints is valid. But since things like expression parsing are made much more efficient by being able to iterate over an array of fixed-width elements, that's an argument for moving to a wider internal representation, not a narrower one.
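
To make the trade-off concrete, here is a rough sketch in Python (chosen only for brevity; nothing here is FPC code, and the helper names are made up for illustration). It counts the code units one sample string needs in UTF-8, UTF-16 and UTF-32, then fetches the n-th codepoint from a UTF-32 buffer by plain indexing and from a UTF-8 buffer by scanning.

```python
# Sample string: 'a', U+00E9, U+20AC, and U+1D11E (outside the 16-bit range).
text = "a\u00e9\u20ac\U0001D11E"


def code_units(s, encoding, unit_size):
    """Number of fixed-size code units the string occupies in an encoding."""
    return len(s.encode(encoding)) // unit_size


utf8_units = code_units(text, "utf-8", 1)       # 1 + 2 + 3 + 4 bytes
utf16_units = code_units(text, "utf-16-le", 2)  # last codepoint needs a surrogate pair
utf32_units = code_units(text, "utf-32-le", 4)  # one unit per codepoint


def nth_codepoint_utf32(buf, n):
    """Fixed width: the n-th codepoint sits at byte offset 4 * n."""
    return int.from_bytes(buf[4 * n:4 * n + 4], "little")


def nth_codepoint_utf8(buf, n):
    """Variable width: scan from the start, counting lead bytes."""
    seen = -1
    for i, b in enumerate(buf):
        if (b & 0xC0) != 0x80:  # not a continuation byte (10xxxxxx)
            seen += 1
            if seen == n:
                j = i + 1
                while j < len(buf) and (buf[j] & 0xC0) == 0x80:
                    j += 1
                return ord(buf[i:j].decode("utf-8"))
    raise IndexError(n)


print(utf8_units, utf16_units, utf32_units)
print(hex(nth_codepoint_utf32(text.encode("utf-32-le"), 3)))
print(hex(nth_codepoint_utf8(text.encode("utf-8"), 3)))
```

Both lookups return the same codepoint, but the UTF-32 one is a single offset calculation while the UTF-8 one has to walk the buffer. The same holds for UTF-16 once surrogate pairs are in play.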

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
