Re: The Case Against Autodecode

John Colvin via Digitalmars-d Thu, 02 Jun 2016 15:31:38 -0700

On Thursday, 2 June 2016 at 20:27:27 UTC, Walter Bright wrote:

On 6/2/2016 12:34 PM, deadalnix wrote:
On Thursday, 2 June 2016 at 19:05:44 UTC, Andrei Alexandrescu wrote:
Pretty much everything. Consider s and s1 string variables with possibly
different encodings (UTF8/UTF16).
* s.all!(c => c == 'ö') works only with autodecoding. It returns always false
without.
False. Many characters can be represented by different sequences of codepoints.
For instance, ê can be ê as one codepoint or ^ as a modifier followed by e. ö is
one such character.
There are 3 levels of Unicode support. What Andrei is talking about is Level 1.
http://unicode.org/reports/tr18/tr18-5.1.html
I wonder what rationale there is for Unicode to have two different sequences of
codepoints be treated as the same. It's madness.

There are languages that make heavy use of diacritics, often several on a single "character". Hebrew is a good example. Should there be only one valid ordering of any given set of diacritics on any given character? It's an interesting idea, but it's not how things are.
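
To make the point from the quoted exchange concrete, here is a minimal sketch in D, assuming Phobos' std.uni.normalize and the range-based std.algorithm.searching.all. The helper names allOUmlaut and sameAfterNFC are made up for illustration and do not come from the thread. It builds ö both ways and shows that a code-point comparison only matches the precomposed form, while the two strings compare equal once both are normalized to NFC.

```d
import std.algorithm.searching : all;
import std.uni : normalize, NFC;

// ö as a single precomposed code point (U+00F6).
enum dchar oUmlaut = '\u00F6';

// Hypothetical helper: true if every decoded code point of s is U+00F6.
// With autodecoding, a string is iterated as a range of dchar.
bool allOUmlaut(string s)
{
    return s.all!(c => c == oUmlaut);
}

// Hypothetical helper: compare two strings after NFC normalization,
// so canonically equivalent sequences compare equal.
bool sameAfterNFC(string a, string b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    string precomposed = "\u00F6";  // one code point
    string decomposed = "o\u0308";  // 'o' followed by combining diaeresis

    assert(allOUmlaut(precomposed));   // matches, code point by code point
    assert(!allOUmlaut(decomposed));   // same visible character, no match

    assert(precomposed != decomposed);             // different code units
    assert(sameAfterNFC(precomposed, decomposed)); // equal once normalized
}
```

Normalizing allocates when the input is not already in the requested form, so whether to do it on every comparison is a design decision for the caller and not something the sketch settles.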
