Re: The Case Against Autodecode

Vladimir Panteleev via Digitalmars-d Fri, 03 Jun 2016 03:16:41 -0700

On Friday, 3 June 2016 at 10:05:11 UTC, Walter Bright wrote:

On 6/3/2016 1:05 AM, H. S. Teoh via Digitalmars-d wrote:
However, this meant that some precomposed characters were "redundant": they represented character + diacritic combinations that could equally well be expressed separately. Normalization was the inevitable consequence.
It is not inevitable. Simply disallow the 2 codepoint sequences - the single one has to be used instead.
There is precedent. Some characters can be encoded with more than one UTF-8 sequence, and the longer sequences were declared invalid. Simple.
I.e. have the normalization up front when the text is created rather than everywhere else.

I don't think it would work (or at least, the analogy doesn't hold). It would mean that you can't add new precomposed characters, because that means that previously valid sequences are now invalid.
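
To make the two cases concrete, here is a minimal sketch in Python (not D; chosen only because its standard library covers both cases). The helper name `is_valid_utf8` and the sample characters are my own picks for illustration. It shows that an overlong UTF-8 encoding is rejected outright, while a base letter + combining mark pair is valid text that NFC either composes (when a precomposed character exists) or leaves as two code points (when none exists).

```python
import unicodedata

def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# UTF-8 precedent: '/' is the single byte 0x2F. The two-byte "overlong"
# form 0xC0 0xAF carries the same bits but is invalid UTF-8.
slash_short = b"\x2f"
slash_overlong = b"\xc0\xaf"
print(is_valid_utf8(slash_short))     # shortest form
print(is_valid_utf8(slash_overlong))  # overlong form

# Code point level: U+00E9 (precomposed) vs 'e' + U+0301 (combining acute).
precomposed = "\u00e9"
decomposed = "e\u0301"
composed_by_nfc = unicodedata.normalize("NFC", decomposed)
print(composed_by_nfc == precomposed)

# A pair with no precomposed equivalent stays two code points under NFC.
# Under a "two-codepoint form is invalid" rule, this text would turn
# invalid the day a precomposed character for it was added.
q_acute = "q\u0301"
print(len(unicodedata.normalize("NFC", q_acute)))
```

The difference is that the set of overlong UTF-8 forms was fixed by the encoding itself, whereas the set of precomposed characters depends on which characters have been assigned.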
