Mark Davis 🍱️ <mark at macchiato dot com> wrote:

>> TUS 8.0 Chapter 3 C6: "A process shall not assume that the
>> interpretations of two canonical-equivalent character sequences are
>> distinct."
>
> A compiler will take source code containing String x="á"; and compile
> it to a certain binary. If that same source code is NFD'd, the
> compiler will produce a different result.
>
> Do you really think that such compiler is not compliant to Unicode??
> If so, then we should add some more clarifications around C6.

I agree. The word "interpretations" in C6 can't have been intended to include the interpretation of code points qua code points. That would make a great many internal processes impossible. I think of C6 as meaning that spell-checkers, for example, should not treat José (NFC, four code points) and José (NFD, five code points) as separate entries.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸
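
For anyone who wants to see the distinction concretely, here is a rough sketch in Python, chosen purely for illustration. It assumes only the standard `unicodedata` module; the helper names `canonically_equivalent` and `add_word` are made up for this example and are not taken from the standard or from any real spell-checker.

```python
import unicodedata

# The same word in both normalization forms.
nfc = unicodedata.normalize("NFC", "Jos\u00e9")  # J o s é        -> 4 code points
nfd = unicodedata.normalize("NFD", nfc)          # J o s e U+0301 -> 5 code points

# At the level of code points qua code points, the two strings differ.
# This is the level a compiler or a byte-wise comparison works at.
code_points_equal = (nfc == nfd)


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def add_word(dictionary: set, word: str) -> None:
    """Hypothetical spell-checker helper: store every entry in NFC, so that
    canonically equivalent spellings collapse into a single entry."""
    dictionary.add(unicodedata.normalize("NFC", word))


words = set()
add_word(words, nfc)
add_word(words, nfd)

print(len(nfc), len(nfd))                # 4 5
print(code_points_equal)                 # False
print(canonically_equivalent(nfc, nfd))  # True
print(len(words))                        # 1
```

The raw comparison sees two different strings, which is the compiler case, while the spell-checker-style process normalizes first and ends up with a single entry.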

