On 9/17/2013 10:54 PM, Asmus Freytag wrote:
On 9/17/2013 8:40 PM, Philippe Verdy wrote:

    In what way does UTF-16 "use" surrogate code /points/? An
    encoding form is a mapping. Let's look at this mapping:

      * One _inputs_ scalar values (not surrogate code points).

In fact the input is one code point.

Then only if that code point has a scalar value (this may be tested or not by the application), the rest of the algorithm applies.

Thanks for providing some needed clarity.

No:

1. According to the Glossary, an encoding form maps scalar values to
   sequences of code units. Therefore, such an input validity check
   isn't part of the encoding form, or at least not the encoding form
   proper.
2. That still doesn't mean surrogates are "used by UTF-16", like the
   Glossary claims. The validity check you're quoting from Philippe's
   message would (if performed) be equally relevant to all encoding
   forms; thus it wouldn't be UTF-16-specific.
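
To make both points concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not text from the Standard or the Glossary, and all function names in it are hypothetical. It assumes the encoding form proper is a mapping defined on scalar values only, and that any validity check on an arbitrary code point is a separate step that can be placed in front of any encoding form.

```python
from typing import Callable, List

def is_scalar_value(cp: int) -> bool:
    # A scalar value is any code point except the surrogates D800..DFFF.
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)

def utf16_encoding_form(sv: int) -> List[int]:
    # Encoding form proper: assumes its input is already a scalar value.
    if sv < 0x10000:
        return [sv]
    v = sv - 0x10000
    # The outputs are 16-bit code units in D800..DFFF, not code points.
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]

def utf32_encoding_form(sv: int) -> List[int]:
    # Encoding form proper: one 32-bit code unit per scalar value.
    return [sv]

def encode_code_point(cp: int, form: Callable[[int], List[int]]) -> List[int]:
    # Hypothetical wrapper: the validity check sits outside the encoding
    # form and is applied identically whichever form is chosen.
    if not is_scalar_value(cp):
        raise ValueError(f"U+{cp:04X} is not a scalar value")
    return form(cp)
```

In this sketch a surrogate code point such as U+D800 is rejected in exactly the same way whether `utf16_encoding_form` or `utf32_encoding_form` is passed in, which is the sense in which the check is not UTF-16-specific.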


Stephan
