On Mon, Jan 7, 2013 at 10:48 AM, Doug Ewell <[email protected]> wrote:
> Markus Scherer <markus dot icu at gmail dot com> wrote:
>
> > Also, we commonly read code points from 16-bit Unicode strings, and
> > unpaired surrogates are returned as themselves and treated as such
> > (e.g., in collation). That would not be well-formed UTF-16, but it's
> > generally harmless in text processing.
>
> But still non-conformant.
>

Not really, that's why there is a definition of a 16-bit Unicode string in
the standard.

markus
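
For illustration, here is a minimal sketch of the code point iteration described above, written in Python. The function name code_points and the representation of a 16-bit Unicode string as a list of integers are assumptions made for this example; this is not ICU's API. A well-formed surrogate pair is combined into one supplementary code point, and an unpaired surrogate is returned as itself.

```python
def code_points(units):
    """Yield code points from a sequence of 16-bit code units.

    `units` is assumed to be a sequence of integers in 0..0xFFFF
    (a hypothetical stand-in for a 16-bit Unicode string).
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        i += 1
        # Lead surrogate followed by a trail surrogate: combine the pair.
        if 0xD800 <= u <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            u = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        # Anything else, including an unpaired surrogate, is yielded as is.
        yield u


# 'a', a surrogate pair, a lone lead surrogate, 'b', a lone trail surrogate
sample = [0x0061, 0xD83D, 0xDE00, 0xD800, 0x0062, 0xDC00]
print([hex(cp) for cp in code_points(sample)])
```

The sample input is not well-formed UTF-16, yet every code unit is accounted for and the iteration never fails, which is the property the thread is discussing.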

