> A good Unicode string in a programming language

Yes, that would be great, no question. It isn't, however, the case in most programming languages (measured by the amount of software written in them). The original question that started these threads was how to handle isolated surrogates. If you are lucky enough to be only ever using programming languages that prevent that from ever happening, then the question is moot for you. If you're not, the question is relevant.
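As a concrete illustration (a minimal Python 3 sketch; Python is chosen only for brevity, and Java and JavaScript strings are similarly permissive about unpaired surrogates), this is what "not preventing it" looks like in practice: the string type accepts an isolated surrogate, and the handling decision only surfaces when the value has to be encoded again.

    # A str may hold an isolated surrogate; nothing stops it at construction.
    lone = "\ud800"                    # unpaired high surrogate, accepted
    print(len(lone))                   # 1

    try:
        lone.encode("utf-8")           # strict: surrogates are not scalar values
    except UnicodeEncodeError as err:
        print("strict encoding fails:", err.reason)

    print(lone.encode("utf-8", "replace"))        # b'?'             -- substitute
    print(lone.encode("utf-8", "surrogatepass"))  # b'\xed\xa0\x80'  -- smuggle it through

By contrast, a string type defined over scalar values (for example Rust's char/String or OCaml's Uchar.t) rejects such a value at construction, which is the design the quoted message below argues for.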
Mark

On Tue, Oct 20, 2015 at 6:47 PM, Daniel Bünzli <[email protected]> wrote:

> On Wednesday, 21 October 2015 at 02:23, Mark Davis ☕️ wrote:
> > But more fundamentally, there may not be "excuses" for such software,
> > but it happens anyway. Pretending it doesn't, makes for unhappy customers.
> > For example, you don't want to be throwing an exception when one is
> > encountered, when that could cause an app to fail.
>
> It does happen at the input layer but it doesn't make any sense to bother
> the programmers with this once the IO boundary has been crossed and
> decoding errors handled. A good Unicode string in a programming language
> should at least operate at the scalar value level and these notions of
> Unicode n-bit strings should definitively be killed (maybe it would have
> inspired hopeless designers of recent programming languages to actually
> make better choices on that topic).
>
> Best,
>
> Daniel

