On 8/2/2012 11:42 AM, Andrei Alexandrescu wrote:
I like a lot this idea of a "minimally decoded" character that's isomorphic with UTF-32 but much cheaper to extract. (We could use ulong if they add 5- and 6-byte characters.) I wonder if people came up with this and gave it a name. If not, I'd say we call such a number an "olsh".
Yeah, it's too bad the inventors of UTF-8 didn't think of this.
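
For illustration, here is a minimal sketch of what such a value could look like. It rests on an assumption the quoted text does not spell out: that "minimally decoded" means packing the raw UTF-8 bytes of one code point into a single integer, skipping the bit-shuffling that a full decode to UTF-32 requires. The names `utf8_len`, `pack`, and `unpack` are hypothetical, and the sketch does not validate continuation bytes or reject overlong encodings.

```python
def utf8_len(lead: int) -> int:
    """Sequence length (1 to 4) implied by a UTF-8 lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#x}")


def pack(buf: bytes, i: int = 0) -> tuple:
    """Return (packed value, bytes consumed) for the sequence at buf[i].

    Assumption: the packed value is just the sequence's bytes read as a
    big-endian integer, so it fits in 32 bits for sequences up to 4 bytes.
    """
    n = utf8_len(buf[i])
    if i + n > len(buf):
        raise ValueError("truncated UTF-8 sequence")
    return int.from_bytes(buf[i:i + n], "big"), n


def unpack(value: int) -> str:
    """Recover the character from a packed value (inverse of pack)."""
    # A multi-byte lead byte always has its top bit set, so the byte count
    # can be read off the bit length; ASCII (including NUL) takes one byte.
    n = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(n, "big").decode("utf-8")
```

With this packing, `pack("é".encode("utf-8"))` returns `(0xC3A9, 2)`, and `unpack(0xC3A9)` gives back `"é"`. The only work on extraction is finding the length from the lead byte. Two such values compare equal exactly when the characters are equal, as long as the input is well-formed UTF-8.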
