Re: New ABI NSConstantString

Richard Frith-Macdonald Sat, 07 Apr 2018 02:49:53 -0700


> On 7 Apr 2018, at 10:21, Ivan Vučica <[email protected]> wrote:
> 
> On Sat, Apr 7, 2018, 09:50 David Chisnall <[email protected]> wrote:
> 
> > My current plan is to make the format support ASCII, UTF-8, UTF-16, and 
> > UTF-32, but only generate ASCII and UTF-16 in the compiler and then decide 
> > later if we want to support generating UTF-8 and UTF-32.  I also won’t 
> > initialise the hash in the compiler initially, until we’ve decided a bit more 
> > what the hash should be.
> 
> Emojis don't fit UTF-16. Even if one dismisses CJK, ancient scripts etc, 
> constant strings are not absolutely unlikely to contain emojis.
> 
> Not supporting UTF-8 for internal storage may be reasonable, but not 
> supporting UTF-32 for strings that require it seems like a bug.


Everything fits in UTF-16 (or UTF-8 for that matter).  However, it's true that 
many/most emojis don't fit in a *single* 16-bit value and require two UTF-16 
code units (a surrogate pair), or multiple 8-bit UTF-8 code units, to encode them.
Since the NSString APIs assume a 16-bit character width, an emoji will 
generally be treated as two characters as far as they are concerned, but 
that's not really a problem, and current gnustep-base can and does work with 
emojis (for instance, when sending UTF-16 to mobile phones).
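
To make the point about code units concrete, here is a minimal sketch in plain C of how one code point maps to UTF-16 and UTF-8.  The function names utf16_encode and utf8_encode are made up for this illustration; they are not part of gnustep-base or of the proposed ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Writes 1 or 2 code units to out and returns that count,
 * or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        /* Basic Multilingual Plane: a single 16-bit unit is enough. */
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Supplementary planes: split the 20-bit offset into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate */
    return 2;
}

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes 1 to 4 bytes to out and returns that count, or 0 if invalid. */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}
```

For U+1F600 (the grinning face emoji) this yields the two UTF-16 units 0xD83D 0xDE00 and the four UTF-8 bytes 0xF0 0x9F 0x98 0x80, which is why an NSString holding that emoji reports a length of 2 rather than 1.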


_______________________________________________
Gnustep-dev mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/gnustep-dev
