> On 7 Apr 2015, at 03:42, Quincey Morris <quinceymor...@rivergatesoftware.com> 
> wrote:
> 
> On Apr 6, 2015, at 12:29 , Greg Parker <gpar...@apple.com> wrote:
>> 
>> my understanding is that when Cocoa says "character" it usually means 
>> "UTF-16 code unit". @"🚲".length == 2, for example. Cocoa's string API was 
>> designed when Unicode was still a true 16-bit character set.
> 
> I would have said so, too, except that NSCharacterSet has this 
> ‘longCharacterIsMember: (UTF32Char)’ API, which seems inexplicable if the 
> parameter is a UTF-16 code unit, since that’s what ‘characterIsMember: 
> (unichar)’ is apparently for.

Well, it is really quite simple:
When NSString (and others) says "character", it means a UTF-16 code unit, 
i.e. a unichar (an unsigned short).
But "long character" (UTF32Char) means a Unicode code point.

Both definitions were the same in Unicode 1.0 (up to about 1996), when Unicode 
was 16 bits only. Starting with 2.0, the code space grew to 21 bits.
They are still the same for code points below 0x10000, i.e. Plane 0, the Basic 
Multilingual Plane.
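
To make the difference concrete, here is a minimal sketch in plain C of how one 
code point maps to one or two UTF-16 code units. The function name utf16_encode 
is my own invention for illustration and is not a Cocoa API; it only mirrors 
the standard surrogate-pair arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Cocoa): encodes one Unicode code point,
   the kind of value longCharacterIsMember: takes as UTF32Char, into UTF-16
   code units, the kind of value NSString counts as "characters" (unichar).
   Returns the number of code units written to out (1 or 2), or 0 if the
   input is a surrogate or lies beyond U+10FFFF. */
size_t utf16_encode(uint32_t code_point, uint16_t out[2])
{
    if (code_point > 0x10FFFF) {
        return 0;                      /* outside the Unicode code space */
    }
    if (code_point >= 0xD800 && code_point <= 0xDFFF) {
        return 0;                      /* surrogates are not characters */
    }
    if (code_point < 0x10000) {
        out[0] = (uint16_t)code_point; /* BMP: code unit == code point */
        return 1;
    }
    /* Supplementary planes: split the 20-bit offset into two 10-bit halves. */
    uint32_t offset = code_point - 0x10000;
    out[0] = (uint16_t)(0xD800 + (offset >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (offset & 0x3FF));  /* low surrogate */
    return 2;
}
```

For the bicycle, U+1F6B2, this yields the two code units 0xD83D 0xDEB2, which 
is why @"🚲".length is 2. Anything in the BMP, e.g. U+00E9, comes back as a 
single code unit equal to the code point itself.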

Kind regards,

Gerriet.

