> However, in practical terms, the indexable string elements are components, 
> not codepoints.
> 
> It seems to me the single hardest thing to come to grips with when newly 
> approaching NSString is understanding that 'unichar's (and "characters" in 
> the sense of [characterAtIndex:]) *aren't* codepoints. In fact, AFAICT the 
> only way to *represent* a codepoint in NSString is indirectly, as a 
> single-Unicode-character string where you happen to know from general Unicode 
> knowledge that the character is represented uniquely by a single codepoint.
> 
> Unfortunately, I believe, most people newly arrived at NSString will assume 
> that 'unichar'/[characterAtIndex:] is a Unicode codepoint***, and have no 
> reason to study the documentation carefully enough to see that this is a 
> false assumption.
> 
> 
> 
> 
> *** I think that's what they'd assume if they know a fair bit about Unicode. 
> If they know less than that, they'll likely assume 'unichar' is a Unicode 
> character, which is even further from the 

I believe this stems from a period in history when the Unicode Consortium 
believed it would be able to fit all practical scripts into 65,536 code 
points. That meant you could get away with all kinds of assumptions, like 
16-bit character types and UCS-2. 

As it became clear that wasn't going to be enough code points, the additional 
planes were defined and UCS-2 fell out of favor, replaced by UTF-16, which 
can represent the higher planes using surrogate pairs. 

Both Java's String and Objective-C's NSString have these sorts of API speed 
bumps because, I think, they were originally created in the UCS-2 era, when a 
16-bit value was effectively a code point (and usually a character) and the 
mapping was simple. UTF-16 was retrofitted over the existing API. 
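
To make the speed bump concrete, here is a minimal sketch on the Java side 
(the class and method names are mine, purely for illustration). It uses only 
standard java.lang.String and Character calls to show that length() and 
charAt() work in 16-bit UTF-16 code units, while the code point methods were 
added alongside them later. 

```java
// Sketch: UTF-16 code units vs. Unicode code points in java.lang.String.
// CodeUnitsVsCodePoints and its helper names are hypothetical.
final class CodeUnitsVsCodePoints {
    // Number of 16-bit UTF-16 code units; this is what length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a valid surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP, so UTF-16
        // stores it as the surrogate pair D834 DD1E.
        String clef = "\uD834\uDD1E";
        System.out.println(codeUnits(clef));                           // 2
        System.out.println(codePoints(clef));                          // 1
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
        System.out.println(Integer.toHexString(clef.codePointAt(0)));  // 1d11e
    }
}
```

NSString's length and characterAtIndex: behave like length() and charAt() 
here: they count and return UTF-16 code units, not code points. 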

I actually built a category on NSString that gives it methods that return 
UTF-32 characters by handling surrogate pairs. 
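
This is not the category itself, just a rough sketch of the surrogate-pair 
arithmetic such methods need, written in Java so it is easy to run. The class 
and method names are hypothetical, and passing unpaired surrogates through 
unchanged is my assumption, not necessarily what the category does. 

```java
import java.util.Arrays;

// Sketch: decode UTF-16 code units into UTF-32 code points by hand.
final class Utf32Decoder {
    // Returns the code point that starts at UTF-16 index i.
    static int codePointAtIndex(CharSequence s, int i) {
        char hi = s.charAt(i);
        if (hi >= 0xD800 && hi <= 0xDBFF && i + 1 < s.length()) {
            char lo = s.charAt(i + 1);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                // Combine the 10 payload bits of each half, then add the
                // 0x10000 offset of the first supplementary plane.
                return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
            }
        }
        // BMP character, or an unpaired surrogate passed through as-is
        // (assumption: the caller decides how to treat those).
        return hi;
    }

    // Converts the whole sequence to an array of UTF-32 values.
    static int[] toUtf32(CharSequence s) {
        int[] out = new int[s.length()]; // upper bound on the result size
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = codePointAtIndex(s, i);
            out[n++] = cp;
            // Only a decoded pair yields a value >= 0x10000, so this
            // advances by exactly the number of code units consumed.
            i += (cp >= 0x10000) ? 2 : 1;
        }
        return Arrays.copyOf(out, n);
    }
}
```

For example, Utf32Decoder.toUtf32("a\uD834\uDD1Eb") yields the three values 
0x61, 0x1D11E and 0x62, even though the string holds four 16-bit units. 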
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)