On Apr 6, 2015, at 09:19, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
>
> Where is my bicycle gone? What am I doing wrong?

Before this thread heads further into outer space…

I suspect it [NSCharacterSet] is just broken. Look here, for example:

<http://stackoverflow.com/questions/23000812/creating-nscharacterset-with-unicode-smp-entries-testing-membership-is-this>

The problem is that it’s unclear whether the “characters” in NSCharacterSet are internally UTF-16 code units, UTF-32 code units, Unicode code points, or something else.

According to the NSCharacterSet documentation:

> "An NSCharacterSet object represents a set of Unicode-compliant characters."

and:

> "The NSCharacterSet class declares the programmatic interface for an object
> that manages a set of Unicode characters (see the NSString class cluster
> specification for information on Unicode)."

According to the NSString documentation:

> "A string object presents itself as an array of Unicode characters (Unicode
> is a registered trademark of Unicode, Inc.). You can determine how many
> characters a string object contains with the length method and can retrieve a
> specific character with the characterAtIndex: method."

Working backwards, we know that the characters counted by ‘-[NSString length]’ are UTF-16 code units, so this all *possibly* implies that NSCharacterSet characters are UTF-16 code units, too.

Plus, back in the NSCharacterSet documentation:

> "NSCharacterSet’s principal primitive method, characterIsMember:, provides
> the basis for all other instance methods in its interface."

If that’s true, ‘longCharacterIsMember:’ is pretty much screwed: ‘characterIsMember:’ takes a 16-bit unichar, so it cannot even be asked about a code point above U+FFFF.

Perhaps the NSCharacterSet documentation is just wrong.
Or perhaps, when the API was enhanced in 10.2 (see: <http://www.cocoabuilder.com/archive/cocoa/73297-working-with-32-bit-unicode-nsstring-stringwithutf32string-const-utf32char-bytes-needed.html>, for some tantalizing hints about NSCharacterSet), the implementation was a hack that works somehow but isn’t documented.

I don’t think you’re going to get any definitive answer except directly from Apple.

A suggestion, though: Try building your character set using ‘characterSetWithRange:’ and/or the NSMutableCharacterSet methods that add ranges, instead of using NSStrings. Maybe NSCharacterSet really is UTF-32-based, but not (for code compatibility reasons) when using NSStrings explicitly.

_______________________________________________
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
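
For what it’s worth, the code-unit arithmetic behind the worry above can be checked without Foundation at all. The following is a minimal sketch in plain Python, used only because it is easy to run; it makes no NSCharacterSet calls and says nothing about how Apple actually implements the class. The helper names (utf16_code_units, fits_in_unichar) are made up for this illustration, and U+1F6B2 BICYCLE is just an arbitrary example of a code point above U+FFFF.

```python
# Illustration only: shows how many UTF-16 code units a code point needs.
# Nothing here touches Foundation; the helper names are hypothetical.

def utf16_code_units(text):
    """Return the UTF-16 code units of `text` as a list of integers."""
    raw = text.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]

def fits_in_unichar(code_point):
    """True if the code point fits in one 16-bit value (a unichar)."""
    return code_point <= 0xFFFF

bmp_char = "\u00e9"       # U+00E9, inside the Basic Multilingual Plane
smp_char = "\U0001F6B2"   # U+1F6B2 BICYCLE, in a supplementary plane

for ch in (bmp_char, smp_char):
    units = utf16_code_units(ch)
    print("U+%04X: %d UTF-16 code unit(s) %s, fits in unichar: %s" % (
        ord(ch), len(units), ["0x%04X" % u for u in units],
        fits_in_unichar(ord(ch))))
```

The second character comes out as the surrogate pair 0xD83D 0xDEB2. It therefore counts as 2 in a UTF-16-based length, and no single 16-bit argument can name it, which is why a membership test defined purely in terms of a 16-bit primitive could not cover it.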