On Wed, Sep 28, 2016 at 10:23 PM, Charles Srstka via swift-evolution <[email protected]> wrote:
> On Sep 28, 2016, at 9:57 PM, Erica Sadun via swift-evolution <[email protected]> wrote:
>
> D'erp. I missed that. And that's an unambiguous answer.
>
> So let me move on to part B of the pitch: I think CharacterSets are broken.
>
> Xiaodi Wu: "isn't the problem you're presenting really an argument that
> the type should be fleshed out to handle characters (grapheme clusters)
> containing more than one Unicode scalar?"
>
> It seems that it already does handle such characters:
>
> (done in Objective-C so we can log the length of the range as a count of
> UTF-16 code units)
>
>     #import <Foundation/Foundation.h>
>
>     int main(int argc, char *argv[]) {
>         @autoreleasepool {
>             NSCharacterSet *bikeSet = [NSCharacterSet characterSetWithCharactersInString:@"🚲"];
>             NSString *str = @"foo🚲bar";
>
>             NSRange range = [str rangeOfCharacterFromSet:bikeSet];
>
>             NSLog(@"location: %lu length: %lu", range.location, range.length);
>         }
>     }
>
> - - - - - - -
>
> *2016-09-28 22:20:00.622471 test[15577:2433912] location: 3 length: 2*
> *Program ended with exit code: 0*
>
> - - - - - - -
>
> As we can see, the character from the set is recognized as consisting of
> two code units. There are a few bugs in the system, though. See the
> cocoa-dev thread “Where is my bicycle?” from about a year ago:
> http://prod.lists.apple.com/archives/cocoa-dev/2015/Apr/msg00074.html

The bike emoji might be two code units, but it is one Unicode scalar (U+1F6B2). However, the Canadian flag emoji, for instance, is two Unicode scalars (U+1F1E8 U+1F1E6) but nonetheless one character.

> Charles
>
> _______________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution
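To make the three levels discussed above concrete (UTF-16 code units, Unicode scalars, and characters), here is a minimal Swift sketch. It assumes a Swift 4 or later toolchain, where `String` has a `count` property for characters (in Swift 3 that would be `characters.count`); the helper name `describe` is made up for illustration and is not part of any library.

```swift
// Hypothetical helper (not a library API): counts the UTF-16 code units,
// Unicode scalars, and characters (grapheme clusters) in a string.
func describe(_ s: String) -> (codeUnits: Int, scalars: Int, characters: Int) {
    return (s.utf16.count, s.unicodeScalars.count, s.count)
}

let bike = "\u{1F6B2}"            // bicycle emoji, a single scalar
let flag = "\u{1F1E8}\u{1F1E6}"   // Canadian flag: regional indicators C and A

let bikeCounts = describe(bike)   // 2 code units, 1 scalar, 1 character
let flagCounts = describe(flag)   // 4 code units, 2 scalars, 1 character

print("bike:", bikeCounts.codeUnits, bikeCounts.scalars, bikeCounts.characters)
print("flag:", flagCounts.codeUnits, flagCounts.scalars, flagCounts.characters)
```

The flag is the interesting case: a set keyed on individual scalars cannot represent it as one member, even though a reader sees a single character.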
