CharacterSet is a Foundation value type. It was a subject of the following proposal:
https://github.com/apple/swift-evolution/blob/master/proposals/0069-swift-mutability-for-foundation.md

We might be able to improve on the implementation, but I don't think
re-arguing the name is an option.

On Wed, Sep 28, 2016 at 11:59 PM Jay Abbott <[email protected]> wrote:
>
> Yes - this is totally confusing. CharacterSet and Set<Character> are
> completely different things with different semantics.
>
> I don't know the history, but is CharacterSet simply to have a Swift
> equivalent of NSCharacterSet? That seems to be what it is, but since Swift
> redefined characters in a better way, this should be removed or called
> something else to avoid confusion. You shouldn't have to qualify what you
> mean by 'character' in a type name because it diverges from the definition
> in the rest of the language.
>
> On Thu, 29 Sep 2016 at 04:48 Xiaodi Wu via swift-evolution <
> [email protected]> wrote:
>
>> On Wed, Sep 28, 2016 at 10:34 PM, Xiaodi Wu <[email protected]> wrote:
>>
>>> On Wed, Sep 28, 2016 at 10:23 PM, Charles Srstka via swift-evolution <
>>> [email protected]> wrote:
>>>
>>>> On Sep 28, 2016, at 9:57 PM, Erica Sadun via swift-evolution <
>>>> [email protected]> wrote:
>>>>
>>>>
>>>> D'erp. I missed that. And that's an unambiguous answer.
>>>>
>>>> So let me move on to part B of the pitch: I think CharacterSets are
>>>> broken.
>>>>
>>>> Xiaodi Wu: "isn't the problem you're presenting really an argument that
>>>> the type should be fleshed out to handle characters (grapheme clusters)
>>>> containing more than one Unicode scalar?"
>>>>
>>>>
>>>> It seems that it already does handle such characters:
>>>>
>>>> (done in Objective-C so we can log the length of the range as a count
>>>> of UTF-16 code units)
>>>>
>>>> #import <Foundation/Foundation.h>
>>>>
>>>> int main(int argc, char *argv[]) {
>>>>     @autoreleasepool {
>>>>         NSCharacterSet *bikeSet = [NSCharacterSet characterSetWithCharactersInString:@"🚲"];
>>>>         NSString *str = @"foo🚲bar";
>>>>
>>>>         NSRange range = [str rangeOfCharacterFromSet:bikeSet];
>>>>
>>>>         NSLog(@"location: %lu length: %lu", range.location, range.length);
>>>>     }
>>>> }
>>>>
>>>> - - - - - - -
>>>>
>>>> *2016-09-28 22:20:00.622471 test[15577:2433912] location: 3 length: 2*
>>>> *Program ended with exit code: 0*
>>>>
>>>> - - - - - - -
>>>>
>>>> As we can see, the character from the set is recognized as consisting
>>>> of two code units. There are a few bugs in the system, though. See the
>>>> cocoa-dev thread “Where is my bicycle?” from about a year ago:
>>>> http://prod.lists.apple.com/archives/cocoa-dev/2015/Apr/msg00074.html
>>>>
>>>
>>> The bike emoji might be two code units, but it is one Unicode scalar
>>> (U+1F6B2). However, the Canadian flag emoji, for instance, is two Unicode
>>> scalars (U+1F1E8 U+1F1E6) but nonetheless one character.
>>>
>>
>> To illustrate in code how CharacterSet doesn't actually handle characters
>> made up of multiple Unicode scalars:
>>
>> ```
>> import Foundation
>>
>> let str1 = "🇦🇩"
>> let first = CharacterSet(charactersIn: str1) // this actually crashes corelibs-foundation
>> let str2 = "🇦🇺"
>> let second = CharacterSet(charactersIn: str2)
>> let intersection = first.intersection(second)
>> print(intersection.isEmpty)
>> // actual output: false
>> // obviously, if we were really dealing with characters, the intersection should be empty
>> ```
>>
>>
>>> Charles
>>>>
>>>>
>>>> _______________________________________________
>>>> swift-evolution mailing list
>>>> [email protected]
>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>
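
For reference, the code unit / Unicode scalar / character distinction
discussed in the thread can be reproduced with the standard library alone,
without touching Foundation's CharacterSet. This is only a minimal sketch,
assuming Swift 4 or later (where String is a collection of Character). The
variable names are mine and purely illustrative; the emoji are written as
scalar escapes so the code points are visible.

```swift
// Sketch only: standard-library views of the emoji discussed in the thread.
let bike = "\u{1F6B2}"                // 🚲: a single Unicode scalar
let canada = "\u{1F1E8}\u{1F1E6}"     // 🇨🇦: regional indicators C + A

// UTF-16 code units, Unicode scalars, Characters (grapheme clusters).
print(bike.utf16.count, bike.unicodeScalars.count, bike.count)        // 2 1 1
print(canada.utf16.count, canada.unicodeScalars.count, canada.count)  // 4 2 1

let andorra = "\u{1F1E6}\u{1F1E9}"    // 🇦🇩: regional indicators A + D
let australia = "\u{1F1E6}\u{1F1FA}"  // 🇦🇺: regional indicators A + U

// Scalar-level sets overlap, because both flags contain U+1F1E6.
let scalarOverlap = Set(andorra.unicodeScalars)
    .intersection(australia.unicodeScalars)

// Character-level sets do not overlap: each flag is one distinct Character.
let characterOverlap = Set(andorra).intersection(australia)

print(scalarOverlap.isEmpty)     // false
print(characterOverlap.isEmpty)  // true
```

The scalar-level result is the same kind of overlap the quoted CharacterSet
example reports, while Set<Character> gives the empty intersection one would
expect if the type really dealt in characters.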
