I want to start by saying great work, Ben and Dave. I know you've put a lot of time into this (and humored me in several Apple-internal discussions) and what's here looks like a great overhaul of String, balancing several tricky constraints. I do want to record some comments on specific parts of the proposal that I still have concerns about, but as usual you can of course take these with a grain of salt.

To ease the pain of type mismatches, Substring should be a subtype of String in the same way that Int is a subtype of Optional<Int>. This would give users an implicit conversion from Substring to String, as well as the usual implicit conversions such as [Substring] to [String] that other subtype relationships receive.

I'm concerned about this for two reasons: first, because even with the comparison with Optional boxing, this is basically adding arbitrary conversions to the language, which we took out in part because of their disastrous effect on the performance of the type checker; and second, because this one in particular makes an O(N) copy operation very subtle (though admittedly one you incur all the time today, with no opt-out). A possible mitigation for the first issue would be to restrict the implicit conversion to arguments, like we do for inout-to-pointer conversions. It's still putting implicit conversions back into the language, though.
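
To make the trade-off concrete, here is a minimal sketch of what the explicit conversion looks like, assuming a Swift version where Substring exists as a distinct type (Swift 4 or later). The function name register is made up for illustration; it is not from the proposal.

```swift
let line = "key=value"

// Slicing yields a Substring that shares storage with `line`.
let key = line.prefix(while: { $0 != "=" })

// Hypothetical API that only accepts String.
func register(_ name: String) -> Int {
  return name.count
}

// Without an implicit conversion the copy is spelled out at the call site,
// which is also where the O(N) cost is paid.
let owned = String(key)
let length = register(owned)

// Under the quoted proposal, `register(key)` would compile and copy silently.
// Collections need an explicit map as well.
let parts: [Substring] = line.split(separator: "=")
let strings: [String] = parts.map(String.init)
```

Restricting the conversion to arguments would cover the register(key) case but not the [Substring] to [String] case.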


Therefore, APIs that operate on an NSString/NSRange pair should be imported without the NSRange argument. The Objective-C importer should be changed to give these APIs special treatment so that when a Substring is passed, instead of being converted to a String, the full NSString and range are passed to the Objective-C method, thereby avoiding a copy.

I'm very skeptical about assuming that a method that takes an NSString and an NSRange automatically means to apply that NSRange to the NSString, but fortunately it may not be much of an issue in practice. A quick grep of Foundation and AppKit turned up only 45 methods that took both an NSRange and an NSString *, clustered on a small number of classes; in fewer than half of these cases would the transformation to Substring actually be valid (in the others, the range refers to some data in the receiver rather than the argument). I've attached these results below, if you're interested.

(Note that I left out methods on NSString itself, since those are manually bridged to String, but there aren't actually too many of those either. "Foundation and AppKit" also isn't exhaustive, of course.)

-Foundation.framework/Headers/NSAttributedString.h:- 
(void)replaceCharactersInRange:(NSRange)range withString:(NSString *)str;
-Foundation.framework/Headers/NSAttributedString.h:- (nullable 
id)attribute:(NSString *)attrName atIndex:(NSUInteger)location 
longestEffectiveRange:(nullable NSRangePointer)range 
inRange:(NSRange)rangeLimit;
-Foundation.framework/Headers/NSAttributedString.h:- 
(void)enumerateAttribute:(NSString *)attrName inRange:(NSRange)enumerationRange 
options:(NSAttributedStringEnumerationOptions)opts usingBlock:(void 
(NS_NOESCAPE ^)(id _Nullable value, NSRange range, BOOL *stop))block 
NS_AVAILABLE(10_6, 4_0);
-Foundation.framework/Headers/NSAttributedString.h:- 
(void)addAttribute:(NSString *)name value:(id)value range:(NSRange)range;
-Foundation.framework/Headers/NSAttributedString.h:- 
(void)removeAttribute:(NSString *)name range:(NSRange)range;
+Foundation.framework/Headers/NSFormatter.h:- 
(BOOL)isPartialStringValid:(NSString * _Nonnull * _Nonnull)partialStringPtr 
proposedSelectedRange:(nullable NSRangePointer)proposedSelRangePtr 
originalString:(NSString *)origString 
originalSelectedRange:(NSRange)origSelRange errorDescription:(NSString * 
_Nullable * _Nullable)error;
-Foundation.framework/Headers/NSLinguisticTagger.h:- 
(void)enumerateTagsInRange:(NSRange)range scheme:(NSString *)tagScheme 
options:(NSLinguisticTaggerOptions)opts usingBlock:(void (NS_NOESCAPE 
^)(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop))block 
NS_AVAILABLE(10_7, 5_0);
-Foundation.framework/Headers/NSLinguisticTagger.h:- (NSArray<NSString *> 
*)tagsInRange:(NSRange)range scheme:(NSString *)tagScheme 
options:(NSLinguisticTaggerOptions)opts tokenRanges:(NSArray<NSValue *> * 
_Nullable * _Nullable)tokenRanges NS_AVAILABLE(10_7, 5_0);
-Foundation.framework/Headers/NSLinguisticTagger.h:- (NSArray<NSString *> 
*)linguisticTagsInRange:(NSRange)range scheme:(NSString *)tagScheme 
options:(NSLinguisticTaggerOptions)opts orthography:(nullable NSOrthography 
*)orthography tokenRanges:(NSArray<NSValue *> * _Nullable * 
_Nullable)tokenRanges NS_AVAILABLE(10_7, 5_0);
-Foundation.framework/Headers/NSLinguisticTagger.h:- 
(void)enumerateLinguisticTagsInRange:(NSRange)range scheme:(NSString 
*)tagScheme options:(NSLinguisticTaggerOptions)opts orthography:(nullable 
NSOrthography *)orthography usingBlock:(void (NS_NOESCAPE ^)(NSString *tag, 
NSRange tokenRange, NSRange sentenceRange, BOOL *stop))block NS_AVAILABLE(10_7, 
5_0);
+Foundation.framework/Headers/NSRegularExpression.h:- 
(void)enumerateMatchesInString:(NSString *)string 
options:(NSMatchingOptions)options range:(NSRange)range usingBlock:(void 
(NS_NOESCAPE ^)(NSTextCheckingResult * _Nullable result, NSMatchingFlags flags, 
BOOL *stop))block;
+Foundation.framework/Headers/NSRegularExpression.h:- 
(NSArray<NSTextCheckingResult *> *)matchesInString:(NSString *)string 
options:(NSMatchingOptions)options range:(NSRange)range;
+Foundation.framework/Headers/NSRegularExpression.h:- 
(NSUInteger)numberOfMatchesInString:(NSString *)string 
options:(NSMatchingOptions)options range:(NSRange)range;
+Foundation.framework/Headers/NSRegularExpression.h:- (nullable 
NSTextCheckingResult *)firstMatchInString:(NSString *)string 
options:(NSMatchingOptions)options range:(NSRange)range;
+Foundation.framework/Headers/NSRegularExpression.h:- 
(NSRange)rangeOfFirstMatchInString:(NSString *)string 
options:(NSMatchingOptions)options range:(NSRange)range;
+Foundation.framework/Headers/NSRegularExpression.h:- (NSString 
*)stringByReplacingMatchesInString:(NSString *)string 
options:(NSMatchingOptions)options range:(NSRange)range withTemplate:(NSString 
*)templ;
-Foundation.framework/Headers/NSRegularExpression.h:- 
(NSUInteger)replaceMatchesInString:(NSMutableString *)string 
options:(NSMatchingOptions)options range:(NSRange)range withTemplate:(NSString 
*)templ;
+Foundation.framework/Headers/NSSpellServer.h:- (nullable NSArray<NSString *> 
*)spellServer:(NSSpellServer *)sender 
suggestCompletionsForPartialWordRange:(NSRange)range inString:(NSString 
*)string language:(NSString *)language;
-Foundation.framework/Headers/NSTextCheckingResult.h:+ (NSTextCheckingResult 
*)quoteCheckingResultWithRange:(NSRange)range replacementString:(NSString 
*)replacementString;
-Foundation.framework/Headers/NSTextCheckingResult.h:+ (NSTextCheckingResult 
*)dashCheckingResultWithRange:(NSRange)range replacementString:(NSString 
*)replacementString;
-Foundation.framework/Headers/NSTextCheckingResult.h:+ (NSTextCheckingResult 
*)replacementCheckingResultWithRange:(NSRange)range replacementString:(NSString 
*)replacementString;
-Foundation.framework/Headers/NSTextCheckingResult.h:+ (NSTextCheckingResult 
*)correctionCheckingResultWithRange:(NSRange)range replacementString:(NSString 
*)replacementString;
-Foundation.framework/Headers/NSTextCheckingResult.h:+ (NSTextCheckingResult 
*)correctionCheckingResultWithRange:(NSRange)range replacementString:(NSString 
*)replacementString alternativeStrings:(NSArray<NSString *> 
*)alternativeStrings     NS_AVAILABLE(10_9, 7_0);
-Foundation.framework/Headers/NSTextCheckingResult.h:+ (NSTextCheckingResult 
*)phoneNumberCheckingResultWithRange:(NSRange)range phoneNumber:(NSString 
*)phoneNumber             NS_AVAILABLE(10_7, 4_0);
+AppKit.framework/Headers/NSCandidateListTouchBarItem.h:- 
(void)setCandidates:(NSArray<CandidateType> *)candidates 
forSelectedRange:(NSRange)selectedRange inString:(nullable NSString 
*)originalString;
-AppKit.framework/Headers/NSLayoutManager.h:- 
(void)removeTemporaryAttribute:(NSString *)attrName 
forCharacterRange:(NSRange)charRange;
-AppKit.framework/Headers/NSLayoutManager.h:- 
(void)addTemporaryAttribute:(NSString *)attrName value:(id)value 
forCharacterRange:(NSRange)charRange NS_AVAILABLE_MAC(10_5);
+AppKit.framework/Headers/NSSpeechSynthesizer.h:- 
(void)speechSynthesizer:(NSSpeechSynthesizer *)sender 
willSpeakWord:(NSRange)characterRange ofString:(NSString *)string;
+AppKit.framework/Headers/NSSpellChecker.h:- (NSArray<NSTextCheckingResult *> 
*)checkString:(NSString *)stringToCheck range:(NSRange)range 
types:(NSTextCheckingTypes)checkingTypes options:(nullable 
NSDictionary<NSString *, id> *)options inSpellDocumentWithTag:(NSInteger)tag 
orthography:(NSOrthography * __nullable * __nullable)orthography 
wordCount:(nullable NSInteger *)wordCount NS_AVAILABLE_MAC(10_6);
+AppKit.framework/Headers/NSSpellChecker.h:- 
(NSInteger)requestCheckingOfString:(NSString *)stringToCheck 
range:(NSRange)range types:(NSTextCheckingTypes)checkingTypes options:(nullable 
NSDictionary<NSString *, id> *)options inSpellDocumentWithTag:(NSInteger)tag 
completionHandler:(void (^ __nullable)(NSInteger sequenceNumber, 
NSArray<NSTextCheckingResult *> *results, NSOrthography *orthography, NSInteger 
wordCount))completionHandler NS_AVAILABLE_MAC(10_6);
+AppKit.framework/Headers/NSSpellChecker.h:- 
(NSInteger)requestCandidatesForSelectedRange:(NSRange)selectedRange 
inString:(NSString *)stringToCheck types:(NSTextCheckingTypes)checkingTypes 
options:(nullable NSDictionary<NSString *, id> *)options 
inSpellDocumentWithTag:(NSInteger)tag completionHandler:(void (^ 
__nullable)(NSInteger sequenceNumber, NSArray<NSTextCheckingResult *> 
*candidates))completionHandler NS_AVAILABLE_MAC(10_12_2);
+AppKit.framework/Headers/NSSpellChecker.h:- (nullable NSArray<NSString *> 
*)guessesForWordRange:(NSRange)range inString:(NSString *)string 
language:(nullable NSString *)language inSpellDocumentWithTag:(NSInteger)tag 
NS_AVAILABLE_MAC(10_6);
+AppKit.framework/Headers/NSSpellChecker.h:- (nullable NSString 
*)correctionForWordRange:(NSRange)range inString:(NSString *)string 
language:(NSString *)language inSpellDocumentWithTag:(NSInteger)tag 
NS_AVAILABLE_MAC(10_7);
+AppKit.framework/Headers/NSSpellChecker.h:- (nullable NSArray<NSString *> 
*)completionsForPartialWordRange:(NSRange)range inString:(NSString *)string 
language:(nullable NSString *)language inSpellDocumentWithTag:(NSInteger)tag;
+AppKit.framework/Headers/NSSpellChecker.h:- (nullable NSString 
*)languageForWordRange:(NSRange)range inString:(NSString *)string 
orthography:(nullable NSOrthography *)orthography NS_AVAILABLE_MAC(10_7);
-AppKit.framework/Headers/NSText.h:- 
(void)replaceCharactersInRange:(NSRange)range withString:(NSString *)string;
-AppKit.framework/Headers/NSTextFinder.h:- 
(void)replaceCharactersInRange:(NSRange)range withString:(NSString *)string;
-AppKit.framework/Headers/NSTextStorage.h: - 
(void)replaceCharactersInRange:(NSRange)range withString:(NSString *)str;
-AppKit.framework/Headers/NSTextView.h:- 
(BOOL)shouldChangeTextInRange:(NSRange)affectedCharRange 
replacementString:(nullable NSString *)replacementString;
-AppKit.framework/Headers/NSTextView.h:- (void)insertCompletion:(NSString 
*)word forPartialWordRange:(NSRange)charRange movement:(NSInteger)movement 
isFinal:(BOOL)flag;
-AppKit.framework/Headers/NSTextView.h:- (void)smartInsertForString:(NSString 
*)pasteString replacingRange:(NSRange)charRangeToReplace beforeString:(NSString 
* __nullable * __nullable)beforeString afterString:(NSString * __nullable * 
__nullable)afterString;
-AppKit.framework/Headers/NSTextView.h:- (nullable NSString 
*)smartInsertBeforeStringForString:(NSString *)pasteString 
replacingRange:(NSRange)charRangeToReplace;
-AppKit.framework/Headers/NSTextView.h:- (nullable NSString 
*)smartInsertAfterStringForString:(NSString *)pasteString 
replacingRange:(NSRange)charRangeToReplace;
-AppKit.framework/Headers/NSTextView.h:- (BOOL)textView:(NSTextView *)textView 
shouldChangeTextInRange:(NSRange)affectedCharRange replacementString:(nullable 
NSString *)replacementString;
-AppKit.framework/Headers/NSUserInterfaceItemSearching.h:- 
(BOOL)searchString:(NSString *)searchString inUserInterfaceItemString:(NSString 
*)stringToSearch searchRange:(NSRange)searchRange foundRange:(nullable NSRange 
*)foundRange NS_AVAILABLE_MAC(10_6);
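
As an illustration of the two kinds of method in the list above, here is a small sketch using the Swift spellings of two of them. The sample strings and ranges are made up; only the Foundation calls themselves are real.

```swift
import Foundation

let text = "id=42; id=7"

// Case 1: the range indexes into the string ARGUMENT, so a Substring could
// in principle stand in for the (string, range) pair.
let regex = try! NSRegularExpression(pattern: "[0-9]+")
let tail = NSRange(location: 7, length: 4)  // covers "id=7" within `text`
let matches = regex.matches(in: text, options: [], range: tail)
let found = matches.map { NSString(string: text).substring(with: $0.range) }

// Case 2: the range indexes into the RECEIVER, not the string argument, so
// folding it into a Substring of "Swift" would be wrong.
let attributed = NSMutableAttributedString(string: "Hello, world")
attributed.replaceCharacters(in: NSRange(location: 7, length: 5), with: "Swift")
let replaced = attributed.string
```

Both signatures look the same to an importer that only sees "takes an NSString and an NSRange".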


  associatedtype ExtendedASCII 
    : BidirectionalCollection where Element == UInt32
  var extendedASCII: ExtendedASCII { get }

This isn't a criticism, just a question: why constrain the collection to UInt32 elements? It seems unfortunate that the most common buffer types (UTF-16, UTF-8, or "unparsed bytes") can't just be passed as-is.
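
For reference, a sketch of what callers would have to do under the UInt32 constraint: wrap the existing code units in a lazy widening view. The function allASCIIDigits is a hypothetical consumer, not part of the proposal.

```swift
// Hypothetical consumer with the same constraint as the quoted requirement.
func allASCIIDigits<C: BidirectionalCollection>(_ units: C) -> Bool
  where C.Element == UInt32 {
  // 48...57 are the ASCII codes for "0" through "9".
  return !units.isEmpty && units.allSatisfy { $0 >= 48 && $0 <= 57 }
}

// UTF-8 and UTF-16 views cannot be passed directly because their elements are
// UInt8 and UInt16; each needs a lazy adapter that widens every code unit.
let fromUTF8 = "2017".utf8.lazy.map { UInt32($0) }
let fromUTF16 = "20x7".utf16.lazy.map { UInt32($0) }

let utf8Result = allASCIIDigits(fromUTF8)
let utf16Result = allASCIIDigits(fromUTF16)
```

The adapter is cheap to write, but it hides the underlying buffer from the callee.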


  var unicodeScalars: UnicodeScalars { get }

Typo: this appears twice in the Unicode protocol.


We should represent these aspects as orthogonal, composable components, abstracting pattern matchers into a protocol like this one, that can allow us to define logical operations once, without introducing overloads, and massively reducing API surface area.

I'm still uneasy about the performance of generalized matching operations built on top of Collection. I'm not sure we can reasonably expect the compiler to lower that all down to bulk memory accesses. At least that's only one part of the manifesto, though.
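
The kind of code in question looks roughly like the sketch below: a prefix match written once against Collection. The name matchesPrefix is made up and this is not the manifesto's pattern protocol, just the simplest generic matcher.

```swift
// Element-by-element prefix match over any two collections with the same
// Equatable element type.
func matchesPrefix<C: Collection, P: Collection>(_ haystack: C, _ pattern: P) -> Bool
  where C.Element == P.Element, C.Element: Equatable {
  var i = haystack.startIndex
  for element in pattern {
    // Stop as soon as the haystack runs out or an element differs.
    guard i != haystack.endIndex, haystack[i] == element else { return false }
    haystack.formIndex(after: &i)
  }
  return true
}

let bytes = Array("hello".utf8)
let hit = matchesPrefix(bytes, Array("he".utf8))
let miss = matchesPrefix("hello".utf8, "help".utf8)
```

For contiguous bytes the ideal lowering is a single memory comparison; whether the optimizer gets there from the loop above is the open question.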


clipboard.write(s.endIndex.codeUnitOffset)
let offset = clipboard.read(Int.self)
let i = String.Index(codeUnitOffset: offset)

Sorry, what is 'clipboard'? I think I'm missing something in this section—it's talking about how it's important to have a stable representation for string positions across the different index types, but the code sample doesn't directly connect for me.
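
My best guess at the intent, sketched with APIs that exist in Swift 4 and later rather than the proposed codeUnitOffset: a position is turned into a plain integer, stored somewhere outside the program, and later turned back into an index. The variable storedOffset is a stand-in for whatever 'clipboard' is meant to be.

```swift
let s = "na\u{EF}ve caf\u{E9}"

// Hypothetical stand-in for external storage such as a clipboard or a file.
var storedOffset = 0

// Write: express a position as a UTF-16 code unit offset.
let position = s.firstIndex(of: " ")!
storedOffset = s.utf16.distance(from: s.utf16.startIndex, to: position)

// Read: rebuild an index from the stored integer.
let restored = s.utf16.index(s.utf16.startIndex, offsetBy: storedOffset)
let before = String(s[..<restored])
```

The integer is only meaningful against the same string contents and the same choice of code unit.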


Our support for interoperation with nul-terminated C strings is scattered and incoherent, with 6 ways to transform a C string into a String and four ways to do the inverse. These APIs should be replaced with the following

This proposal doesn't plan to remove the implicit, scoped, argument-only String-to-UnsafePointer<CChar> conversion, does it? (Is that one of the 6?)
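
For clarity, the conversion I mean is the first call in the sketch below, shown next to the explicit closure-based form. The sample string is made up.

```swift
import Foundation

let s = "h\u{E9}llo"

// Implicit, scoped, argument-only conversion: the String is passed where a
// C function expects `const char *`; the pointer is valid only for the call.
let implicitLength = strlen(s)

// Explicit equivalent using withCString.
let explicitLength = s.withCString { strlen($0) }

// The inverse direction: build a String from a nul-terminated C string.
let roundTripped = s.withCString { String(cString: $0) }
```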


To address this need, we can build models of the Unicode protocol that encode representation information into the type, such as NFCNormalizedUTF16String.

Not an urgent thought, but I wonder if these alternate representations really belong in the stdlib, as opposed to some auxiliary library like "SwiftStrings" or "CoreStrings", and if so whether that's still part of the standard Swift distribution or just a plain old SwiftPM package that happens to be maintained by Apple (and maybe comes preincluded with Xcode for now).
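
A rough sketch of what such a type could look like if it lived in a separate library on top of Foundation. Only the type name comes from the manifesto; the stored property, initializer, and use of precomposedStringWithCanonicalMapping are my assumptions.

```swift
import Foundation

// Hypothetical model: the type itself records that the contents are
// NFC-normalized UTF-16, so equality can compare code units directly.
struct NFCNormalizedUTF16String: Equatable {
  let codeUnits: [UInt16]

  init(_ source: String) {
    // Foundation's NFC normalization; normalize once, on construction.
    codeUnits = Array(source.precomposedStringWithCanonicalMapping.utf16)
  }
}

let decomposed = NFCNormalizedUTF16String("e\u{301}")  // "e" + combining acute
let composed = NFCNormalizedUTF16String("\u{E9}")      // precomposed "é"
let sameContents = decomposed == composed
```

Nothing here needs compiler support, which is part of why I wonder whether it has to be in the stdlib.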


That's all I have. Again, great work, and godspeed.

Jordan

(unfortunately I am not in charge of implementing any of the features you need for this, at least as far as I know)



On Jan 19, 2017, at 18:56, Ben Cohen via swift-evolution <[email protected]> wrote:

Hi all,

Below is our take on a design manifesto for Strings in Swift 4 and beyond.

Probably best read in rendered markdown on GitHub:
https://github.com/apple/swift/blob/master/docs/StringManifesto.md

We’re eager to hear everyone’s thoughts.

Regards,
Ben and Dave
