Thank you for bringing this up! Swift String has a lot of expressivity gaps 
we’re trying to tackle in Swift 5. I think that both language and library 
support for flexible matching and transformation is needed, likely through a 
regex-like construct. Libraries and prototypes like this help drive the 
discussion. Do you have a package that implements this?

Ben’s email highlights a lot of the design considerations that need to be 
thought through. We’ll likely be iterating on designs on swift-evolution for 
the various components, but we also need a high-level vision for this. I’m 
working on a new version of the String Manifesto that incorporates lessons 
learned from the implementation of Swift 4 while fleshing out a vision for 
Swift 5 and beyond.

Part of that vision includes a declarative way for programmers to express 
string processing and transformation cleanly in Swift. Perl 6 is the gold 
standard for string processing in programming languages. Any insights from 
Perl 6’s design[1] are likely to be beneficial for Swift. I don’t think we 
should just cargo-cult its constructs verbatim, but our eventual design is 
likely to be influenced by it. Swift’s preference for clarity over terseness 
means that these concepts might look different on the surface, and Swift’s type 
system is very different from Perl 6’s, but we should certainly steal the best 
ideas. Good languages copy; great languages steal!

This design discussion will likely unfold over the coming months. In the 
meantime, I’m interested in playing with your approach and the kinds of problems it 
solves cleanly and effectively. Could you link to an SPM package implementing 
this? Do you have example code using it?


> On Aug 10, 2017, at 2:58 AM, Ben Cohen via swift-evolution wrote:
> Hi Joshua,
> Thanks for bringing this topic up. It may help to outline why regular 
> expressions were deferred until Swift 5. The work to create a regular 
> expression type itself, and even to add a regex literal protocol, is fairly 
> straightforward and probably could have been done in the Swift 4 timeframe 
> (maybe by making NSRegularExpression a value type and refreshing its API), 
> but to make Swift-native regular expressions really great there are other 
> design aspects that we need to explore first, many of which have compiler 
> impact or would need to factor in the underlying representation of String 
> in order to be efficient.
> Right now, the top priority for Swift 5 is ABI stability, and String plays a 
> fairly large part in that. We need to finalize the size of String (currently 
> 3 words, but 2 is the likely final size), implement the small string 
> optimization, and decide which parts of String need to be fragile/inlineable 
> and which need to be resilient.
> Since ABI stability takes priority, this gives us time in the meantime to 
> consider the broader design questions of what Swift-native regular 
> expressions would look like. These design considerations probably need to 
> come ahead of designing an API for specific types like a regex, matches etc.
> Some examples of these kinds of questions include:
> What syntax changes to the usual form of regexes should be considered? For 
> example, what should “.” mean in regular expressions? It would be out of 
> keeping for it to mean a code unit, when applied to Swift.String. 
> Alternatively, should regular expressions work on all the views? In which 
> case, “.” could mean a grapheme when applied to String, but a code unit when 
> applied to the UTF16 view.
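To make the per-view question concrete, here is a small sketch using only today's String views (no regex support assumed). It shows why the element matched by “.” would differ depending on which view a pattern is applied to: the same string has a different length in each view.

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT forms a single grapheme
// made of two Unicode scalars.
let cafe = "cafe\u{301}"

let graphemes = cafe.count                 // 4: Characters (graphemes)
let scalars = cafe.unicodeScalars.count    // 5: Unicode scalars
let utf16Units = cafe.utf16.count          // 5: UTF-16 code units
let utf8Units = cafe.utf8.count            // 6: UTF-8 code units

print(graphemes, scalars, utf16Units, utf8Units)
```

Under a grapheme interpretation, a pattern like `caf.` would consume the whole string; under a code-unit interpretation on the UTF-16 view it would stop before the combining accent.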
> How can let bindings work on capture groups i.e. rather than having named 
> capture groups like in Perl, can we bind directly to variables in switches 
> etc? Could types conform to a RegularExpressionCapturable that would consume 
> part of the string and initialize self, so that you could capture not just 
> Substring but any type? You can’t express this in the language today, and 
> would need compiler integration. This integration could start more hard-coded 
> in order to deliver value in the Swift 5 release, but hopefully be 
> generalizable in later releases.
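As a rough illustration of the capture idea, here is a sketch of what such a protocol could look like as plain library code. The protocol name comes from the message above; the `init?(consuming:)` requirement and the `Int` conformance are assumptions made for this example, and none of the compiler integration (binding captures in a `switch`) is modeled.

```swift
// Hypothetical protocol: the single requirement is an assumption for
// illustration, not a proposed design.
protocol RegularExpressionCapturable {
    /// Tries to build a value from the front of `input`,
    /// removing the characters it used on success.
    init?(consuming input: inout Substring)
}

extension Int: RegularExpressionCapturable {
    init?(consuming input: inout Substring) {
        // Take the longest run of ASCII digits at the front of the input.
        let digits = input.prefix(while: { $0 >= "0" && $0 <= "9" })
        // Fails (leaving `input` untouched) if there were no usable digits.
        guard let value = Int(digits) else { return nil }
        input = input.dropFirst(digits.count)
        self = value
    }
}

var rest: Substring = "42 apples"
let quantity = Int(consuming: &rest)
print(quantity as Any, rest)
```

A capture group could then, in principle, hand the remaining input to any conforming type rather than always producing a Substring.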
> What other std lib APIs should be changed once we have regular expressions? 
> For example, split ought to work with regexes, e.g. let words = 
> sentence.split(separator: /\W+/). How can this generalize to Collections 
> where possible? E.g. [1,2,3,4].index(of: [2,3]) ought to work just as 
> "abcd".index(of: /bc/) should.
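The Collection half of that generalization can be sketched without any regex support. The helper name `firstIndex(ofRun:)` is hypothetical and chosen only to avoid clashing with existing standard library methods; the scan is deliberately naive rather than an efficient search algorithm.

```swift
extension Collection where Element: Equatable {
    /// Hypothetical helper: the index at which `pattern` first occurs as a
    /// contiguous run of elements, or nil if it never occurs.
    func firstIndex<C: Collection>(ofRun pattern: C) -> Index?
        where C.Element == Element
    {
        var start = startIndex
        while true {
            // Check whether the remainder of the collection begins with the pattern.
            if self[start...].starts(with: pattern) { return start }
            if start == endIndex { return nil }
            formIndex(after: &start)
        }
    }
}

let numbers = [1, 2, 3, 4]
let arrayHit = numbers.firstIndex(ofRun: [2, 3])

let letters = "abcd"
let stringHit = letters.firstIndex(ofRun: "bc")

print(arrayHit as Any, stringHit != nil)
```

The same generic code serves both Array and String, which is the property a regex-aware `index(of:)` would ideally preserve.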
> On Aug 10, 2017, at 7:24 AM, Joshua Alvarado via swift-evolution wrote:
>> Hey everyone,
>> I would like to pitch an implementation of Regex in Swift and gather all of 
>> your thoughts.
>> Motivation:
>> In the String Manifesto for Swift 4, addressing regular expressions was not 
>> in scope. Swift 5 would be a more fitting version to address the 
>> implementation of Regex in Swift. NSRegularExpression is a suitable solution 
>> for pattern matching, but its API is ill-suited to the future direction of 
>> Swift.
>> Implementation:
>> The regular expression API will be implemented by a Regex struct, 
>> a regular expression that you can apply to Unicode strings. The 
>> Regex struct will conform to RegexProtocol, a protocol for types that can 
>> represent a regular expression. ExpressibleByRegexLiteral will be used to 
>> initialize a Regex from a literal, giving an easy-to-use syntax, and a Match 
>> struct will represent each match found with a Regex.
>> Draft of implementation:
>> protocol ExpressibleByRegexLiteral {
>>     associatedtype RegexLiteralType
>>     init(regexLiteral value: Self.RegexLiteralType)
>> }
>> // Structure of information about a match of regex on a string
>> struct Match {
>>     var regex: Regex
>>     var start: String.Index
>>     var end: String.Index
>> }
>> protocol RegexProtocol {
>>     init(pattern: String) throws
>>     // String representation of the pattern
>>     var pattern: String { get }
>>     // Checks whether a match is found at all in the string
>>     func search(string: String) -> Bool
>>     // Returns an array of all the matches
>>     func match(string: String) -> [Match]
>>     // Enumerates over the matches
>>     func match(string: String, using: ((Match) -> Void))
>> }
>> struct Regex: RegexProtocol {
>>     init(pattern: String, options: [Regex.Options])
>>     let options: [Regex.Options]
>>     static let word: Regex // \w
>>     // other useful regexes can be added as well
>> }
>> // Examples
>> let regex = \[a-zA-Z]+\
>> let matches = regex.match(string: "Matching words in text.")
>> for match in matches {
>>     print("Found a match in string at \(match.start) to \(match.end)")
>> }
>> let helloStr = "Hello world"
>> Regex.word.match(string: helloStr) { match in
>>     print("Matched \(helloStr[match.start..<match.end])")
>> }
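For anyone who wants to try the shape of this API today, here is a minimal runnable sketch. It assumes NSRegularExpression from Foundation as the backing engine, which is a choice made for this example and not part of the pitch. The type is named `SketchRegex` (hypothetical) so it cannot be confused with the pitched Regex, there is no literal syntax, and `Match` omits the `regex` field.

```swift
import Foundation

// Mirrors the pitched Match struct, minus the back-reference to the regex.
struct Match {
    let start: String.Index
    let end: String.Index
}

// Hypothetical stand-in for the pitched Regex, backed by NSRegularExpression.
struct SketchRegex {
    let pattern: String
    private let backing: NSRegularExpression

    init(pattern: String) throws {
        self.pattern = pattern
        self.backing = try NSRegularExpression(pattern: pattern, options: [])
    }

    // True if the pattern matches anywhere in the string.
    func search(string: String) -> Bool {
        let whole = NSRange(string.startIndex..., in: string)
        return backing.firstMatch(in: string, options: [], range: whole) != nil
    }

    // All matches, converted from NSRange to String.Index bounds.
    func match(string: String) -> [Match] {
        let whole = NSRange(string.startIndex..., in: string)
        let results = backing.matches(in: string, options: [], range: whole)
        return results.compactMap { result -> Match? in
            guard let range = Range(result.range, in: string) else { return nil }
            return Match(start: range.lowerBound, end: range.upperBound)
        }
    }
}

let wordRegex = try SketchRegex(pattern: "[a-zA-Z]+")
let sample = "Matching words in text."
let found = wordRegex.match(string: sample).map { String(sample[$0.start..<$0.end]) }
print(found)  // ["Matching", "words", "in", "text"]
```

Because each match carries String.Index bounds, callers can slice the original string directly, as the pitched examples do.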
>> Of course, this is only a scratch implementation, meant to open 
>> discussion on the topic. I feel the Regex struct itself will need more 
>> methods and properties, such as for flags and the number of groups. Please provide 
>> feedback with improvements to the code, concerns on the topic, or just open 
>> up discussion. Thank you!
>> Joshua Alvarado
>> _______________________________________________
>> swift-evolution mailing list