> On Apr 28, 2016, at 2:56 PM, Brent Royal-Gordon via swift-evolution
> <[email protected]> wrote:
>
>> Awesome. Some specific suggestions below, but feel free to iterate in a
>> pull request if you prefer that.
>
> I've adopted these suggestions in some form, though I also ended up rewriting
> the explanation of why the feature was designed as it is and fusing it with
> material from "Alternatives considered".
>
> (Still not sure who I should list as a co-author. I'm currently thinking
> John, Tyler, and maybe Chris? Who's supposed to go there?)
I haven’t contributed much beyond the initial suggestions. That said, I have never been an author on Swift Evolution, and it would really make my day (if not my year, given that Swift is at least in my top 5 favorite things). :)

> Multiline string literals
>
> Proposal: SE-NNNN
> <https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>
> Author(s): Brent Royal-Gordon <https://github.com/brentdax>
> Status: Second Draft
> Review manager: TBD
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#introduction> Introduction
>
> In Swift 2.2, the only means to insert a newline into a string literal is the
> \n escape. String literals specified in this way are generally ugly and
> unreadable. We propose a multiline string feature inspired by English
> punctuation which is a straightforward extension of our existing string
> literals.
>
> This proposal is one step in a larger plan to improve how string literals
> address various challenging use cases. It is not meant to solve all problems
> with escaping, nor to serve all use cases involving very long string
> literals. See the "Future directions for string literals in general" section
> for a sketch of the problems we ultimately want to address and some ideas of
> how we might do so.
>
> Swift-evolution threads: multi-line string literals. (April)
> <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160418/015500.html>,
> multi-line string literals (December)
> <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20151214/002349.html>
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#draft-notes> Draft Notes
>
> Removes the comment feature, which was felt to be an unnecessary
> complication. This and the backslash feature have been listed as future
> directions.
>
> Loosens the specification of diagnostics, suggesting fix-its instead of
> requiring them.
>
> Splits a "Rationale" section out of the "Proposed solution" section.
>
> Adds extensive discussion of other features which would combine with this one.
>
> I've listed only myself as an author because I don't want to put anyone
> else's name to a document they haven't seen, but there are others who deserve
> to be listed (John Holdsworth at least). Let me know if you think you should
> be included.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#motivation> Motivation
>
> As Swift begins to move into roles beyond app development, code which needs
> to generate text becomes a more important use case. Consider, for instance,
> generating even a small XML string:
>
>     let xml = "<?xml version=\"1.0\"?>\n<catalog>\n\t<book id=\"bk101\" empty=\"\">\n\t\t<author>\(author)</author>\n\t</book>\n</catalog>"
>
> The string is practically unreadable, its structure drowned in escapes and
> run-together lines; it looks like little more than line noise. We can improve
> its readability somewhat by concatenating separate strings for each line and
> using real tabs instead of \t escapes:
>
>     let xml = "<?xml version=\"1.0\"?>\n" +
>               "<catalog>\n" +
>               "    <book id=\"bk101\" empty=\"\">\n" +
>               "        <author>\(author)</author>\n" +
>               "    </book>\n" +
>               "</catalog>"
>
> However, this creates a more complex expression for the type checker, and
> there's still far more punctuation than ought to be necessary. If the most
> important goal of Swift is making code readable, this kind of code falls far
> short of that goal.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#proposed-solution> Proposed solution
>
> We propose that, when Swift is parsing a string literal, if it reaches the
> end of the line without encountering an end quote, it should look at the next
> line. If it sees a quote at the beginning (a "continuation quote"), the
> string literal contains a newline and then continues on that line. Otherwise,
> the string literal is unterminated and syntactically invalid.
>
> Our sample above could thus be written as:
>
>     let xml = "<?xml version=\"1.0\"?>
>               "<catalog>
>               "    <book id=\"bk101\" empty=\"\">
>               "        <author>\(author)</author>
>               "    </book>
>               "</catalog>"
>
> If the second or subsequent lines had not begun with a quotation mark, or the
> trailing quotation mark after the </catalog> tag had not been included, Swift
> would have emitted an error.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#rationale> Rationale
>
> This design is rather unusual, and it's worth pausing a moment to explain why
> it has been chosen.
>
> The traditional design for this feature, seen in languages like Perl and
> Python, simply places one delimiter at the beginning of the literal and
> another at the end. Individual lines in the literal are not marked in any
> way.
>
> We think continuation quotes offer several important advantages over the
> traditional design:
>
> They help the compiler pinpoint errors in string literal delimiting.
> Traditional multiline strings have a serious weakness: if you forget the
> closing quote, the compiler has no idea where you wanted the literal to end.
> It simply continues on until the compiler encounters another quote (or the
> end of the file). If you're lucky, the text after that quote is not valid
> code, and the resulting error will at least point you to the next string
> literal in the file. If you're unlucky, you'll get a seemingly unrelated
> error several literals later, an unbalanced brace error at the end of the
> file, or perhaps even code that compiles but does something totally wrong.
>
> (This is not a minor concern. Many popular languages, including C and Swift
> 2, specifically reject newlines in string literals to prevent this from
> happening.)
>
> Continuation quotes provide the compiler with redundant information about
> your intent. If you forget a closing quote, the continuation quotes give the
> compiler a very good idea of where you meant to put it.
> The compiler can point you to (or at least very near) the end of the
> literal, where you want to insert the quote, rather than showing you the
> beginning of the literal or even some unrelated error later in the file that
> was caused by the missing quote.
>
> Temporarily unclosed literals don't make editors go haywire. The syntax
> highlighter has the same trouble parsing half-written, unclosed traditional
> quotes that the compiler does: It can't tell where the literal is supposed to
> end and the code should begin. It must either apply heuristics to try to
> guess where the literal ends, or incorrectly color everything between the
> opening quote and the next closing quote as a string literal. This can cause
> the file's coloring to alternate distractingly between "string literal" and
> "running code".
>
> Continuation quotes give the syntax highlighter enough context to guess at
> the correct coloration, even when the string isn't complete yet. Lines with a
> continuation quote are literals; lines without are code. At worst, the syntax
> highlighter might incorrectly color a few characters at the end of a line,
> rather than the remainder of the file.
>
> They separate indentation from the string's contents. Traditional multiline
> strings usually include all of the content between the start and end
> delimiters, including leading whitespace. This means that it's usually
> impossible to indent a multiline string, so including one breaks up the flow
> of the surrounding code, making it less readable. Some languages apply
> heuristics or mode switches to try to remove indentation, but like all
> heuristics, these are mistake-prone and murky.
>
> Continuation quotes neatly avoid this problem. Whitespace before the
> continuation quote is indentation used to format the source code; whitespace
> after the continuation quote is part of the string literal. The
> interpretation of the code is perfectly clear to both compiler and programmer.
>
> They improve the ability to quickly recognize the literal. Traditional
> multiline strings don't provide much visual help. To find the end, you must
> visually scan until you find the matching delimiter, which may be only one or
> a few characters long. When looking at a random line of source, it can be
> hard to tell at a glance whether it's code or literal. Syntax highlighting
> can help with these issues, but it's often unreliable, especially with
> advanced, idiosyncratic string literal features like multiline strings.
>
> Continuation quotes solve these problems. To find the end of the literal,
> just scan down the column of continuation characters until they end. To
> figure out if a given line of source is part of a literal, just see if it
> starts with a quote mark. The meaning of the source becomes obvious at a
> glance.
>
> Nevertheless, the traditional design does have a few advantages:
>
> It is simpler. Although continuation quotes are more complex, we believe that
> the advantages listed above pay for that complexity.
>
> There is no need to edit the intervening lines to add continuation quotes.
> While the additional effort required to insert continuation quotes is an
> important downside, we believe that tool support, including both compiler
> fix-its and perhaps editor support for commands like "Paste as String
> Literal", can address this issue. In some editors, new features aren't even
> necessary; TextMate, for instance, lets you insert a character on several
> lines simultaneously. And new tool features could also address other issues
> like escaping embedded quotes.
>
> Naïve syntax highlighters may have trouble understanding this syntax. This is
> true, but naïve syntax highlighters generally have terrible trouble with
> advanced string literal constructs; some struggle with even basic ones.
> While there are some designs (like Python's """ strings) which trick some
> syntax highlighters into working some of the time with some contents, we
> don't think this occasional, accidental compatibility is a big enough gain to
> justify changing the design.
>
> It looks funny: quotes should always be in matched pairs. We aren't aware of
> another programming language which uses unbalanced quotes in string literals,
> but there is one very important precedent for this kind of formatting:
> natural languages. English, for instance, uses a very similar format for
> quoting multiple lines of dialog by the same speaker. As an English Stack
> Exchange answer illustrates <http://english.stackexchange.com/a/96613/64636>:
>
> “That seems like an odd way to use punctuation,” Tom said. “What harm would
> there be in using quotation marks at the end of every paragraph?”
>
> “Oh, that’s not all that complicated,” J.R. answered. “If you closed quotes
> at the end of every paragraph, then you would need to reidentify the speaker
> with every subsequent paragraph.
>
> “Say a narrative was describing two or three people engaged in a lengthy
> conversation. If you closed the quotation marks in the previous paragraph,
> then a reader wouldn’t be able to easily tell if the previous speaker was
> extending his point, or if someone else in the room had picked up the
> conversation. By leaving the previous paragraph’s quote unclosed, the reader
> knows that the previous speaker is still the one talking.”
>
> “Oh, that makes sense. Thanks!”
>
> In English, omitting the ending quotation mark tells the text's reader that
> the quote continues on the next line, while including a quotation mark at the
> beginning of the next line reminds the reader that they're in the middle of a
> quote.
>
> Similarly, in this proposal, omitting the ending quotation mark tells the
> code's reader (and compiler) that the string literal continues on the next
> line, while including a quotation mark at the beginning of the next line
> reminds the reader (and compiler) that they're in the middle of a string
> literal.
>
> On balance, we think continuation quotes are the best design for this problem.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#detailed-design> Detailed design
>
> When Swift is parsing a string literal and reaches the end of a line without
> finding a closing quote, it examines the next line, applying the following
> rules:
>
> If the next line begins with whitespace followed by a continuation quote,
> then the string literal contains a newline followed by the contents of the
> string literal starting on that line. (This line may itself have no closing
> quote, in which case the same rules apply to the line which follows.)
>
> If the next line contains anything else, Swift raises a syntax error for an
> unterminated string literal.
>
> The exact error messages and diagnostics provided are left to the
> implementers to determine, but we believe it should be possible to provide
> two fix-its which will help users learn the syntax and correct string literal
> mistakes:
>
> Insert " at the end of the current line to terminate the quote.
>
> Insert " at the beginning of the next line (with some indentation heuristics)
> to continue the quote on the next line.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#impact-on-existing-code> Impact on existing code
>
> Failing to close a string literal before the end of the line is currently a
> syntax error, so no valid Swift code should be affected by this change.
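For anyone skimming, the two rules in "Detailed design" could be sketched roughly as below. This is only an illustration under loud assumptions, not the Swift lexer: the names parseMultilineLiteral and LiteralError are hypothetical, the input is the text after the opening quote already split into lines, and escapes and interpolation are ignored, so any " closes the literal.

```swift
// Hypothetical sketch of the continuation-quote rules; NOT the real Swift lexer.
// Assumption: escapes and interpolation are ignored, so any `"` ends the literal.

enum LiteralError: Error, Equatable {
    // 0-based index of the line that ended without a closing quote.
    case unterminated(line: Int)
}

/// `lines[0]` is the text right after the opening quote on the first line;
/// the remaining elements are the source lines that follow it.
func parseMultilineLiteral(_ lines: [String]) throws -> String {
    var contents = ""
    var text = Substring(lines.first ?? "")
    var index = 0
    while true {
        if let close = text.firstIndex(of: "\"") {
            // Closing quote found: the literal ends here.
            contents += text[..<close]
            return contents
        }
        // No closing quote on this line: keep its text and examine the next line.
        contents += text
        let next = index + 1
        guard next < lines.count else {
            throw LiteralError.unterminated(line: index)
        }
        // Whitespace before the continuation quote is indentation, not content.
        let trimmed = lines[next].drop(while: { $0 == " " || $0 == "\t" })
        guard trimmed.first == "\"" else {
            // Anything other than a continuation quote: unterminated literal.
            throw LiteralError.unterminated(line: index)
        }
        // A continuation quote contributes a newline, then the rest of its line.
        contents += "\n"
        text = trimmed.dropFirst()
        index = next
    }
}

let source = [
    "<catalog>",
    "    \"    <book/>",
    "    \"</catalog>\"",
]
let value = try parseMultilineLiteral(source)
// value == "<catalog>\n    <book/>\n</catalog>"
```

Note how the indentation before each continuation quote is dropped while the whitespace after it is kept, which is the "separate indentation from the string's contents" point from the Rationale. The error case carries the line that lacked a closing quote, which is where either of the two suggested fix-its would be anchored.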
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#future-directions-for-multiline-string-literals> Future directions for multiline string literals
>
> We could permit comments before encountering a continuation quote to be
> counted as whitespace, and permit empty lines in the middle of string
> literals. This would allow you to comment out whole lines in the literal.
>
> We could allow you to put a trailing backslash on a line to indicate that the
> newline isn't "real" and should be omitted from the literal's contents.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#future-directions-for-string-literals-in-general> Future directions for string literals in general
>
> There are other issues with Swift's string handling which this proposal
> intentionally does not address:
>
> Reducing the amount of double-backslashing needed when working with regular
> expression libraries, Windows paths, source code generation, and other tasks
> where backslashes are part of the data.
>
> Alternate delimiters or other strategies for writing strings with "
> characters in them.
>
> Accommodating code formatting concerns like hard wrapping and commenting.
>
> String literals consisting of very long pieces of text which are best
> represented completely verbatim, with minimal alteration.
>
> This section briefly outlines some future proposals which might address these
> issues. Combined, we believe they would address most of the string literal
> use cases which Swift is currently not very good at.
>
> Please note that these are simply sketches of hypothetical future designs;
> they may radically change before proposal, and some may never be proposed at
> all. Many, perhaps most, will not be proposed for Swift 3. We are sketching
> these designs not to propose and refine these features immediately, but
> merely to show how we think they might be solved in ways which complement
> this proposal.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#string-literal-modifiers> String literal modifiers
>
> A string literal modifier is a cluster of identifier characters which goes
> before a string literal and adjusts the way it is parsed. Modifiers only alter
> the interpretation of the text in the literal, not the type of data it
> produces; for instance, there will never be something like the
> UTF-8/UTF-16/UTF-32 literal modifiers in C++. Uppercase characters enable a
> feature; lowercase characters disable a feature.
>
> Modifiers can be attached to both single-line and multiline literals, and
> could also be attached to other literal syntaxes which might be introduced in
> the future. When used with multiline strings, only the starting quote needs
> to carry the modifiers, not the continuation quotes.
>
> Modifiers are an extremely flexible feature which can be used for many
> purposes. Of the ideas listed below, we believe the e modifier is an urgent
> addition which should be included in Swift 3 if at all possible; the others
> are less urgent and most of them could be deferred, or at least added later
> if time allows.
>
> Escape disabling: e"\\\" (string with three backslash characters)
>
> Fine-grained escape disabling: i"\(foo)\n" (the string \(foo) followed by a
> newline); eI"\(foo)\n" (the contents of foo followed by the string \n),
> b"\w+\n" (the string \w+ followed by a newline)
>
> Alternate delimiters: _ has no lowercase form, so it could be used to allow
> strings with internal quotes: _"print("Hello, world!")"_, __"print("Hello,
> world!")"__, etc.
>
> Whitespace normalization: changes all runs of whitespace in the literal to
> single space characters; this would allow you to use multiline strings purely
> to improve code formatting.
>
>     alert.informativeText =
>         W"\(appName) could not typeset the element “\(title)” because
>         "it includes a link to an element that has been removed from this
>         "book."
>
> Localization:
>
>     alert.informativeText =
>         LW"\(appName) could not typeset the element “\(title)” because
>         "it includes a link to an element that has been removed from this
>         "book."
>
> Comments: Embedding comments in string literals might be useful for literals
> containing regular expressions or other code.
>
> Eventually, user-specified string modifiers could be added to Swift, perhaps
> as part of a hygienic macro system. It might also become possible to change
> the default modifiers applied to literals in a particular file or scope.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#heredocs-or-other-verbatim-string-literal-features> Heredocs or other "verbatim string literal" features
>
> Sometimes it really is best to just splat something else down in the middle
> of a file full of Swift source code. Maybe the file is essentially a template
> and the literals are a majority of the code's contents, or maybe you're
> writing a code generator and just want to get string data into it with
> minimal fuss, or maybe people unfamiliar with Swift need to be able to edit
> the literals. Whatever the reason, the normal string literal syntax is just
> too burdensome.
>
> One approach to this problem is heredocs. A heredoc allows you to put a
> placeholder for a literal on one line; the contents of the literal begin on
> the next line, running up to some delimiter. It would be possible to put
> multiple placeholders in a single line, and to apply string modifiers to them.
>
> In Swift, this might look like:
>
>     print(#to("---") + e#to("END"))
>     It was a dark and stormy \(timeOfDay) when
>     ---
>     the Swift core team invented the \(interpolation) syntax.
>     END
>
> Another possible approach would be to support traditional multiline string
> literals bounded by a different delimiter, like """. This might look like:
>
>     print("""
>     It was a dark and stormy \(timeOfDay) when
>     """ + e"""
>     the Swift core team invented the \(interpolation) syntax.
>     """)
>
> Although heredocs could make a good addition to Swift eventually, there are
> good reasons to defer them for now. Please see the "Alternatives considered"
> section for details.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#first-class-regular-expressions> First-class regular expressions
>
> Members of the core team are interested in regular expressions, but they
> don't want to just build a literal that wraps PCRE or libicu; rather, they
> aim to integrate regexes into the pattern matching system and give them a
> deep, Perl 6-style rethink. This would be a major effort, far beyond the
> scope of Swift 3.
>
> In the meantime, the e modifier and perhaps other string literal modifiers
> will make it easier to specify regular expressions in string literals for use
> with NSRegularExpression and other libraries accessible from Swift.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#alternatives-considered> Alternatives considered
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#requiring-no-continuation-character> Requiring no continuation character
>
> The main alternative is to not require a continuation quote, and simply
> extend the string literal from the starting quote to the ending quote,
> including all newlines between them. For example:
>
>     let xml = "<?xml version=\"1.0\"?>
>     <catalog>
>         <book id=\"bk101\" empty=\"\">
>             <author>\(author)</author>
>         </book>
>     </catalog>"
>
> This alternative is extensively discussed in the "Rationale" section above.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#skip-multiline-strings-and-just-support-heredocs> Skip multiline strings and just support heredocs
>
> There are definitely cases where a heredoc would be a better solution, such
> as generated code or code which is mostly literals with a little Swift
> sprinkled around.
> On the other hand, there are also cases where multiline strings are better:
> short strings in code which is meant to be read. If a single feature can't
> handle them both well, there's no shame in supporting the two features
> separately.
>
> It makes sense to support multiline strings first because:
>
> They extend existing syntax instead of introducing new syntax.
>
> They are much easier to parse; heredocs require some kind of mode in the
> parser which kicks in at the start of the next line, whereas multiline string
> literals can be handled in the lexer.
>
> As discussed in "Rationale", they offer better diagnostics, code formatting,
> and visual scannability.
>
> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#use-a-different-delimiter-for-multiline-strings> Use a different delimiter for multiline strings
>
> The initial suggestion was that multiline strings should use a different
> delimiter, """, at the beginning and end of the string, with no continuation
> characters between. Like heredocs, this might be a good alternative for
> certain use cases, but it has the same basic flaws as the "no continuation
> character" solution.
>
> --
> Brent Royal-Gordon
> Architechies
>
> _______________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution
