Is it just me who would prefer a multiline string literal not to require a \backslash before each "double quote"?
Did you ever really use multiline string literals before? I did, and it's mostly for quick hacks where I wrote a script or tried something out quickly. And maybe I needed to put an XML snippet into a unit test case to see if my parser correctly parses or correctly rejects the snippet. The current proposal doesn't help this use case in any way. I cannot see which use case inspires multiline string literals which require double quotes to be escaped... I wouldn't use them if they were available. I'd become an Android developer instead ;)

-Michael

> On 28.04.2016 at 23:56, Brent Royal-Gordon via swift-evolution
> <[email protected]> wrote:
>
>> Awesome. Some specific suggestions below, but feel free to iterate in a
>> pull request if you prefer that.
>
> I've adopted these suggestions in some form, though I also ended up rewriting
> the explanation of why the feature was designed as it is and fusing it with
> material from "Alternatives considered".
>
> (Still not sure who I should list as a co-author. I'm currently thinking
> John, Tyler, and maybe Chris? Who's supposed to go there?)
>
> Multiline string literals
>
> • Proposal: SE-NNNN
> • Author(s): Brent Royal-Gordon
> • Status: Second Draft
> • Review manager: TBD
>
> Introduction
>
> In Swift 2.2, the only means to insert a newline into a string literal is the
> \n escape. String literals specified in this way are generally ugly and
> unreadable. We propose a multiline string feature inspired by English
> punctuation which is a straightforward extension of our existing string
> literals.
>
> This proposal is one step in a larger plan to improve how string literals
> address various challenging use cases. It is not meant to solve all problems
> with escaping, nor to serve all use cases involving very long string
> literals. See the "Future directions for string literals in general" section
> for a sketch of the problems we ultimately want to address and some ideas of
> how we might do so.
>
> Swift-evolution threads: multi-line string literals (April), multi-line
> string literals (December)
>
> Draft Notes
>
> • Removes the comment feature, which was felt to be an unnecessary
> complication. This and the backslash feature have been listed as future
> directions.
>
> • Loosens the specification of diagnostics, suggesting instead of
> requiring fix-its.
>
> • Splits a "Rationale" section out of the "Proposed solution" section.
>
> • Adds extensive discussion of other features which would combine with
> this one.
>
> • I've listed only myself as an author because I don't want to put
> anyone else's name to a document they haven't seen, but there are others who
> deserve to be listed (John Holdsworth at least). Let me know if you think you
> should be included.
>
> Motivation
>
> As Swift begins to move into roles beyond app development, code which needs
> to generate text becomes a more important use case. Consider, for instance,
> generating even a small XML string:
>
> let xml = "<?xml version=\"1.0\"?>\n<catalog>\n\t<book id=\"bk101\" empty=\"\">\n\t\t<author>\(author)</author>\n\t</book>\n</catalog>"
>
> The string is practically unreadable, its structure drowned in escapes and
> run-together lines; it looks like little more than line noise. We can improve
> its readability somewhat by concatenating separate strings for each line and
> using real tabs instead of \t escapes:
>
> let xml = "<?xml version=\"1.0\"?>\n" +
>           "<catalog>\n" +
>           "	<book id=\"bk101\" empty=\"\">\n" +
>           "		<author>\(author)</author>\n" +
>           "	</book>\n" +
>           "</catalog>"
>
> However, this creates a more complex expression for the type checker, and
> there's still far more punctuation than ought to be necessary. If the most
> important goal of Swift is making code readable, this kind of code falls far
> short of that goal.
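For comparison, the two spellings from the Motivation section can be checked against each other in Swift as it exists today. This is only a small sketch: the value of `author` is an assumption (the draft never defines it), and tabs are written as \t in both forms so that the comparison is exact.

```swift
// Hypothetical value; the draft leaves `author` undefined.
let author = "A. Author"

// Single-line form, with \n and \t escapes throughout.
let escaped = "<?xml version=\"1.0\"?>\n<catalog>\n\t<book id=\"bk101\" empty=\"\">\n\t\t<author>\(author)</author>\n\t</book>\n</catalog>"

// One literal per output line, joined with +.
let concatenated = "<?xml version=\"1.0\"?>\n" +
    "<catalog>\n" +
    "\t<book id=\"bk101\" empty=\"\">\n" +
    "\t\t<author>\(author)</author>\n" +
    "\t</book>\n" +
    "</catalog>"

// Both spellings describe the same six-line document.
print(escaped == concatenated)
```

Neither form removes the need to escape the embedded double quotes, which is the complaint at the top of this reply.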
>
> Proposed solution
>
> We propose that, when Swift is parsing a string literal, if it reaches the
> end of the line without encountering an end quote, it should look at the next
> line. If it sees a quote at the beginning (a "continuation quote"), the
> string literal contains a newline and then continues on that line. Otherwise,
> the string literal is unterminated and syntactically invalid.
>
> Our sample above could thus be written as:
>
> let xml = "<?xml version=\"1.0\"?>
>           "<catalog>
>           "	<book id=\"bk101\" empty=\"\">
>           "		<author>\(author)</author>
>           "	</book>
>           "</catalog>"
>
> If the second or subsequent lines had not begun with a quotation mark, or the
> trailing quotation mark after the </catalog> tag had not been included, Swift
> would have emitted an error.
>
> Rationale
>
> This design is rather unusual, and it's worth pausing a moment to explain why
> it has been chosen.
>
> The traditional design for this feature, seen in languages like Perl and
> Python, simply places one delimiter at the beginning of the literal and
> another at the end. Individual lines in the literal are not marked in any
> way.
>
> We think continuation quotes offer several important advantages over the
> traditional design:
>
> • They help the compiler pinpoint errors in string literal delimiting.
> Traditional multiline strings have a serious weakness: if you forget the
> closing quote, the compiler has no idea where you wanted the literal to end.
> It simply continues on until the compiler encounters another quote (or the
> end of the file). If you're lucky, the text after that quote is not valid
> code, and the resulting error will at least point you to the next string
> literal in the file. If you're unlucky, you'll get a seemingly unrelated
> error several literals later, an unbalanced brace error at the end of the
> file, or perhaps even code that compiles but does something totally wrong.
>
> (This is not a minor concern.
> Many popular languages, including C and Swift 2, specifically reject
> newlines in string literals to prevent this from happening.)
>
> Continuation quotes provide the compiler with redundant information about
> your intent. If you forget a closing quote, the continuation quotes give the
> compiler a very good idea of where you meant to put it. The compiler can
> point you to (or at least very near) the end of the literal, where you want
> to insert the quote, rather than showing you the beginning of the literal or
> even some unrelated error later in the file that was caused by the missing
> quote.
>
> • Temporarily unclosed literals don't make editors go haywire. The
> syntax highlighter has the same trouble parsing half-written, unclosed
> traditional quotes that the compiler does: It can't tell where the literal is
> supposed to end and the code should begin. It must either apply heuristics to
> try to guess where the literal ends, or incorrectly color everything between
> the opening quote and the next closing quote as a string literal. This can
> cause the file's coloring to alternate distractingly between "string literal"
> and "running code".
>
> Continuation quotes give the syntax highlighter enough context to guess at
> the correct coloration, even when the string isn't complete yet. Lines with a
> continuation quote are literals; lines without are code. At worst, the syntax
> highlighter might incorrectly color a few characters at the end of a line,
> rather than the remainder of the file.
>
> • They separate indentation from the string's contents. Traditional
> multiline strings usually include all of the content between the start and
> end delimiters, including leading whitespace. This means that it's usually
> impossible to indent a multiline string, so including one breaks up the flow
> of the surrounding code, making it less readable.
> Some languages apply heuristics or mode switches to try to remove
> indentation, but like all heuristics, these are mistake-prone and murky.
>
> Continuation quotes neatly avoid this problem. Whitespace before the
> continuation quote is indentation used to format the source code; whitespace
> after the continuation quote is part of the string literal. The
> interpretation of the code is perfectly clear to both compiler and programmer.
>
> • They improve the ability to quickly recognize the literal.
> Traditional multiline strings don't provide much visual help. To find the
> end, you must visually scan until you find the matching delimiter, which may
> be only one or a few characters long. When looking at a random line of
> source, it can be hard to tell at a glance whether it's code or literal.
> Syntax highlighting can help with these issues, but it's often unreliable,
> especially with advanced, idiosyncratic string literal features like
> multiline strings.
>
> Continuation quotes solve these problems. To find the end of the literal,
> just scan down the column of continuation characters until they end. To
> figure out if a given line of source is part of a literal, just see if it
> starts with a quote mark. The meaning of the source becomes obvious at a
> glance.
>
> Nevertheless, the traditional design does have a few advantages:
>
> • It is simpler. Although continuation quotes are more complex, we
> believe that the advantages listed above pay for that complexity.
>
> • There is no need to edit the intervening lines to add continuation
> quotes. While the additional effort required to insert continuation quotes is
> an important downside, we believe that tool support, including both compiler
> fix-its and perhaps editor support for commands like "Paste as String
> Literal", can address this issue. In some editors, new features aren't even
> necessary; TextMate, for instance, lets you insert a character on several
> lines simultaneously.
> And new tool features could also address other issues like escaping
> embedded quotes.
>
> • Naïve syntax highlighters may have trouble understanding this syntax.
> This is true, but naïve syntax highlighters generally have terrible trouble
> with advanced string literal constructs; some struggle with even basic ones.
> While there are some designs (like Python's """ strings) which trick some
> syntax highlighters into working some of the time with some contents, we
> don't think this occasional, accidental compatibility is a big enough gain to
> justify changing the design.
>
> • It looks funny: quotes should always be in matched pairs. We aren't
> aware of another programming language which uses unbalanced quotes in string
> literals, but there is one very important precedent for this kind of
> formatting: natural languages. English, for instance, uses a very similar
> format for quoting multiple lines of dialog by the same speaker. As an
> English Stack Exchange answer illustrates:
>
> “That seems like an odd way to use punctuation,” Tom said. “What harm would
> there be in using quotation marks at the end of every paragraph?”
>
> “Oh, that’s not all that complicated,” J.R. answered. “If you closed quotes
> at the end of every paragraph, then you would need to reidentify the speaker
> with every subsequent paragraph.
>
> “Say a narrative was describing two or three people engaged in a lengthy
> conversation. If you closed the quotation marks in the previous paragraph,
> then a reader wouldn’t be able to easily tell if the previous speaker was
> extending his point, or if someone else in the room had picked up the
> conversation. By leaving the previous paragraph’s quote unclosed, the reader
> knows that the previous speaker is still the one talking.”
>
> “Oh, that makes sense.
> Thanks!”
>
> In English, omitting the ending quotation mark tells the text's reader that
> the quote continues on the next line, while including a quotation mark at the
> beginning of the next line reminds the reader that they're in the middle of a
> quote.
>
> Similarly, in this proposal, omitting the ending quotation mark tells the
> code's reader (and compiler) that the string literal continues on the next
> line, while including a quotation mark at the beginning of the next line
> reminds the reader (and compiler) that they're in the middle of a string
> literal.
>
> On balance, we think continuation quotes are the best design for this problem.
>
> Detailed design
>
> When Swift is parsing a string literal and reaches the end of a line without
> finding a closing quote, it examines the next line, applying the following
> rules:
>
> • If the next line begins with whitespace followed by a continuation
> quote, then the string literal contains a newline followed by the contents of
> the string literal starting on that line. (This line may itself have no
> closing quote, in which case the same rules apply to the line which follows.)
>
> • If the next line contains anything else, Swift raises a syntax error
> for an unterminated string literal.
>
> The exact error messages and diagnostics provided are left to the
> implementers to determine, but we believe it should be possible to provide
> two fix-its which will help users learn the syntax and correct string literal
> mistakes:
>
> • Insert " at the end of the current line to terminate the quote.
>
> • Insert " at the beginning of the next line (with some indentation
> heuristics) to continue the quote on the next line.
>
> Impact on existing code
>
> Failing to close a string literal before the end of the line is currently a
> syntax error, so no valid Swift code should be affected by this change.
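To make the two rules in "Detailed design" concrete, here is a rough sketch of how they could be applied to a handful of source lines. It is not how the Swift lexer works. `joinContinuationLiteral` and `ContinuationError` are hypothetical names made up for this illustration. The sketch assumes the first line starts at the opening quote, it leaves escape sequences undecoded, and it does not handle quotes nested inside \( ) interpolations.

```swift
// Hypothetical error type for this sketch only.
enum ContinuationError: Error, Equatable {
    // `line` is the index of the line where a quote was expected.
    case unterminated(line: Int)
}

// Hypothetical helper: returns the raw contents of a literal that is
// spread over `lines` using continuation quotes.
func joinContinuationLiteral(_ lines: [String]) throws -> String {
    var pieces: [String] = []
    for (index, line) in lines.enumerated() {
        // Whitespace before the quote is source indentation, not content.
        let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
        // Rule 2: anything other than a quote here means the literal
        // on the previous line was never terminated.
        guard trimmed.first == "\"" else {
            throw ContinuationError.unterminated(line: index)
        }
        var body = ""
        var closed = false
        var escaped = false
        for ch in trimmed.dropFirst() {
            if escaped {
                // Keep the escaped character as written (no decoding).
                body.append(ch)
                escaped = false
            } else if ch == "\\" {
                body.append(ch)
                escaped = true
            } else if ch == "\"" {
                // An unescaped quote closes the literal.
                closed = true
                break
            } else {
                body.append(ch)
            }
        }
        pieces.append(body)
        if closed {
            // Rule 1: each continuation contributes one newline.
            return pieces.joined(separator: "\n")
        }
    }
    // Ran out of lines without seeing a closing quote.
    throw ContinuationError.unterminated(line: lines.count)
}
```

Passing the three lines `"<catalog>`, an indented `"  <book/>`, and an indented `"</catalog>"` yields the three-line string with the source indentation removed and the two spaces after the second continuation quote kept. Passing `"abc` followed by `let x = 1` throws `unterminated(line: 1)`, which is the spot where the second fix-it would offer to insert a quote.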
>
> Future directions for multiline string literals
>
> • We could permit comments before encountering a continuation quote to
> be counted as whitespace, and permit empty lines in the middle of string
> literals. This would allow you to comment out whole lines in the literal.
>
> • We could allow you to put a trailing backslash on a line to indicate
> that the newline isn't "real" and should be omitted from the literal's
> contents.
>
> Future directions for string literals in general
>
> There are other issues with Swift's string handling which this proposal
> intentionally does not address:
>
> • Reducing the amount of double-backslashing needed when working with
> regular expression libraries, Windows paths, source code generation, and
> other tasks where backslashes are part of the data.
>
> • Alternate delimiters or other strategies for writing strings with "
> characters in them.
>
> • Accommodating code formatting concerns like hard wrapping and
> commenting.
>
> • String literals consisting of very long pieces of text which are best
> represented completely verbatim, with minimal alteration.
>
> This section briefly outlines some future proposals which might address these
> issues. Combined, we believe they would address most of the string literal
> use cases which Swift is currently not very good at.
>
> Please note that these are simply sketches of hypothetical future designs;
> they may radically change before proposal, and some may never be proposed at
> all. Many, perhaps most, will not be proposed for Swift 3. We are sketching
> these designs not to propose and refine these features immediately, but
> merely to show how we think they might be solved in ways which complement
> this proposal.
>
> String literal modifiers
>
> A string literal modifier is a cluster of identifier characters which goes
> before a string literal and adjusts the way it is parsed.
> Modifiers only alter the interpretation of the text in the literal, not the
> type of data it produces; for instance, there will never be something like the
> UTF-8/UTF-16/UTF-32 literal modifiers in C++. Uppercase characters enable a
> feature; lowercase characters disable a feature.
>
> Modifiers can be attached to both single-line and multiline literals, and
> could also be attached to other literal syntaxes which might be introduced in
> the future. When used with multiline strings, only the starting quote needs
> to carry the modifiers, not the continuation quotes.
>
> Modifiers are an extremely flexible feature which can be used for many
> purposes. Of the ideas listed below, we believe the e modifier is an urgent
> addition which should be included in Swift 3 if at all possible; the others
> are less urgent and most of them could be deferred, or at least added later
> if time allows.
>
> • Escape disabling: e"\\\" (string with three backslash characters)
>
> • Fine-grained escape disabling: i"\(foo)\n" (the string \(foo)
> followed by a newline); eI"\(foo)\n" (the contents of foo followed by the
> string \n); b"\w+\n" (the string \w+ followed by a newline)
>
> • Alternate delimiters: _ has no lowercase form, so it could be used to
> allow strings with internal quotes: _"print("Hello, world!")"_,
> __"print("Hello, world!")"__, etc.
>
> • Whitespace normalization: changes all runs of whitespace in the
> literal to single space characters; this would allow you to use multiline
> strings purely to improve code formatting.
>
> alert.informativeText =
>     W"\(appName) could not typeset the element “\(title)” because
>      "it includes a link to an element that has been removed from this
>      "book."
>
> • Localization:
>
> alert.informativeText =
>     LW"\(appName) could not typeset the element “\(title)” because
>       "it includes a link to an element that has been removed from this
>       "book."
>
> • Comments: Embedding comments in string literals might be useful for
> literals containing regular expressions or other code.
>
> Eventually, user-specified string modifiers could be added to Swift, perhaps
> as part of a hygienic macro system. It might also become possible to change
> the default modifiers applied to literals in a particular file or scope.
>
> Heredocs or other "verbatim string literal" features
>
> Sometimes it really is best to just splat something else down in the middle
> of a file full of Swift source code. Maybe the file is essentially a template
> and the literals are a majority of the code's contents, or maybe you're
> writing a code generator and just want to get string data into it with
> minimal fuss, or maybe people unfamiliar with Swift need to be able to edit
> the literals. Whatever the reason, the normal string literal syntax is just
> too burdensome.
>
> One approach to this problem is heredocs. A heredoc allows you to put a
> placeholder for a literal on one line; the contents of the literal begin on
> the next line, running up to some delimiter. It would be possible to put
> multiple placeholders in a single line, and to apply string modifiers to them.
>
> In Swift, this might look like:
>
> print(#to("---") + e#to("END"))
> It was a dark and stormy \(timeOfDay) when
> ---
> the Swift core team invented the \(interpolation) syntax.
> END
>
> Another possible approach would be to support traditional multiline string
> literals bounded by a different delimiter, like """. This might look like:
>
> print("""
> It was a dark and stormy \(timeOfDay) when
> """ + e"""
> the Swift core team invented the \(interpolation) syntax.
> """)
>
> Although heredocs could make a good addition to Swift eventually, there are
> good reasons to defer them for now. Please see the "Alternatives considered"
> section for details.
>
> First-class regular expressions
>
> Members of the core team are interested in regular expressions, but they
> don't want to just build a literal that wraps PCRE or libicu; rather, they
> aim to integrate regexes into the pattern matching system and give them a
> deep, Perl 6-style rethink. This would be a major effort, far beyond the
> scope of Swift 3.
>
> In the meantime, the e modifier and perhaps other string literal modifiers
> will make it easier to specify regular expressions in string literals for use
> with NSRegularExpression and other libraries accessible from Swift.
>
> Alternatives considered
>
> Requiring no continuation character
>
> The main alternative is to not require a continuation quote, and simply
> extend the string literal from the starting quote to the ending quote,
> including all newlines between them. For example:
>
> let xml = "<?xml version=\"1.0\"?>
> <catalog>
> 	<book id=\"bk101\" empty=\"\">
> 		<author>\(author)</author>
> 	</book>
> </catalog>"
>
> This alternative is extensively discussed in the "Rationale" section above.
>
> Skip multiline strings and just support heredocs
>
> There are definitely cases where a heredoc would be a better solution, such
> as generated code or code which is mostly literals with a little Swift
> sprinkled around. On the other hand, there are also cases where multiline
> strings are better: short strings in code which is meant to be read. If a
> single feature can't handle them both well, there's no shame in supporting
> the two features separately.
>
> It makes sense to support multiline strings first because:
>
> • They extend existing syntax instead of introducing new syntax.
>
> • They are much easier to parse; heredocs require some kind of mode in
> the parser which kicks in at the start of the next line, whereas multiline
> string literals can be handled in the lexer.
>
> • As discussed in "Rationale", they offer better diagnostics, code
> formatting, and visual scannability.
>
> Use a different delimiter for multiline strings
>
> The initial suggestion was that multiline strings should use a different
> delimiter, """, at the beginning and end of the string, with no continuation
> characters between. Like heredocs, this might be a good alternative for
> certain use cases, but it has the same basic flaws as the "no continuation
> character" solution.
>
> --
> Brent Royal-Gordon
> Architechies
>
> _______________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution
