> On Apr 30, 2016, at 11:54 AM, Tyler Fleming Cloutier via swift-evolution
> <[email protected]> wrote:
>
>
>> On Apr 28, 2016, at 2:56 PM, Brent Royal-Gordon via swift-evolution
>> <[email protected] <mailto:[email protected]>> wrote:
>>
>>> Awesome. Some specific suggestions below, but feel free to iterate in a
>>> pull request if you prefer that.
>>
>> I've adopted these suggestions in some form, though I also ended up
>> rewriting the explanation of why the feature was designed as it is and
>> fusing it with material from "Alternatives considered".
>>
>> (Still not sure who I should list as a co-author. I'm currently thinking
>> John, Tyler, and maybe Chris? Who's supposed to go there?)
>>
>> Multiline string literals
>>
>> Proposal: SE-NNNN
>> <https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>
>> Author(s): Brent Royal-Gordon <https://github.com/brentdax>
>> Status: Second Draft
>> Review manager: TBD
>>
>> Introduction
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#introduction>
>>
>> In Swift 2.2, the only means to insert a newline into a string literal is
>> the \n escape. String literals specified in this way are generally ugly and
>> unreadable. We propose a multiline string feature inspired by English
>> punctuation which is a straightforward extension of our existing string
>> literals.
>>
>> This proposal is one step in a larger plan to improve how string literals
>> address various challenging use cases. It is not meant to solve all problems
>> with escaping, nor to serve all use cases involving very long string
>> literals. See the "Future directions for string literals in general" section
>> for a sketch of the problems we ultimately want to address and some ideas of
>> how we might do so.
>>
>> Swift-evolution threads: multi-line string literals (April)
>> <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160418/015500.html>,
>> multi-line string literals (December)
>> <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20151214/002349.html>
>>
>> Draft Notes
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#draft-notes>
>>
>> Removes the comment feature, which was felt to be an unnecessary
>> complication. This and the backslash feature have been listed as future
>> directions.
>>
>> Loosens the specification of diagnostics, suggesting fix-its instead of
>> requiring them.
>>
>> Splits a "Rationale" section out of the "Proposed solution" section.
>>
>> Adds extensive discussion of other features which would combine with this one.
>>
>> I've listed only myself as an author because I don't want to put anyone
>> else's name to a document they haven't seen, but there are others who
>> deserve to be listed (John Holdsworth at least). Let me know if you think
>> you should be included.
>>
>>
>> Motivation
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#motivation>
>>
>> As Swift begins to move into roles beyond app development, code which needs
>> to generate text becomes a more important use case. Consider, for instance,
>> generating even a small XML string:
>>
>> let xml = "<?xml version=\"1.0\"?>\n<catalog>\n\t<book id=\"bk101\" empty=\"\">\n\t\t<author>\(author)</author>\n\t</book>\n</catalog>"
>> The string is practically unreadable, its structure drowned in escapes and
>> run-together lines; it looks like little more than line noise. We can
>> improve its readability somewhat by concatenating separate strings for each
>> line and using real tabs instead of \t escapes:
>>
>> let xml = "<?xml version=\"1.0\"?>\n" +
>> "<catalog>\n" +
>> " <book id=\"bk101\" empty=\"\">\n" +
>> " <author>\(author)</author>\n" +
>> " </book>\n" +
>> "</catalog>"
>> However, this creates a more complex expression for the type checker, and
>> there's still far more punctuation than ought to be necessary. If the most
>> important goal of Swift is making code readable, this kind of code falls far
>> short of that goal.
>>
>>
>> Proposed solution
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#proposed-solution>
>>
>> We propose that, when Swift is parsing a string literal, if it reaches the
>> end of the line without encountering an end quote, it should look at the
>> next line. If it sees a quote at the beginning (a "continuation quote"), the
>> string literal contains a newline and then continues on that line.
>> Otherwise, the string literal is unterminated and syntactically invalid.
>>
> One other way to implement the feature would be to allow quotes to be
> terminated by either a close quote or an end of line character. Multiline
> literals would then be constructed by concatenating adjacent (i.e. separated
> by only comments or whitespace) string literals.
>
> One issue with this is that
>
> let foo = "bar
>
> would be a valid string whose value would be "bar\n", even though that might
> not be the intended result. There is also the issue of things like
>
> let foo = [
> "string1",
> "string2"
> "string3"
> ]
>
> becoming ["string1", "string2string3"]. This is something that can happen in
> Python, for example.
>
> However, this has a few benefits. Namely, that it simplifies the model and
> that if I’m just pasting in a block of text I don’t have to add a trailing
> quote to the last line. If I was in Vim for example, I could just
> visual-block add a column of quotes at the beginning. So it would be:
>
> let xml = "<?xml version=\"1.0\"?>
> "<catalog>
> " <book id=\"bk101\" empty=\"\">
> " <author>\(author)</author>
> " </book>
> "</catalog>
>
Just one amendment: to get the same string, you would still have to write:
let xml = "<?xml version=\"1.0\"?>
"<catalog>
" <book id=\"bk101\" empty=\"\">
" <author>\(author)</author>
" </book>
"</catalog>"
Otherwise an unwanted newline would be appended to the end. Additionally, Brent
made an excellent point that an ending delimiter is needed in cases like the
following:
let xml = "<?xml version=\"1.0\"?>
"<catalog>
" <book id=\"bk101\" empty=\"\">
" <author>\(author)</author>
" </book>
"</catalog>".encoded(as: .UTF8)
>
>
>> Our sample above could thus be written as:
>>
>> let xml = "<?xml version=\"1.0\"?>
>> "<catalog>
>> " <book id=\"bk101\" empty=\"\">
>> " <author>\(author)</author>
>> " </book>
>> "</catalog>"
>> If the second or subsequent lines had not begun with a quotation mark, or
>> the trailing quotation mark after the </catalog> tag had not been included,
>> Swift would have emitted an error.
>>
>>
>> Rationale
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#rationale>
>>
>> This design is rather unusual, and it's worth pausing a moment to explain
>> why it has been chosen.
>>
>> The traditional design for this feature, seen in languages like Perl and
>> Python, simply places one delimiter at the beginning of the literal and
>> another at the end. Individual lines in the literal are not marked in any
>> way.
>>
>> We think continuation quotes offer several important advantages over the
>> traditional design:
>>
>> They help the compiler pinpoint errors in string literal delimiting.
>> Traditional multiline strings have a serious weakness: if you forget the
>> closing quote, the compiler has no idea where you wanted the literal to end.
>> It simply continues on until the compiler encounters another quote (or the
>> end of the file). If you're lucky, the text after that quote is not valid
>> code, and the resulting error will at least point you to the next string
>> literal in the file. If you're unlucky, you'll get a seemingly unrelated
>> error several literals later, an unbalanced brace error at the end of the
>> file, or perhaps even code that compiles but does something totally wrong.
>>
>> (This is not a minor concern. Many popular languages, including C and Swift
>> 2, specifically reject newlines in string literals to prevent this from
>> happening.)
>>
>> Continuation quotes provide the compiler with redundant information about
>> your intent. If you forget a closing quote, the continuation quotes give the
>> compiler a very good idea of where you meant to put it. The compiler can
>> point you to (or at least very near) the end of the literal, where you want
>> to insert the quote, rather than showing you the beginning of the literal or
>> even some unrelated error later in the file that was caused by the missing
>> quote.
>>
>> Temporarily unclosed literals don't make editors go haywire. The syntax
>> highlighter has the same trouble parsing half-written, unclosed traditional
>> quotes that the compiler does: It can't tell where the literal is supposed
>> to end and the code should begin. It must either apply heuristics to try to
>> guess where the literal ends, or incorrectly color everything between the
>> opening quote and the next closing quote as a string literal. This can cause
>> the file's coloring to alternate distractingly between "string literal" and
>> "running code".
>>
>> Continuation quotes give the syntax highlighter enough context to guess at
>> the correct coloration, even when the string isn't complete yet. Lines with
>> a continuation quote are literals; lines without are code. At worst, the
>> syntax highlighter might incorrectly color a few characters at the end of a
>> line, rather than the remainder of the file.
>>
>> They separate indentation from the string's contents. Traditional multiline
>> strings usually include all of the content between the start and end
>> delimiters, including leading whitespace. This means that it's usually
>> impossible to indent a multiline string, so including one breaks up the flow
>> of the surrounding code, making it less readable. Some languages apply
>> heuristics or mode switches to try to remove indentation, but like all
>> heuristics, these are mistake-prone and murky.
>>
>>
> Scala has an interesting solution to this problem which doesn’t involve a
> mode, but rather a function that strips out whitespace before the |
> character. In this case the | character serves a very similar purpose to the
> continuation quote. The particular character can be passed to the function as
> an argument.
>
> https://www.safaribooksonline.com/library/view/scala-cookbook/9781449340292/ch01s03.html
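
For comparison, here is a minimal sketch of what a stripMargin-style helper
could look like in Swift. The name stripMargin and its marginChar parameter are
hypothetical, modeled on the Scala method described above, not an existing
Swift API; this version only treats spaces and tabs as leading whitespace.

```swift
/// Hypothetical helper modeled on Scala's `stripMargin`; not part of Swift.
/// For each line, removes leading spaces/tabs up to and including the margin
/// character. Lines without a margin character are left unchanged.
func stripMargin(_ text: String, marginChar: Character = "|") -> String {
    let lines = text.split(separator: "\n", omittingEmptySubsequences: false)
    let stripped = lines.map { line -> String in
        let rest = line.drop(while: { $0 == " " || $0 == "\t" })
        // Only strip when the first non-whitespace character is the margin.
        if rest.first == marginChar {
            return String(rest.dropFirst())
        }
        return String(line)
    }
    return stripped.joined(separator: "\n")
}

let catalog = stripMargin("<catalog>\n    |  <book/>\n    |</catalog>")
print(catalog)
```

Unlike a continuation quote, the margin is removed at run time by a function
call, so the compiler gets no extra information about where the literal ends.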
>
>> Continuation quotes neatly avoid this problem. Whitespace before the
>> continuation quote is indentation used to format the source code; whitespace
>> after the continuation quote is part of the string literal. The
>> interpretation of the code is perfectly clear to both compiler and
>> programmer.
>>
>> They improve the ability to quickly recognize the literal. Traditional
>> multiline strings don't provide much visual help. To find the end, you must
>> visually scan until you find the matching delimiter, which may be only one
>> or a few characters long. When looking at a random line of source, it can be
>> hard to tell at a glance whether it's code or literal. Syntax highlighting
>> can help with these issues, but it's often unreliable, especially with
>> advanced, idiosyncratic string literal features like multiline strings.
>>
>> Continuation quotes solve these problems. To find the end of the literal,
>> just scan down the column of continuation characters until they end. To
>> figure out if a given line of source is part of a literal, just see if it
>> starts with a quote mark. The meaning of the source becomes obvious at a
>> glance.
>>
>> Nevertheless, the traditional design does have a few advantages:
>>
>> It is simpler. Although continuation quotes are more complex, we believe
>> that the advantages listed above pay for that complexity.
>>
>> There is no need to edit the intervening lines to add continuation quotes.
>> While the additional effort required to insert continuation quotes is an
>> important downside, we believe that tool support, including both compiler
>> fix-its and perhaps editor support for commands like "Paste as String
>> Literal", can address this issue. In some editors, new features aren't even
>> necessary; TextMate, for instance, lets you insert a character on several
>> lines simultaneously. And new tool features could also address other issues
>> like escaping embedded quotes.
>>
> Although I was concerned about this, most editors do have some way of
> inserting a column of characters, which would reduce the burden of pasting in
> code. And although enabling/disabling escaping is an orthogonal feature,
> allowing the _" syntax to disable escaping would allow you to paste in code
> with no other modifications.
>> Naïve syntax highlighters may have trouble understanding this syntax. This
>> is true, but naïve syntax highlighters generally have terrible trouble with
>> advanced string literal constructs; some struggle with even basic ones.
>> While there are some designs (like Python's """ strings) which trick some
>> syntax highlighters into working some of the time with some contents, we
>> don't think this occasional, accidental compatibility is a big enough gain
>> to justify changing the design.
>>
>> It looks funny—quotes should always be in matched pairs. We aren't aware of
>> another programming language which uses unbalanced quotes in string
>> literals, but there is one very important precedent for this kind of
>> formatting: natural languages. English, for instance, uses a very similar
>> format for quoting multiple lines of dialog by the same speaker. As an
>> English Stack Exchange answer illustrates
>> <http://english.stackexchange.com/a/96613/64636>:
>>
>> “That seems like an odd way to use punctuation,” Tom said. “What harm would
>> there be in using quotation marks at the end of every paragraph?”
>>
>> “Oh, that’s not all that complicated,” J.R. answered. “If you closed quotes
>> at the end of every paragraph, then you would need to reidentify the speaker
>> with every subsequent paragraph.
>>
>> “Say a narrative was describing two or three people engaged in a lengthy
>> conversation. If you closed the quotation marks in the previous paragraph,
>> then a reader wouldn’t be able to easily tell if the previous speaker was
>> extending his point, or if someone else in the room had picked up the
>> conversation. By leaving the previous paragraph’s quote unclosed, the reader
>> knows that the previous speaker is still the one talking.”
>>
>> “Oh, that makes sense. Thanks!”
>> In English, omitting the ending quotation mark tells the text's reader that
>> the quote continues on the next line, while including a quotation mark at
>> the beginning of the next line reminds the reader that they're in the middle
>> of a quote.
>>
>> Similarly, in this proposal, omitting the ending quotation mark tells the
>> code's reader (and compiler) that the string literal continues on the next
>> line, while including a quotation mark at the beginning of the next line
>> reminds the reader (and compiler) that they're in the middle of a string
>> literal.
>>
> This is very interesting, I never knew!
>
>
>> On balance, we think continuation quotes are the best design for this
>> problem.
>>
>>
>> Detailed design
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#detailed-design>
>>
>> When Swift is parsing a string literal and reaches the end of a line without
>> finding a closing quote, it examines the next line, applying the following
>> rules:
>>
>> If the next line begins with whitespace followed by a continuation quote,
>> then the string literal contains a newline followed by the contents of the
>> string literal starting on that line. (This line may itself have no closing
>> quote, in which case the same rules apply to the line which follows.)
>>
>> If the next line contains anything else, Swift raises a syntax error for an
>> unterminated string literal.
>>
>> The exact error messages and diagnostics provided are left to the
>> implementers to determine, but we believe it should be possible to provide
>> two fix-its which will help users learn the syntax and correct string
>> literal mistakes:
>>
>> Insert " at the end of the current line to terminate the quote.
>>
>> Insert " at the beginning of the next line (with some indentation
>> heuristics) to continue the quote on the next line.
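
To make the rules above concrete, here is a rough sketch of how the
line-joining step might work, written as an ordinary Swift function rather than
as lexer code. All of the names (ContinuationError, scanBody,
joinContinuationLines) are made up for illustration. It assumes the first
element starts just after the opening quote, it skips over backslash escapes
without decoding them, and it does not handle quotes nested inside an
interpolation.

```swift
enum ContinuationError: Error, Equatable {
    /// The literal starting or continuing on `line` was never closed.
    case unterminated(line: Int)
}

/// Returns the text up to the first unescaped double quote, and whether
/// such a closing quote was found on this line.
func scanBody(_ body: Substring) -> (text: String, closed: Bool) {
    var text = ""
    var escaped = false
    for ch in body {
        if escaped {
            text.append(ch)      // character after a backslash is kept as-is
            escaped = false
        } else if ch == "\\" {
            text.append(ch)      // keep the backslash; escapes are not decoded
            escaped = true
        } else if ch == "\"" {
            return (text, true)  // closing quote ends the literal
        } else {
            text.append(ch)
        }
    }
    return (text, false)
}

/// `lines[0]` begins just after the opening quote of the literal.
func joinContinuationLines(_ lines: [String]) throws -> String {
    var pieces: [String] = []
    for (index, line) in lines.enumerated() {
        var body = Substring(line)
        if index > 0 {
            // Whitespace before the continuation quote is source indentation.
            let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
            guard trimmed.first == "\"" else {
                // Anything else means the previous line was unterminated.
                throw ContinuationError.unterminated(line: index - 1)
            }
            body = trimmed.dropFirst()
        }
        let (text, closed) = scanBody(body)
        pieces.append(text)
        if closed { return pieces.joined(separator: "\n") }
    }
    throw ContinuationError.unterminated(line: lines.count - 1)
}

if let joined = try? joinContinuationLines(["<catalog>", "    \"  <book/>", "    \"</catalog>\""]) {
    print(joined)
}
```

The thrown line index is the kind of information that would let the compiler
offer the two fix-its at the right place.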
>>
>>
>> Impact on existing code
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#impact-on-existing-code>
>>
>> Failing to close a string literal before the end of the line is currently a
>> syntax error, so no valid Swift code should be affected by this change.
>>
>>
>> Future directions for multiline string literals
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#future-directions-for-multiline-string-literals>
>>
>> We could permit comments before encountering a continuation quote to be
>> counted as whitespace, and permit empty lines in the middle of string
>> literals. This would allow you to comment out whole lines in the literal.
>>
>> We could allow you to put a trailing backslash on a line to indicate that
>> the newline isn't "real" and should be omitted from the literal's contents.
>>
>>
>> Future directions for string literals in general
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#future-directions-for-string-literals-in-general>
>>
>> There are other issues with Swift's string handling which this proposal
>> intentionally does not address:
>>
>> Reducing the amount of double-backslashing needed when working with regular
>> expression libraries, Windows paths, source code generation, and other tasks
>> where backslashes are part of the data.
>>
>> Alternate delimiters or other strategies for writing strings with "
>> characters in them.
>>
>> Accommodating code formatting concerns like hard wrapping and commenting.
>>
>> String literals consisting of very long pieces of text which are best
>> represented completely verbatim, with minimal alteration.
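
As a small illustration of the double-backslashing item in the list above, here
is what passing a regular expression to NSRegularExpression looks like with
ordinary string literals. This is only a sketch, written against current Swift
and Foundation rather than Swift 2.2; firstMatch(of:in:) is a hypothetical
helper defined here, not a library function.

```swift
import Foundation

// The regex \w+\n as it has to be spelled in an ordinary string literal:
// every backslash that belongs to the regex is doubled.
let wordThenNewline = "\\w+\\n"

/// Hypothetical helper: returns the first substring of `text` matching `pattern`.
func firstMatch(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return nil
    }
    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: range),
          let found = Range(match.range, in: text) else {
        return nil
    }
    return String(text[found])
}

if let hit = firstMatch(of: wordThenNewline, in: "hello\nworld") {
    print(hit.count)
}
```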
>>
>> This section briefly outlines some future proposals which might address
>> these issues. Combined, we believe they would address most of the string
>> literal use cases which Swift is currently not very good at.
>>
>> Please note that these are simply sketches of hypothetical future designs;
>> they may radically change before proposal, and some may never be proposed at
>> all. Many, perhaps most, will not be proposed for Swift 3. We are sketching
>> these designs not to propose and refine these features immediately, but
>> merely to show how we think they might be solved in ways which complement
>> this proposal.
>>
>>
>> String literal modifiers
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#string-literal-modifiers>
>>
>> A string literal modifier is a cluster of identifier characters which goes
>> before a string literal and adjusts the way it is parsed. Modifiers only
>> alter the interpretation of the text in the literal, not the type of data it
>> produces; for instance, there will never be something like the
>> UTF-8/UTF-16/UTF-32 literal modifiers in C++. Uppercase characters enable a
>> feature; lowercase characters disable a feature.
>>
>> Modifiers can be attached to both single-line and multiline literals, and
>> could also be attached to other literal syntaxes which might be introduced
>> in the future. When used with multiline strings, only the starting quote
>> needs to carry the modifiers, not the continuation quotes.
>>
>> Modifiers are an extremely flexible feature which can be used for many
>> purposes. Of the ideas listed below, we believe the e modifier is an urgent
>> addition which should be included in Swift 3 if at all possible; the others
>> are less urgent and most of them could be deferred, or at least added later
>> if time allows.
>>
>> Escape disabling: e"\\\" (string with three backslash characters)
>>
>> Fine-grained escape disabling: i"\(foo)\n" (the string \(foo) followed by a
>> newline); eI"\(foo)\n" (the contents of foo followed by the string \n),
>> b"\w+\n" (the string \w+ followed by a newline)
>>
>> Alternate delimiters: _ has no lowercase form, so it could be used to allow
>> strings with internal quotes: _"print("Hello, world!")"_, __"print("Hello,
>> world!")"__, etc.
>>
>>
> This is interesting and perhaps could be applied per line with the
> continuation quote syntax:
>
> let xml = _"<?xml version="1.0"?>
> _"<catalog>
> _" <book id="bk101" empty="">
> " <author>\(author)</author>
> _" </book>
> _"</catalog>
> This would allow individual lines to retain the ability to do escaping and
> interpolation without affecting the whole string, just like the author line
> in the example above. This is also very easy to insert into editors just like
> the standard continuation quote syntax. Or perhaps we could just “escape”
> each string:
>
> let xml = \"<?xml version="1.0"?>
> \"<catalog>
> \" <book id="bk101" empty="">
> " <author>\(author)</author>
> \" </book>
> \"</catalog>
>
>> Whitespace normalization: changes all runs of whitespace in the literal to
>> single space characters; this would allow you to use multiline strings
>> purely to improve code formatting.
>>
>> alert.informativeText =
>> W"\(appName) could not typeset the element “\(title)” because
>> "it includes a link to an element that has been removed from this
>> "book."
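
The effect of the W modifier on the alert text above can be approximated at run
time. The sketch below is only an illustration of "runs of whitespace become a
single space"; normalizeWhitespace is a hypothetical name, and the proposal
describes this as happening when the literal is parsed, not through a function
call.

```swift
/// Hypothetical helper: collapses every run of whitespace (spaces, tabs,
/// newlines) into a single space character.
func normalizeWhitespace(_ text: String) -> String {
    var result = ""
    var inRun = false
    for ch in text {
        if ch.isWhitespace {
            // Emit one space for the whole run, however long it is.
            if !inRun { result.append(" ") }
            inRun = true
        } else {
            result.append(ch)
            inRun = false
        }
    }
    return result
}

let message = normalizeWhitespace("could not typeset the element because\nit includes a link")
print(message)
```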
>> Localization:
>>
>> alert.informativeText =
>> LW"\(appName) could not typeset the element “\(title)” because
>> "it includes a link to an element that has been removed from this
>> "book."
>> Comments: Embedding comments in string literals might be useful for literals
>> containing regular expressions or other code.
>>
>> Eventually, user-specified string modifiers could be added to Swift, perhaps
>> as part of a hygienic macro system. It might also become possible to change
>> the default modifiers applied to literals in a particular file or scope.
>>
>>
>> Heredocs or other "verbatim string literal" features
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#heredocs-or-other-verbatim-string-literal-features>
>>
>> Sometimes it really is best to just splat something else down in the middle
>> of a file full of Swift source code. Maybe the file is essentially a
>> template and the literals are a majority of the code's contents, or maybe
>> you're writing a code generator and just want to get string data into it
>> with minimal fuss, or maybe people unfamiliar with Swift need to be able to
>> edit the literals. Whatever the reason, the normal string literal syntax is
>> just too burdensome.
>>
>> One approach to this problem is heredocs. A heredoc allows you to put a
>> placeholder for a literal on one line; the contents of the literal begin on
>> the next line, running up to some delimiter. It would be possible to put
>> multiple placeholders in a single line, and to apply string modifiers to
>> them.
>>
>> In Swift, this might look like:
>>
>> print(#to("---") + e#to("END"))
>> It was a dark and stormy \(timeOfDay) when
>> ---
>> the Swift core team invented the \(interpolation) syntax.
>> END
>> Another possible approach would be to support traditional multiline string
>> literals bounded by a different delimiter, like """. This might look like:
>>
>> print("""
>> It was a dark and stormy \(timeOfDay) when
>> """ + e"""
>> the Swift core team invented the \(interpolation) syntax.
>> """)
>> Although heredocs could make a good addition to Swift eventually, there are
>> good reasons to defer them for now. Please see the "Alternatives considered"
>> section for details.
>>
>>
>> First-class regular expressions
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#first-class-regular-expressions>
>>
>> Members of the core team are interested in regular expressions, but they
>> don't want to just build a literal that wraps PCRE or libicu; rather, they
>> aim to integrate regexes into the pattern matching system and give them a
>> deep, Perl 6-style rethink. This would be a major effort, far beyond the
>> scope of Swift 3.
>>
>> In the meantime, the e modifier and perhaps other string literal modifiers
>> will make it easier to specify regular expressions in string literals for
>> use with NSRegularExpression and other libraries accessible from Swift.
>>
>>
>> Alternatives considered
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#alternatives-considered>
>>
>>
>> Requiring no continuation character
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#requiring-no-continuation-character>
>>
>> The main alternative is to not require a continuation quote, and simply
>> extend the string literal from the starting quote to the ending quote,
>> including all newlines between them. For example:
>>
>> let xml = "<?xml version=\"1.0\"?>
>> <catalog>
>> <book id=\"bk101\" empty=\"\">
>> <author>\(author)</author>
>> </book>
>> </catalog>"
>> This alternative is extensively discussed in the "Rationale" section above.
>>
>>
>> Skip multiline strings and just support heredocs
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#skip-multiline-strings-and-just-support-heredocs>
>>
>> There are definitely cases where a heredoc would be a better solution, such
>> as generated code or code which is mostly literals with a little Swift
>> sprinkled around. On the other hand, there are also cases where multiline
>> strings are better: short strings in code which is meant to be read. If a
>> single feature can't handle them both well, there's no shame in supporting
>> the two features separately.
>>
>> It makes sense to support multiline strings first because:
>>
>> They extend existing syntax instead of introducing new syntax.
>>
>> They are much easier to parse; heredocs require some kind of mode in the
>> parser which kicks in at the start of the next line, whereas multiline
>> string literals can be handled in the lexer.
>>
>> As discussed in "Rationale", they offer better diagnostics, code formatting,
>> and visual scannability.
>>
>>
>> Use a different delimiter for multiline strings
>> <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#use-a-different-delimiter-for-multiline-strings>
>>
>> The initial suggestion was that multiline strings should use a different
>> delimiter, """, at the beginning and end of the string, with no continuation
>> characters between. Like heredocs, this might be a good alternative for
>> certain use cases, but it has the same basic flaws as the "no continuation
>> character" solution.
>>
>
>> That might be a useful document to have, but I worry that we'll end up
>> seeing the string feature proposals signed in triplicate, sent in, sent
>> back, queried, lost, found, subjected to public inquiry, lost again, and
>> finally buried in soft peat for three months and recycled as firelighters,
>> all to end up with basically the same proposals but with slightly
>> different keywords. Not every decision needs that level of explicit, deep
>> documentation. Some things you can think about, experiment with, discuss,
>> and do.
>
>
> Yeah, I think you are probably right here. I actually think with the
> additions to your proposal it covers almost all of the other suggestions
> regarding string literals or at least mentions them as alternatives. Thanks
> so much for spending the time putting together the proposal! I have no idea
> how you find the time to follow and participate in what seems like every
> Swift evolution thread, but it’s awesome!
>
> Tyler
>
>
>> --
>> Brent Royal-Gordon
>> Architechies
>>
>> _______________________________________________
>> swift-evolution mailing list
>> [email protected] <mailto:[email protected]>
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>