Hi,

I propose adding multiline string literals to Swift 3.

I have written up a proposal as a GitHub Gist, here:
https://gist.github.com/michaelpeternell/a4da4185de78808f4575a836c50debbd

Can someone with write access push it to the swift-evolution repository, please?

Thanks.

Regards,
Michael

Multiline String literals

Proposal: SE-NNNN
Author: Michael Peternell <https://www.github.com/michaelpeternell>
Status: Awaiting review 
<https://github.com/apple/swift-evolution/blob/master/0000-template.md#rationale>
Review manager: TBD
 
Introduction

Multi-line string literals allow text that spans multiple lines to be 
included verbatim in a string literal. The string may even contain quote 
characters (" or '), which do not need to be escaped.

 
Motivation

Including many lines of text in a program often looks awkward, e.g. a 
JSON string where every quote needs to be escaped: 
"{\"response\":{\"result\":\"OK\"}}". With multi-line string literals, we can 
write """{"response":{"result":"OK"}}""" instead. Note that any valid JSON 
document without backslash escapes can be pasted as-is into a """3-quote string 
literal""", because 3 consecutive quotes (""") cannot appear in valid JSON; 
backslash escapes inside the JSON would still be processed as Swift escape 
sequences. (Why would you want to have a JSON string in a program? Maybe you 
are writing unit tests for a JSON parser.) Another usage example is below.

Some people were concerned that a string block may break the indentation of the 
surrounding code. For example:

            // some deeply indented code
            doSomeStuff(2, 33.1)
            print("""Usage: \(program_name) <PARAM-X> <PARAM-Y> filename
Example: \(program_name) 3 1 countries.csv
This will print the 1st column of the 3rd non-empty non-header line from
countries.csv
""")
            exit(2)
First, you don't have to use them; the existing double-quoted string literals 
remain available. But to fix the indentation problem with multi-line strings, 
you may use a HEREDOC syntax. The example can be rewritten as:

            // some deeply indented code
            doSomeStuff(2, 33.1)
            print(<<USAGE_END)
                Usage: \(program_name) <PARAM-X> <PARAM-Y> filename
                Example: \(program_name) 3 1 countries.csv
                This will print the 1st column of the 3rd non-empty
                non-header line from countries.csv

                USAGE_END
            exit(2)
This works unambiguously, as long as you don't mix tabs and spaces in your 
source code file.

 
Proposed solution

This proposal introduces four new forms of a String literal:

 
The Python-like string literal

Everything between """ and """ belongs to the string. Escape sequences (\n, \t, 
\\, etc.) and string interpolation work as usual. The rules are the same as for 
a normal double-quoted (") string literal. However, a single " doesn't need to 
be escaped, so a string literal like """<a href="#" 
onclick="openABCWindow(22);return false">details</a>""" would be valid Swift. 
Newlines, spaces, and tabs are not treated in any special way, so the following 
string

"""
 test
 test"""
could also be written as "\n test\n test".

 
The HEREDOC with string interpolation

The HEREDOC starts with <<, followed by an identifier. The string literal 
starts on the next line and ends at the line that contains the HEREDOC 
identifier. Example:

print(<<USAGE)
    funnyProgram [-v] [-h]
    This program tells a joke. Possible options:
      -v | --version ... Shows version information
      -h | --help ...... Shows this list of options
USAGE
exit(2)
This string literal does not contain a trailing newline character; otherwise, 
it would not be possible to create a HEREDOC literal without a trailing 
newline. (The print() function will add a newline, though.) In the example 
above, each line of the string is indented by 4 spaces. If you want to strip 
the leading spaces from each line, you have to indent the ending identifier by 
the same amount of whitespace. Thus, a better example would look like this:

print(<<USAGE)
    funnyProgram [-v] [-h]
    This program tells a joke. Possible options:
      -v | --version ... Shows version information
      -h | --help ...... Shows this list of options
    USAGE
exit(2)
The leading indentation has to be all-spaces or all-tabs, but never a mixture 
of them.
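
The stripping rule can be sketched in ordinary Swift. This is only an illustration of the behaviour described above, not part of the proposal: the helper names terminatorIndent and heredocText are hypothetical, and the sketch assumes the lexer has already split the literal into its body lines and its terminator line.

```swift
// Hypothetical helper: returns the whitespace in front of `identifier` when
// `line` consists of nothing but whitespace followed by the identifier.
func terminatorIndent(of line: String, identifier: String) -> String? {
    guard line.hasSuffix(identifier) else { return nil }
    let indent = String(line.dropLast(identifier.count))
    let isWhitespaceOnly = indent.allSatisfy { $0 == " " || $0 == "\t" }
    return isWhitespaceOnly ? indent : nil
}

// Hypothetical helper: strips `indent` from every body line that starts with
// it and joins the lines without adding a trailing newline. Lines that lack
// the prefix are kept unchanged here; this sketch does no validation.
func heredocText(body: [String], indent: String) -> String {
    return body
        .map { $0.hasPrefix(indent) ? String($0.dropFirst(indent.count)) : $0 }
        .joined(separator: "\n")
}

let body = [
    "    funnyProgram [-v] [-h]",
    "      -h | --help ...... Shows this list of options",
]

// Terminator at column 0: the 4 leading spaces stay in the string.
let unindented = heredocText(
    body: body,
    indent: terminatorIndent(of: "USAGE", identifier: "USAGE") ?? "")

// Terminator indented by 4 spaces: those 4 spaces are stripped from each line.
let indented = heredocText(
    body: body,
    indent: terminatorIndent(of: "    USAGE", identifier: "USAGE") ?? "")

print(unindented)
print(indented)
```

Deriving the prefix from the terminator line alone means the compiler never has to guess how much indentation the author intended to keep.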

 
The HEREDOC without string interpolation

To have no string interpolation or escape sequences at all, you can add single 
quotes around the HEREDOC-identifier. Example:

print(<<'USAGE')
    funnyProgram [-v] [-h]
    This program tells a joke. Possible options:
      -v | --version ... Shows version information
      -h | --help ...... Shows this list of options
         \( ^^ don't worry, be happy :-)
    USAGE
exit(2)
In all other regards, this string literal behaves the same as the HEREDOC with 
string interpolation.

 
«Guillemets» and English “typographical quotes”

Swift already allows emoji in names of all sorts. The following code is valid 
Swift:

for 🐟 in sea {
    🐟.makeSushi()
}
Swift is a playful language. Allowing «Guillemets» and “typographical quotes” 
is the next logical step. To support strings both with and without 
interpolation, one form should allow string interpolation and escape sequences 
while the other should not. I propose that «Guillemets» are used for strings 
without interpolation, so «\» is a valid string literal consisting of a single 
backslash character. “"\(localizedName)"” is a string containing a double 
quote character (") followed by the contents of localizedName, followed by 
another double quote character ("). (Note that the reverse is already 
possible: "“\(localizedName)”".)

These literals behave the same as the Python-like string literal above.

 
Detailed design

Python-like strings, Guillemets and English typographical quotes are already 
described in detail above. Only the HEREDOCs may cause misunderstandings.

The following code should be invalid:

    print(<<EOT)
    hello world
        is this a proper string literal?
        EOT
because the ending EOT has more indentation than one of the lines in the string 
literal.

The following code is valid though:

    // I replaced spaces with _underscores_ below:
____print(<<EOT)
________hello world
____
________is this a valid string?
________EOT
Although the second line has less indentation than the other lines, this is not 
a problem because the line is empty.

The following string literal contains 3 spaces in the second line, and it ends 
with a single newline character:

    // I replaced spaces with _underscores_ below:
____print(<<EOT)
________hello world
___________
________is this a valid string?

________EOT
The following string literal is invalid:

    // I replaced spaces with _underscores_ below.
    // I replaced tab characters with TAB! below.
____print(<<EOT)
________hello world
____
TAB!____is this a valid string
________EOT
    // => no
With tabs configured to look exactly like 4 spaces, the code above looks valid 
but it is not. There is no sane way to decide whether (TAB + 4 spaces) is less 
than, the same as, or more than 8 spaces. Such code should be rejected.
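
To see why such a comparison cannot be decided, here is a small sketch (an illustration only; the function visualWidth and the tab widths tried are assumptions, not part of the proposal) that computes the column at which an indentation ends for a given tab-stop width:

```swift
// Column reached after rendering `indent`, assuming tab stops every
// `tabWidth` columns. Hypothetical helper for illustration only.
func visualWidth(of indent: String, tabWidth: Int) -> Int {
    var column = 0
    for character in indent {
        if character == "\t" {
            // A tab advances to the next tab stop.
            column += tabWidth - (column % tabWidth)
        } else {
            column += 1
        }
    }
    return column
}

let tabPlusFourSpaces = "\t    "
let eightSpaces = "        "

for tabWidth in [2, 4, 8] {
    print(tabWidth,
          visualWidth(of: tabPlusFourSpaces, tabWidth: tabWidth),
          visualWidth(of: eightSpaces, tabWidth: tabWidth))
}
// prints:
// 2 6 8
// 4 8 8
// 8 12 8
```

Depending on the editor setting, the mixed indentation is narrower than, equal to, or wider than 8 spaces, so the compiler has no setting-independent answer.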

The author's opinion is that tabs and spaces should not be mixed, and that this 
restriction will not be a problem in almost all use cases.

The following HEREDOC is also invalid, although the amount of whitespace is 
consistent:

    // I replaced spaces with _underscores_ below.
    // I replaced tab characters with TAB! below.
____print(<<EOT)
____TAB!hello world
____TAB!
____TAB!is this a valid string?
____TAB!__good question.
____TAB!EOT
Tabs and spaces just shouldn't be mixed in the indentation. The following 
snippet is fine, however, although it is inconsistent in its use of tabs and 
spaces:

    // I replaced spaces with _underscores_ below:
    // I replaced tab characters with TAB! below.
____print(<<EOT)
________hello world
________
________is this a valid string?
________TAB!Yes, indeed, even though this line started with a tab
________EOT
In the example above, the leading space on each line consists of 8 spaces. 
Everything after these 8 spaces should become part of the string literal as-is.
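
The rules illustrated by the examples above can be summarised in a short sketch. It is written in ordinary Swift and is only one possible reading of the rules, not a specification: the names HeredocError and heredocValue are hypothetical, and treating any whitespace-only line that is shorter than the prefix as an empty line is an assumption based on the second example.

```swift
enum HeredocError: Error, Equatable {
    case mixedIndentation                    // terminator indent mixes tabs and spaces
    case insufficientIndentation(line: Int)  // a non-empty line lacks the prefix
}

/// `lines` are the source lines between the opening line and the terminator;
/// `indent` is the whitespace in front of the terminating identifier.
func heredocValue(lines: [String], indent: String) throws -> String {
    // The prefix has to be all spaces or all tabs.
    let allSpaces = indent.allSatisfy { $0 == " " }
    let allTabs = indent.allSatisfy { $0 == "\t" }
    guard allSpaces || allTabs else { throw HeredocError.mixedIndentation }

    var stripped: [String] = []
    for (index, line) in lines.enumerated() {
        if line.hasPrefix(indent) {
            // Everything after the prefix is kept as-is, including tabs.
            stripped.append(String(line.dropFirst(indent.count)))
        } else if line.allSatisfy({ $0 == " " || $0 == "\t" }) {
            // Assumption: a short whitespace-only line counts as empty.
            stripped.append("")
        } else {
            throw HeredocError.insufficientIndentation(line: index + 1)
        }
    }
    return stripped.joined(separator: "\n")
}

let eightSpaces = "        "

// The example with 3 spaces in the second line and a trailing newline.
let valid = try? heredocValue(
    lines: ["        hello world",
            "           ",
            "        is this a valid string?",
            ""],
    indent: eightSpaces)

// TAB + 4 spaces does not start with 8 spaces, so this literal is rejected.
let invalid = try? heredocValue(
    lines: ["        hello world", "\t    is this a valid string"],
    indent: eightSpaces)
```

Comparing prefixes character by character, rather than by visual width, is what makes the check independent of any tab-width setting.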

 
Impact on existing code

This is a purely additive feature. Code that uses these multi-line string 
literals did not compile with previous versions of Swift, so no existing code 
can break because of this change.

 
Alternatives considered

Introduce just the """Python-like string literal"""

Do nothing.

 
Rationale

...
