I've written up a proposal for multi-line string literals. Before proposing it
officially, I would like to get some informal feedback.
How can I propose it officially? Do I have to convert it to Markdown? I have no
idea how to create a Markdown version of this, with all the quotes and funny
characters in it ;)
-Michael
***
MULTI-LINE STRING LITERALS
- Proposal: SE-xxxx
- Author: Michael Peternell
- Status:
- Review manager:
INTRODUCTION
Multi-line string literals allow text that spans multiple lines to be included
verbatim in a string literal. The string may even contain quote characters
(" or '), which do not have to be escaped.
MOTIVATION
Including many lines of text in a program often looks awkward, e.g. a
JSON string where every quote needs to be escaped:
"{\"response\":{\"result\":\"OK\"}}". With multi-line string literals, we can
write """{"response":{"result":"OK"}}""" instead. Note that any valid JSON can
be pasted into a """3-quote string literal""" without escaping its quotes,
because 3 consecutive quotes (""") cannot appear in valid JSON. (Why would you
want to have a JSON string in a program? Maybe you are writing unit tests for a
JSON parser.) Another usage example is below.
Some people had concerns that a string block may break the indentation of the
code. E.g.
        // some deeply indented code
        doSomeStuff(2, 33.1)
        print("""Usage: \(program_name) <PARAM-X> <PARAM-Y> filename
Example: \(program_name) 3 1 countries.csv
This will print the 1st column of the 3rd non-empty non-header line from
countries.csv
""")
        exit(2)
That's the reason why there is also a HEREDOC-syntax in the proposal that can
solve this problem. The example can be rewritten as:
        // some deeply indented code
        doSomeStuff(2, 33.1)
        print(<<USAGE_END)
            Usage: \(program_name) <PARAM-X> <PARAM-Y> filename
            Example: \(program_name) 3 1 countries.csv
            This will print the 1st column of the 3rd non-empty
            non-header line from countries.csv

            USAGE_END
        exit(2)
This works unambiguously, as long as you don't mix tabs and spaces in your
source code file.
PROPOSED SOLUTION
This proposal introduces three new forms of a String literal:
let INTERPOLATION = "String interpolation"
1. The """Python-style string literal. 3 Quotes (") at the beginning, 3 Quotes
at the end, and Swift \(INTERPOLATION) is possible."""
2. The <<HERE_DOC, where the string literal starts on the next line:
    A hereDoc may contain multiple lines. Leading space on each
    line is automatically truncated if the HERE_DOC delimiter
    is also indented. \(INTERPOLATION) is possible.
    HERE_DOC
3. A <<'HERE_DOC' with single quotes around the token:
    This is almost the same as a heredoc without single quotes, but text is
    included as-is.
    You may include \ or " or ' or whatever; (\") is just a backslash followed
    by a double quote.
    The leading space rule is the same as for the other HERE_DOC.
    Swift String interpolation is not possible here.
    HERE_DOC
DETAILED DESIGN
The first type of String (the """Python-style multiline string""") behaves
exactly like the "ordinary string literal", except for a few differences:
- A line break doesn't result in an error, but is included in the string's
value.
- An included " doesn't end the string and does not need to be escaped.
- If you want to include """ in the string, you have to write ""\". This is a
rare use case, and if you really need to do that, you may as well use one of
the HERE_DOC styles instead.
The second type of String (the <<HERE_DOC with string interpolation) includes
all lines after the line where <<HERE_DOC appears, up to the HERE_DOC delimiter
line. The last newline before the HERE_DOC delimiter line is automatically
removed from the string; otherwise it would not be possible to create a
HERE_DOC string literal that does not end with a newline character. If you want
the string literal to end with a newline character, you need an empty line
before the HERE_DOC delimiter line (as in the "usage" example above).
The HERE_DOC delimiter line contains optional whitespace at the beginning,
followed by the HERE_DOC token. If the line contains leading whitespace, all
lines within the literal have to contain exactly the same amount of leading
whitespace. E.g. if the HERE_DOC line contains 4 spaces, followed by
"HERE_DOC", each line in the string literal has to start with 4 spaces as well
(using one tab instead, or less whitespace, would be a parse error). Empty
lines within the string literal are exempt from this requirement; they just
translate to "\n".
(Fine print: if the HEREDOC delimiter line is "\t\tHEREDOC" and one of the
lines in the string literal consists only of spaces, it is not decidable
whether the line should translate to "\n" (if the tabs are at least as wide as
those spaces) or to some remaining spaces followed by "\n" (if the tabs are
narrower), so this would also result in a parse error. The whitespace before
the HERE_DOC token on the delimiter line must contain only spaces or only tabs,
but not a mixture of both. These rules are a bit complicated for the language
implementor, but for the user of the feature they have an important advantage:
if the code compiles, the string literal will behave as expected. Just don't
mix tabs and spaces and you'll be fine.)
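To make the whitespace rules above concrete, here is a minimal sketch of how the
value of a HERE_DOC could be computed from its lines. It is an illustration
only, not part of the proposal: the names hereDocValue and HereDocError are
made up for this example, escape sequences and interpolation are ignored, and
it assumes that a line may carry additional indentation after the delimiter's
leading whitespace, which is then kept in the string.

```swift
// Hypothetical error type for this sketch; not part of the proposal.
enum HereDocError: Error, Equatable {
    case delimiterMismatch
    case mixedTabsAndSpaces
    case badIndentation(line: Int)
}

// Computes the string value of a HERE_DOC from its body lines and its
// delimiter line. Escape and interpolation processing is out of scope here.
func hereDocValue(bodyLines: [String],
                  delimiterLine: String,
                  token: String) throws -> String {
    let isBlank: (Character) -> Bool = { $0 == " " || $0 == "\t" }

    // Split the delimiter line into leading whitespace and the token.
    let indent = String(delimiterLine.prefix(while: isBlank))
    guard String(delimiterLine.dropFirst(indent.count)) == token else {
        throw HereDocError.delimiterMismatch
    }

    // The delimiter's indentation must be only spaces or only tabs.
    if indent.contains(where: { $0 == " " }) && indent.contains(where: { $0 == "\t" }) {
        throw HereDocError.mixedTabsAndSpaces
    }

    var lines: [String] = []
    for (index, line) in bodyLines.enumerated() {
        if line.isEmpty {
            // Empty lines are exempt from the indentation requirement.
            lines.append("")
        } else if line.hasPrefix(indent) {
            // Strip exactly the delimiter's indentation (assumption: any
            // further indentation belongs to the string).
            lines.append(String(line.dropFirst(indent.count)))
        } else {
            // Too little whitespace, or tabs where spaces were expected.
            throw HereDocError.badIndentation(line: index + 1)
        }
    }

    // Joining without a trailing separator drops the last newline before
    // the delimiter line, as the design requires.
    return lines.joined(separator: "\n")
}

let usageText = try? hereDocValue(
    bodyLines: ["    Usage: tool <file>", "", "      extra indent is kept", ""],
    delimiterLine: "    USAGE_END",
    token: "USAGE_END")
// usageText == "Usage: tool <file>\n\n  extra indent is kept\n"
```

Note that the empty last body line is what produces the trailing newline, and
that a spaces-only line under a tab-indented delimiter is rejected rather than
guessed at.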
The third type of string is exactly the same as the second type, with the only
difference that the <<HERE_DOC syntax is changed to <<'HERE_DOC', and that all
string interpolation and escape sequences are disabled within the literal. The
end token is still HERE_DOC without single quotes, and not 'HERE_DOC'. The
rules about leading whitespace on the HERE_DOC delimiter line are the same as
for the second type.
For the HERE_DOC token, everything that is a valid variable name is allowed, so
<<hello and <<END_OF_XML are both valid, but <<2442 is not. Furthermore, every
token that matches /[a-zA-Z]+/ is also valid, so <<class should be okay as well.
(The usual practice is to use SCREAMING_SNAKE_CASE tokens as delimiters.)
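The token rule can be sketched as a small check. This is a simplification
under stated assumptions: the function name isValidHereDocToken is made up,
and "valid variable name" is approximated by ASCII letters, digits and
underscores, whereas real Swift identifiers also allow many Unicode
characters. Because keywords such as class consist of letters only, the same
pattern also covers the /[a-zA-Z]+/ case.

```swift
// Hypothetical helper: ASCII-only approximation of the HERE_DOC token rule.
func isValidHereDocToken(_ token: String) -> Bool {
    guard let first = token.first else { return false }
    let isLetter: (Character) -> Bool = { $0.isASCII && $0.isLetter }
    let isDigit: (Character) -> Bool = { $0.isASCII && $0.isNumber }

    // The first character must be a letter or an underscore, never a digit.
    guard isLetter(first) || first == "_" else { return false }

    // The remaining characters may be letters, digits or underscores.
    return token.allSatisfy { isLetter($0) || isDigit($0) || $0 == "_" }
}

let tokenChecks = ["hello", "END_OF_XML", "2442", "class"].map(isValidHereDocToken)
// tokenChecks == [true, true, false, true]
```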
IMPACT ON EXISTING CODE
This is a purely additive feature. Code that uses these multi-line string
literals does not compile with previous versions of Swift, so no existing code
can break because of this change.
ALTERNATIVES CONSIDERED
1. Just copy all String-handling rules from Perl ;)
2. String literals of the form
_"text text
"text text"_
I don't like the continuation quote, so it doesn't solve the problem that I
am trying to solve with this proposal. The same is true for a string literal
where you would have to start each line with \\ .
3. eXML"a string literal that starts with e, followed by some token, and that
ends with a quote (") followed by the same token"XML. This has the advantage
that you can put anything between the start and the end, and that you can
choose a delimiter. It's a flexible solution. I prefer HERE_DOCs though,
because they are an already well-known programming language construct.
4. Do nothing, and just use string concatenation: "this string\n"+
"with newlines in it\n" works well. Maybe the optimizer can optimize this away
anyways, so there wouldn't even be a performance cost.