> As far as mixed whitespace, I think the only sane thing to do would be to
> only allow leading tabs *or* spaces. Mixing tabs and spaces in the leading
> whitespace would be a syntax error. All lines in the string would need to
> use tabs or all lines use spaces, you could not have one line with tabs and
> another with spaces. This would keep the compiler out of the business of
> making any assumptions or guesses, would not be a problem often, and would be
> very easy to fix if it ever happens accidentally.

The sane thing to do would be to require every line be prefixed with *exactly*
the same sequence of characters as the closing delimiter line. Anything else
(except perhaps a completely blank line, to permit whitespace trimming) would
be a syntax error.
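
To make that rule concrete, here is a minimal sketch of the check, assuming the parser has already split the literal into lines and captured the whitespace in front of the closing delimiter. The names `stripMargin` and `MarginError` are hypothetical and only illustrate the idea; they are not part of any proposal.

```swift
// Hypothetical error type for this sketch.
enum MarginError: Error, Equatable {
    case mismatchedPrefix(line: Int)
}

/// Strips `margin` (the exact characters preceding the closing delimiter)
/// from every line. Whitespace-only lines are allowed through as empty
/// lines; any other line that does not start with `margin` is an error.
func stripMargin(from lines: [String], margin: String) throws -> [String] {
    var result: [String] = []
    for (index, line) in lines.enumerated() {
        if line.hasPrefix(margin) {
            // Exact match on the prefix: tabs and spaces are never equated.
            result.append(String(line.dropFirst(margin.count)))
        } else if line.allSatisfy({ $0 == " " || $0 == "\t" }) {
            // Blank line, possibly trimmed by an editor.
            result.append("")
        } else {
            throw MarginError.mismatchedPrefix(line: index + 1)
        }
    }
    return result
}
```

Because the comparison is character-for-character, a line indented with a tab under a four-space margin is rejected rather than guessed at.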
But take a moment to consider the downsides before you leap to adopt this
solution.
1. You have introduced tab-space confusion into the equation.
2. You have introduced trailing-newline confusion into the equation.
3. The #escaped and #marginStripped keywords are now specific to multiline
strings, but #escaped in particular would be attractive in single-line strings
for tasks like regexes. You will have to invent a different syntax for it there.
4. This form of `"""` does not help you avoid escaping `"` in a single-line
string; you now have to invent a separate mechanism for that.
5. You can't necessarily look at a line and tell whether it's code or string.
And—especially with the #escaped-style constructs—the delimiters don't
necessarily "pop" visually; they're too small and easy to miss compared to the
text they contain. In extremis, you actually have to look at the entire file
from top to bottom, counting the `"""`s to figure out whether you're in a
string or not. Granted, you *usually* can tell from context, but it's a far cry
from what continuation quotes offer.
6. You are now forcing *any* string literal of more than one line to include
two extra lines devoted wholly to the quoting syntax. In my Swift-generating
example, that would change shorter snippets like this:
    code += "
            "    static var messages: [HTTPStatus: String] = [
            ""
into things like this:
    code += """
                static var messages: [HTTPStatus: String] = [
            """
To my mind, the second syntax is actually *heavier*, despite not requiring
every line be marked, because it takes two extra lines and additional
punctuation.
7. You are also introducing visual ambiguity into the equation—in the above
example, the left margin is now ambiguous to the eye (even if it's not
ambiguous to the compiler). You could recover it by permitting non-whitespace
prefix characters:
    code += """
            |
            |    static var messages: [HTTPStatus: String] = [
            |
            |"""
...but then we're back to annotating every line, *plus* we have the leading and
trailing `"""` lines. Worst of both worlds.
8. In longer examples, you are dividing the expression in half in a way that
makes it difficult to read. For instance, consider this code:
    socket.send(
        """ #escaped #marginStripped
        <?xml version="1.0"?>
        <catalog>
            <book id="bk101" empty="">
                <author>\(author)</author>
                <title>XML Developer's Guide</title>
                <genre>Computer</genre>
                <price>44.95</price>
                <publish_date>2000-10-01</publish_date>
                <description>An in-depth look at creating applications with
                XML.</description>
            </book>
        </catalog>
        """.data(using: NSUTF8StringEncoding))
The effect—particularly with even larger literals than this—is not unlike
pausing in the middle of reading an article to watch a movie. What were we
talking about again?
This problem is neatly avoided by a heredoc syntax, which keeps the expression
together and then collects the string below it:
    socket.send(""".data(using: NSUTF8StringEncoding))
        <?xml version="1.0"?>
        <catalog>
            <book id="bk101" empty="">
                <author>\(author)</author>
                <title>XML Developer's Guide</title>
                <genre>Computer</genre>
                <price>44.95</price>
                <publish_date>2000-10-01</publish_date>
                <description>An in-depth look at creating applications with
                XML.</description>
            </book>
        </catalog>
        """
(I'm assuming there's no need for #escaped or #marginStripped; they're both
enabled by default.)
* * *
Let's actually talk about heredocs. Leaving aside indentation (which can be
applied to either feature) and the traditional token choices (which can be
changed), I think these are the pros of heredocs compared to Python
triple-quotes:
H1: Doesn't break up expressions, as discussed above.
H2: Literal content formatting is completely unaffected by code formatting,
including the first and last lines.
Here are the pros of Python triple-quotes compared to heredocs:
P1: Simpler to explain: "like a string literal, but really big".
P2: Lighter syntactic weight, enough to make `"""` usable as a single-line
syntax.
P3: Less trailing-newline confusion.
(There is one other difference: `"""` is simpler to parse, so we might be able
to get it in Swift 3, whereas heredocs probably have to wait for Swift 4. But I
don't think we should pick one feature over another merely so we can get it
sooner. If you plan to eventually introduce both features, as I plan to
eventually have both continuation quotes and heredocs, it's one thing to
introduce each of them as soon as you can; it's another to choose one feature
over the other specifically because you can implement it sooner.)
But the design you're discussing trades P2 and P3—and frankly, with the
mandatory newlines, part of P1—away in an attempt to get H2. So we end up
deciding between these two selling points:
* This triple-quotes design: Simpler to explain.
* Heredocs: Doesn't break up expressions.
Simplicity is good, but I really like the code reading benefits of heredocs.
Your code is your code and your text is your text. The interface between them
is a bit funky, but within their separate worlds, they're both pretty nice.
* * *
Either way, heredocs or multiline-only triple quotes could be tweaked to
support indentation by using the indentation of the end delimiter. But as I
explained above, I don't think that's a great idea for either triple quotes
*or* heredocs—the edge of the indentation is not visually well defined enough.
That's why I came to the conclusion that trying to cram every multiline literal
into one syntax is trying to cram too many peg shapes into one hole shape.
Indentation should *only* be supported by a dedicated syntax which is also
designed for the smallest multiline strings, where indentation support is most
useful. A separate feature without indentation support should handle longer
strings, where the length alone is so disruptive to the flow of your code that
there's just no point even trying to indent them to match (and the break with
normal indentation itself assists you in finding the end of the string).
And I think that the best choice for the first feature is continuation quotes,
and for the second is heredocs. Triple-quote syntaxes—either Python's or this
modification—are jacks of all trades, but masters of none.
--
Brent Royal-Gordon
Architechies