Re: [swift-evolution] [Review] SE-0168: Multi-Line String Literals

Jarod Long via swift-evolution Wed, 12 Apr 2017 17:52:08 -0700

Thanks Brent, I really appreciate the thoughtful response. Apologies for 
anything I overlooked previously.


I agree with most of your points, although I still find myself preferring the 
common-whitespace logic and leading/trailing newline stripping when considering 
the pros and cons. It doesn't seem likely to gain traction though, so I won't 
spend more time on it.

Thanks again!

Jarod

On Apr 12, 2017, 16:35 -0700, Brent Royal-Gordon <[email protected]>, 
wrote:
> > On Apr 12, 2017, at 11:58 AM, Jarod Long via swift-evolution 
> > <[email protected]> wrote:
> >
> > On a separate note, I'd like to bring up the de-indentation behavior I 
> > described earlier again. I still feel that having the position of the 
> > closing delimiter determine how much whitespace is de-indented is not very 
> > natural or intuitive, since I don't think there is any precedent in 
> > standard Swift styling to indent a closing delimiter to the same level as 
> > its content.
>
> String literal delimiters are very different from other delimiters because 
> they switch the parser into a different mode where characters are interpreted 
> in vastly different ways, and every character has a significant meaning. For 
> instance, it's good practice to put either a space, or a newline and 
> indentation, between array and dictionary literal delimiters or curly 
> brackets and their content, but this is not possible with a string literal 
> because the space would count as part of the content. The closing delimiter 
> works the same way: you can't outdent it because whitespace is significant 
> inside a string literal, so doing so would change the meaning.
>
> I think that this probably seems way weirder on paper than it really is in 
> practice. I recommend that you try it and see how it feels.
>
> > Stripping the most common whitespace possible from each line seems to be a 
> > much more intuitive and flexible solution in terms of formatting, and it's 
> > still compatible with the proposed formatting if that's anyone's preference.
>
> I discuss this at length in the Rationale section for indentation stripping. 
> If you'll forgive me for quoting myself:
>
> > We could instead use an algorithm where the longest common whitespace 
> > prefix is removed from all lines; in well-formed code, that would produce 
> > the same behavior as this algorithm. But when not well-formed—when one line 
> > was accidentally indented less than the delimiter, or when a user mixed 
> > tabs and spaces accidentally—it would lead to valid, but incorrect and 
> > undiagnosable, behavior. For instance, if one line used a tab and other 
> > lines used spaces, Swift would not strip indentation from any of the lines; 
> > if most lines were indented four spaces, but one line was indented three, 
> > Swift would strip three spaces of indentation from all lines. And while you 
> > would still be able to create a string with all lines indented by indenting 
> > the closing delimiter less than the others, many users would never discover 
> > this trick.
>
> Let me provide an example to illustrate what I'm talking about. Suppose you 
> want to say this:
>
> ····xml += """\↵
> ············<book id="bk\(id)">↵
> ················<author>\(author)</author>↵
> ················<title>\(title)</title>↵
> ················<genre>\(genre)</genre>↵
> ················<price>\(price)</price>↵
> ············</book>↵
> ············"""↵
>
> But instead, you miss just one little insignificant character:
>
> ····xml += """\↵
> ···········<book id="bk\(id)">↵
> ················<author>\(author)</author>↵
> ················<title>\(title)</title>↵
> ················<genre>\(genre)</genre>↵
> ················<price>\(price)</price>↵
> ············</book>↵
> ············"""↵
>
> This is the kind of mistake you will almost certainly never notice by hand 
> inspection. You probably can't see the mistake without looking very 
> carefully, and this is with invisible whitespace replaced with visible dots! 
> But in the common-whitespace-prefix design, it's perfectly valid, and 
> generates this:
>
> <book id="bk\(id)">↵
> ·····<author>\(author)</author>↵
> ·····<title>\(title)</title>↵
> ·····<genre>\(genre)</genre>↵
> ·····<price>\(price)</price>↵
> ·</book>↵
> ·
>
> That is not what you wanted. I'm pretty sure it's almost *never* what you 
> want. But it's valid, it's going to be accepted, and it's going to affect 
> every single line of the literal in a subtle way. (Plus the next line, thanks 
> to that trailing space!) It's not something we can warn about, either, 
> because it's perfectly valid. To fix it, you'll have to notice it's wrong and 
> then work out why that happened.
>
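
To make the failure mode above concrete, here is a minimal Swift sketch of common-whitespace-prefix stripping. It is an illustration only, not the compiler's implementation: `stripCommonWhitespacePrefix` and `spaces` are hypothetical helpers, the literal is modeled as an array of lines whose last element is the closing delimiter's indentation, and interpolation is ignored.

```swift
// Hypothetical helper (not from the proposal): removes the longest common
// leading run of spaces/tabs from every line.
func stripCommonWhitespacePrefix(_ lines: [String]) -> [String] {
    // Leading spaces and tabs of a single line.
    func leadingWhitespace(_ line: String) -> String {
        return String(line.prefix(while: { $0 == " " || $0 == "\t" }))
    }

    guard let first = lines.first else { return [] }
    var common = leadingWhitespace(first)
    for line in lines.dropFirst() {
        let whitespace = leadingWhitespace(line)
        // Keep only the characters on which both prefixes agree.
        // A tab never matches a space, so mixed indentation shrinks the prefix.
        let shared = zip(common, whitespace).prefix(while: { pair in pair.0 == pair.1 })
        common = String(shared.map { pair in pair.0 })
    }
    return lines.map { String($0.dropFirst(common.count)) }
}

// Illustrative input: the first line has 11 spaces instead of 12.
func spaces(_ count: Int) -> String { return String(repeating: " ", count: count) }
let mistaken = [
    spaces(11) + "<book>",
    spaces(16) + "<author>A</author>",
    spaces(12) + "</book>",
    spaces(12),                        // indentation of the closing delimiter
]
let stripped = stripCommonWhitespacePrefix(mistaken)
// stripped == ["<book>", "     <author>A</author>", " </book>", " "]
```

Because one line is indented 11 spaces, the common prefix shrinks to 11 and every other line keeps a stray space, yet the function has no way to tell that anything went wrong.
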
> In the proposed design, on the other hand, we have a single source of truth 
> for indentation: the last line tells us how much we should remove. That means 
> we can actually call a mistake a mistake. The very same example, run through 
> the proposed algorithm, produces this, plus a warning on the first line:
>
> ···········<book id="bk\(id)">↵
> ····<author>\(author)</author>↵
> ····<title>\(title)</title>↵
> ····<genre>\(genre)</genre>↵
> ····<price>\(price)</price>↵
> </book>↵
>
> Notice that there is only one line that comes out incorrectly, that it's the 
> line which has the mistake, that the mistake is large and noticeable in the 
> output, *and* that we were also able to emit a compile-time warning pointing 
> to the exact line of code that was mistaken. That outcome is night-and-day 
> better.
>
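
For comparison, the delimiter-driven rule described above can be sketched the same way. This is a simplified model under stated assumptions, not the compiler's code: `stripDelimiterIndentation`, `StripResult`, and `indent` are hypothetical names, warnings are recorded as line indices, and blank lines get no special treatment.

```swift
// Hypothetical result type: stripped lines plus the zero-based indices of
// lines whose indentation did not match the closing delimiter.
struct StripResult {
    var lines: [String] = []
    var warnings: [Int] = []
}

// Hypothetical helper: the closing delimiter's indentation is the single
// source of truth. Lines that start with it are stripped; all other lines
// are left untouched and flagged.
func stripDelimiterIndentation(_ lines: [String], delimiterIndent: String) -> StripResult {
    var result = StripResult()
    for (index, line) in lines.enumerated() {
        if line.starts(with: delimiterIndent) {
            result.lines.append(String(line.dropFirst(delimiterIndent.count)))
        } else {
            result.lines.append(line)       // strip nothing from this line
            result.warnings.append(index)   // but report it
        }
    }
    return result
}

// Same mistaken input as in the example above: first line has 11 spaces.
func indent(_ count: Int) -> String { return String(repeating: " ", count: count) }
let outcome = stripDelimiterIndentation(
    [indent(11) + "<book>", indent(16) + "<author>A</author>", indent(12) + "</book>"],
    delimiterIndent: indent(12)
)
// outcome.lines == ["           <book>", "    <author>A</author>", "</book>"]
// outcome.warnings == [0]
```

Only the mis-indented line comes out wrong, and its index is available for a diagnostic.
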
> Now consider mixed tabs and spaces:
>
> ····xml += """\↵
> ············<book id="bk\(id)">↵
> ················<author>\(author)</author>↵
> ········⇥   ····<title>\(title)</title>↵
> ················<genre>\(genre)</genre>↵
> ················<price>\(price)</price>↵
> ············</book>↵
> ············"""↵
>
> (I'm assuming a tab stop of 4, so mentally adjust that example if you need 
> to.)
>
> With your design, the compiler happily removes the common whitespace and 
> writes code which does this:
>
> ····<book id="bk\(id)">↵
> ········<author>\(author)</author>↵
> ⇥   ····<title>\(title)</title>↵
> ········<genre>\(genre)</genre>↵
> ········<price>\(price)</price>↵
> ····</book>↵
> ····
>
> Once again, every line is affected—including lines after this snippet, since 
> there are spaces after the last newline. Once again, there can be no warning. 
> You'll need to notice the problem and then figure out what happened.
>
> By contrast, with the proposed design, you get this, plus a warning:
>
> <book id="bk\(id)">↵
> ····<author>\(author)</author>↵
> ········⇥   ····<title>\(title)</title>↵
> ····<genre>\(genre)</genre>↵
> ····<price>\(price)</price>↵
> </book>↵
>
> Once again, the only line that's affected is the bad line, *and* you get a 
> warning. In this case, I think the warning could probably point you to the 
> exact *character* that causes the problem.
>
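
Pointing at the exact character could work roughly as in the following Swift sketch. It is only a guess at one way to do it, not a description of the compiler: `firstIndentationMismatch` is a hypothetical helper that compares a line against the closing delimiter's indentation and returns the offset of the first character that differs.

```swift
// Hypothetical helper: returns the character offset at which `line` stops
// matching `delimiterIndent`, or nil if the line begins with the full indent.
func firstIndentationMismatch(in line: String, delimiterIndent: String) -> Int? {
    for (offset, (actual, expected)) in zip(line, delimiterIndent).enumerated() {
        if actual != expected {
            return offset   // e.g. a tab where a space was expected
        }
    }
    // The line ended before the full indentation was seen.
    return line.count < delimiterIndent.count ? line.count : nil
}

// Illustrative input: 8 spaces, then a tab, against a 12-space delimiter indent.
let delimiterIndent = String(repeating: " ", count: 12)
let mixedLine = String(repeating: " ", count: 8) + "\t    <title>T</title>"
let mismatch = firstIndentationMismatch(in: mixedLine, delimiterIndent: delimiterIndent)
// mismatch == 8, the position of the tab
```
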
> Basically, common-whitespace-prefix makes the compiler act like a dumb 
> computer that does what you say, not what you want. The proposed algorithm 
> makes the compiler act like a smart human that notices when you ask for 
> something that doesn't make sense and tells you about the problem.
>
> (Also note how, if you want a trailing newline, you still end up having the 
> delimiter on a separate line aligned with the other text anyway! Stripping 
> the common whitespace prefix in practice still ends up looking exactly the 
> same as what you object to.)
>
> > The only functional limitation that I see is that you can't have leading 
> > whitespace in the interpreted string if you actually want that. That 
> > doesn't seem like a very important use case to me,
>
> We showed an example of this being done in the Rationale section, and it was 
> a *very* plausible example. I don't think it's rare or unnecessary at all; I 
> think it's a really important use case, particularly for generating 
> pretty-printed code or markup.
>
> > but if we think it is important, it could be supported by something like 
> > having a backslash in the leading whitespace at the location where it 
> > should be preserved from.
>
> There are good reasons not to allow backslashing of several different 
> varieties of whitespace, and people were really unhappy with designs that 
> required them to modify every line of text. I think this is a non-starter.
>
> > If we're set on the proposed behavior, have we considered what happens if 
> > the closing delimiter goes beyond the non-whitespace content of the string?
> >
> > let string = """
> >     aa
> >     bb
> >     cc
> >      """
> >
> > Does it strip the non-whitespace characters? Does it strip up to the 
> > non-whitespace characters? Does it generate an error?
>
> It strips nothing and generates a warning on each offending line (but not an 
> error, because whitespace problems are usually minor enough that there's no 
> need to interrupt your debugging to fix some indentation). This was covered 
> in the proposal.
>
> (In an example like this, where every line is less indented than the 
> delimiter, we might emit a different warning suggesting that the delimiter's 
> indentation is wrong. That's a QoI issue, though, not the kind of thing we 
> need to cover in a proposal.)
>
> --
> Brent Royal-Gordon
> Architechies
>

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
