[ string literals ] Extending the escape language (was: String literals: some principles)

Brian Goetz Tue, 07 May 2019 15:15:44 -0700

> TL;DR: Good framework; must also account for the
> rectangle extraction rule (RER).  A unified escape
> sublanguage (ESL) is highly desirable, and I propose
> adding <\ > and <\ LT WS*> as escapes for space
> and for null string.  The existing \ char is OK, and
> should be "fattened" as a separate feature.  I note
> some issues with <\ u X X X X>.


I agree in general with the desire to extend the ESL with some whitespace 
sequences, though I take some issue with the syntax of \<nl> and \<space>.  I 
also have some alternate ideas regarding \uxxxx.

First, unicode escapes.  Alex pointed out offline that we had worked our way 
into a linear thinking trap (again).  In the first round, because we were 
focused on raw strings, we turned off \uxxxx processing in the body of a raw 
string, which raised the question of “how do we turn it back on.”  He also 
noted that, while we use the same escape character for both, they occupy very 
different places in the language; the ESL is purely about string literals, 
whereas \uxxxx is purely a lexing concern.  

His recommendation, which (now that it's been explained to me) I strongly agree 
with, is: let’s not have this feature touch unicode processing at all.  Let’s 
just leave unicode processing as is, using \uxxxx, whether in code, SLSLs, 
MLSLs, and any future “raw” SLs.  The similarity between \n and \uxxxx is purely 
coincidental. And if we really want the characters “\u0000” in a string 
literal, well, we know how to escape the \.  
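
To make the distinction concrete, here is a minimal sketch using today's ordinary string literals only. The class and method names are made up for illustration. It shows that a unicode escape is translated before the lexer ever sees a string literal, and that escaping the backslash keeps the six characters as text.

```java
public class UnicodeEscapeDemo {
    // The unicode escape below is translated before lexing, so the
    // compiler sees a string literal holding the single letter A.
    static String translated() {
        return "\u0041";
    }

    // Escaping the backslash keeps all six characters as literal text:
    // a backslash, the letter u, and four digits.
    static String literalText() {
        return "\\u0041";
    }

    public static void main(String[] args) {
        System.out.println(translated());           // A
        System.out.println(translated().length());  // 1
        System.out.println(literalText().length()); // 6
    }
}
```

The first method never involves the ESL at all; only the second one uses a string escape (the doubled backslash).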

Which brings us to \<eol> and \<space>.  My main complaint here is that I am 
really uncomfortable using \<space> for “literal space”, because at the end of 
the line, one cannot differentiate between \<eol> and \<space> when reading the 
code.  Alternatives include \_, or \s, or \., or … many others.
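
A rough sketch of the readability problem, with hypothetical source lines modeled as plain Java strings (this is not a parser for the proposed syntax, and the helper name is an assumption). Stripping trailing whitespace stands in for what a human reader actually sees on screen.

```java
public class TrailingEscapeDemo {
    // Hypothetical source lines, written here as ordinary Java strings.
    // First: backslash immediately followed by end of line.
    static final String CONTINUATION = "hello \\";
    // Second: backslash, then a space, then end of line.
    static final String LITERAL_SPACE = "hello \\ ";
    // Third: a visible spelling for the space escape (one of the alternatives).
    static final String VISIBLE_SPACE = "hello \\s";

    // Models what a reader sees: trailing whitespace is invisible.
    static String asSeenByReader(String sourceLine) {
        return sourceLine.replaceAll("\\s+$", "");
    }

    public static void main(String[] args) {
        // The two lines differ as text...
        System.out.println(CONTINUATION.equals(LITERAL_SPACE)); // false
        // ...but look identical once trailing whitespace is ignored.
        System.out.println(
            asSeenByReader(CONTINUATION).equals(asSeenByReader(LITERAL_SPACE))); // true
        // A visible spelling stays distinguishable.
        System.out.println(
            asSeenByReader(CONTINUATION).equals(asSeenByReader(VISIBLE_SPACE))); // false
    }
}
```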
