The way that Lua does raw strings is also fairly nifty. Check out
http://www.lua.org/manual/5.2/manual.html, section 3.1, or, in short:
- Strings can be delimited by [===[, with any number of equals signs.
The corresponding closing delimiter must match the original number of
equals signs.
- No escaping is done.
- Any kind of end-of-line sequence (i.e. \r and \n in any order) is
converted to just a newline.
- It can run for multiple lines.
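
To make the rules above concrete, here is a minimal sketch in Python of how a lexer might scan such a string. The function name `read_long_string` is made up for illustration and is not taken from the Lua sources. It also applies one rule from the same manual section that is not in the list above: a newline directly after the opening bracket is skipped.

```python
import re

def read_long_string(src, start=0):
    """Scan a Lua-style long bracket string beginning at src[start].

    Returns (value, index_after_closer); raises ValueError if malformed.
    Hypothetical helper, written only to illustrate the rules.
    """
    opener = re.compile(r"\[(=*)\[").match(src, start)
    if opener is None:
        raise ValueError("no opening long bracket")
    level = opener.group(1)         # the run of '=' signs, possibly empty
    closer = "]" + level + "]"      # closer must repeat the same run
    body_start = opener.end()
    body_end = src.find(closer, body_start)
    if body_end == -1:
        raise ValueError("unterminated long string")
    body = src[body_start:body_end]
    # No escape processing at all; only end-of-line sequences are
    # normalised to a single newline.
    body = re.sub(r"\r\n|\n\r|\r|\n", "\n", body)
    # A newline directly after the opening bracket is not part of the string.
    if body.startswith("\n"):
        body = body[1:]
    return body, body_end + len(closer)
```

Because the closer has to repeat the opener's equals signs, a `]]` inside a `[==[ ... ]==]` string is just ordinary text.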
--Andrew D
On Thu, Sep 19, 2013 at 10:28 PM, Kevin Cantu m...@kevincantu.org wrote:
I think designing good traits to support all these text implementations is
far more important than whatever hungarian notation is preferred for
literals.
Kevin
On Thu, Sep 19, 2013 at 2:50 PM, Martin DeMello
martindeme...@gmail.com wrote:
Ah, good point. You could fix it by having a very small whitelist of
acceptable delimiters, but that probably takes it into overcomplex
territory.
martin
On Thu, Sep 19, 2013 at 2:46 PM, Kevin Ballard ke...@sb.org wrote:
As I just responded to Masklinn, this is ambiguous. How do you lex `do
R{foo()}`?
-Kevin
On Sep 19, 2013, at 2:41 PM, Martin DeMello martindeme...@gmail.com
wrote:
Yes, I figured R followed by a non-alphabetical character could serve
the same purpose as ruby's %char.
martin
On Thu, Sep 19, 2013 at 2:37 PM, Kevin Ballard ke...@sb.org wrote:
I didn't look at Ruby's syntax, but what you just described sounds a
little too free-form to me. I believe Ruby at least requires a % as part of
the syntax, e.g. %q{test}. But I don't think %R{test} is a good idea for
rust, as it would conflict with the % operator. I don't think other
punctuation would work well either.
-Kevin
On Sep 19, 2013, at 2:10 PM, Martin DeMello martindeme...@gmail.com
wrote:
How complicated would it be to use R but with arbitrary paired
delimiters (the way, for instance, ruby does it)? It's very handy to
pick a delimiter you know does not appear in the string, e.g. if you
had a string containing ')' you could use R{this is a string with a )
in it} or R|this is a string with a ) in it|.
martin
On Thu, Sep 19, 2013 at 1:36 PM, Kevin Ballard ke...@sb.org wrote:
One feature common to many programming languages that Rust lacks is
raw string literals. Specifically, these are string literals that don't
interpret backslash-escapes. There are three obvious applications at the
moment: regular expressions, windows file paths, and format!() strings that
want to embed { and } chars. I'm sure there are more as well, such as large
string literals that contain things like HTML text.
I took a look at 3 programming languages to see what solutions they
had: D, C++11, and Python. I've reproduced their syntax below, plus one
more custom syntax, along with pros and cons. I'm hoping we can come up with
a syntax that makes sense for Rust.
## Python syntax:
Python supports an "r" or "R" prefix on any string literal (both
"short" strings, delimited with a single quote character, or "long" strings,
delimited with 3 quotes). The "r" or "R" prefix denotes a "raw string", and
has the effect of disabling backslash-escapes within the string. For the
most part. It actually gets a bit weird: if a sequence of backslashes of an
odd length occurs prior to a quote (of the appropriate quote type for the
string), then the quote is considered to be escaped, but the backslashes
are left in the string. This means r"foo\"" evaluates to the string
`foo\"`, and similarly r"foo\\\"" is `foo\\\"`, but r"foo\\" is merely the
string `foo\\`.
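
The backslash behaviour described above can be checked directly in Python. The snippet below is only a small demonstration: the variable names are arbitrary, and each raw literal is compared against an ordinary literal spelling out the same characters.

```python
# Raw literals from the description, each next to an ordinary literal
# that spells out the same characters explicitly.
a = r"foo\""       # odd run of backslashes before the quote
b = r"foo\\\""     # also odd (three backslashes)
c = r"foo\\"       # even run: the quote after it closes the string

assert a == 'foo\\"'        # backslash kept, quote kept
assert b == 'foo\\\\\\"'    # three backslashes kept, quote kept
assert c == 'foo\\\\'       # two backslashes, no quote in the value

# A raw string therefore cannot end in an odd number of backslashes:
# the source text r"foo\" is rejected as an unterminated literal.
try:
    compile('r"foo\\"', "<example>", "eval")
    raw_cannot_end_in_odd_backslash = False
except SyntaxError:
    raw_cannot_end_in_odd_backslash = True
```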
Pros:
* Simple syntax
* Allows for embedding the closing quote character in the raw string
Cons:
* Handling of backslashes is very bizarre, and the closing quote
character can only be embedded if you want to have a backslash before it.
## C++11 syntax:
C++11 allows for raw strings using a sequence of the form R"seq(raw
text)seq". In this construct, `seq` is any sequence of (zero or more)
characters except for: space, (, ), \, \t, \v, \n, \r. The simplest form
looks like R"(raw text)", which allows for anything in the raw text except
for the sequence `)"`. The addition of the delimiter sequence allows for
constructing a raw string containing any sequence at all (as the delimiter
sequence can be adjusted based on the represented text).
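
As a rough sketch of how the delimiter sequence works, the Python function below scans a C++11-style raw literal. The name `read_cpp_raw_string` is hypothetical, the forbidden-character set is simply the list given above, and other rules of the real standard (such as the 16-character limit on the delimiter) are ignored.

```python
# Characters the description above excludes from the delimiter sequence.
FORBIDDEN = set(' ()\\\t\v\n\r')

def read_cpp_raw_string(src, start=0):
    """Scan R"seq(raw text)seq" beginning at src[start].

    Returns (raw_text, index_after_literal); raises ValueError if malformed.
    Illustrative only, not a conforming C++ lexer.
    """
    if not src.startswith('R"', start):
        raise ValueError("not a raw string literal")
    paren = src.find("(", start + 2)
    if paren == -1:
        raise ValueError("missing '(' after delimiter")
    seq = src[start + 2:paren]          # may be empty
    if any(ch in FORBIDDEN for ch in seq):
        raise ValueError("invalid delimiter sequence")
    closer = ")" + seq + '"'            # only this exact text ends the literal
    end = src.find(closer, paren + 1)
    if end == -1:
        raise ValueError("unterminated raw string")
    return src[paren + 1:end], end + len(closer)
```

With an empty delimiter the text cannot contain `)"`; choosing a delimiter such as `x` moves the restriction to `)x"`, which is why any text can be represented by picking a suitable `seq`.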
Pros:
* Allows for embedding any character at all (representable in the
source file encoding), including the closing quote.
* Reasonably straightforward
Cons:
* Syntax is slightly complicated
## D syntax:
D supports three different forms of raw strings. The first two are
similar, being r"raw text" and `raw text`. Besides the choice of
delimiters, they behave identically, in that the raw text may contain
anything except for the appropriate quote character. The third syntax is a
slightly more complicated form of C++11's syntax, and is called a delimited
string. It takes two forms.
The first looks like q"(raw text)" where the ( may