The way that Lua does raw strings is also fairly nifty. Check out http://www.lua.org/manual/5.2/manual.html, section 3.1, or, in short:
- Strings can be delimited by "[===[", with any number of equals signs (including none). The corresponding closing delimiter must match the original number of equals signs.
- No escaping is done.
- Any kind of end-of-line sequence ("\r", "\n", "\r\n", or "\n\r") is converted to just a newline.
- It can run for multiple lines.

--Andrew D

On Thu, Sep 19, 2013 at 10:28 PM, Kevin Cantu <m...@kevincantu.org> wrote:

> I think designing good traits to support all these text implementations is
> far more important than whatever hungarian notation is preferred for
> literals.
>
>
> Kevin
>
>
> On Thu, Sep 19, 2013 at 2:50 PM, Martin DeMello <martindeme...@gmail.com> wrote:
>
>> Ah, good point. You could fix it by having a very small whitelist of
>> acceptable delimiters, but that probably takes it into overcomplex
>> territory.
>>
>> martin
>>
>> On Thu, Sep 19, 2013 at 2:46 PM, Kevin Ballard <ke...@sb.org> wrote:
>> > As I just responded to Masklinn, this is ambiguous. How do you lex `do R{foo()}`?
>> >
>> > -Kevin
>> >
>> > On Sep 19, 2013, at 2:41 PM, Martin DeMello <martindeme...@gmail.com> wrote:
>> >
>> >> Yes, I figured R followed by a non-alphabetical character could serve
>> >> the same purpose as ruby's %<char>.
>> >>
>> >> martin
>> >>
>> >> On Thu, Sep 19, 2013 at 2:37 PM, Kevin Ballard <ke...@sb.org> wrote:
>> >>> I didn't look at Ruby's syntax, but what you just described sounds a little too free-form to me. I believe Ruby at least requires a % as part of the syntax, e.g. %q{test}. But I don't think %R{test} is a good idea for rust, as it would conflict with the % operator. I don't think other punctuation would work well either.
>> >>>
>> >>> -Kevin
>> >>>
>> >>> On Sep 19, 2013, at 2:10 PM, Martin DeMello <martindeme...@gmail.com> wrote:
>> >>>
>> >>>> How complicated would it be to use R"" but with arbitrary paired
>> >>>> delimiters (the way, for instance, ruby does it)? It's very handy to
>> >>>> pick a delimiter you know does not appear in the string, e.g. if you
>> >>>> had a string containing ')' you could use R{this is a string with a )
>> >>>> in it} or R|this is a string with a ) in it|.
>> >>>>
>> >>>> martin
>> >>>>
>> >>>> On Thu, Sep 19, 2013 at 1:36 PM, Kevin Ballard <ke...@sb.org> wrote:
>> >>>>> One feature common to many programming languages that Rust lacks is "raw" string literals. Specifically, these are string literals that don't interpret backslash-escapes. There are three obvious applications at the moment: regular expressions, windows file paths, and format!() strings that want to embed { and } chars. I'm sure there are more as well, such as large string literals that contain things like HTML text.
>> >>>>>
>> >>>>> I took a look at 3 programming languages to see what solutions they had: D, C++11, and Python. I've reproduced their syntax below, plus one more custom syntax, along with pros & cons. I'm hoping we can come up with a syntax that makes sense for Rust.
>> >>>>>
>> >>>>> ## Python syntax:
>> >>>>>
>> >>>>> Python supports an "r" or "R" prefix on any string literal (both "short" strings, delimited with a single quote, or "long" strings, delimited with 3 quotes). The "r" or "R" prefix denotes a "raw string", and has the effect of disabling backslash-escapes within the string. For the most part. It actually gets a bit weird: if a sequence of backslashes of an odd length occurs prior to a quote (of the appropriate quote type for the string), then the quote is considered to be escaped, but the backslashes are left in the string. This means r"foo\"" evaluates to the string `foo\"`, and similarly r"foo\\\"" is `foo\\\"`, but r"foo\\" is merely the string `foo\\`.
>> >>>>>
>> >>>>> Pros:
>> >>>>> * Simple syntax
>> >>>>> * Allows for embedding the closing quote character in the raw string
>> >>>>>
>> >>>>> Cons:
>> >>>>> * Handling of backslashes is very bizarre, and the closing quote character can only be embedded if you want to have a backslash before it.
>> >>>>>
>> >>>>> ## C++11 syntax:
>> >>>>>
>> >>>>> C++11 allows for raw strings using a sequence of the form R"seq(raw text)seq". In this construct, `seq` is any sequence of (zero or more) characters except for: space, (, ), \, \t, \v, \n, \r. The simplest form looks like R"(raw text)", which allows for anything in the raw text except for the sequence `)"`. The addition of the delimiter sequence allows for constructing a raw string containing any sequence at all (as the delimiter sequence can be adjusted based on the represented text).
>> >>>>>
>> >>>>> Pros:
>> >>>>> * Allows for embedding any character at all (representable in the source file encoding), including the closing quote.
>> >>>>> * Reasonably straightforward
>> >>>>>
>> >>>>> Cons:
>> >>>>> * Syntax is slightly complicated
>> >>>>>
>> >>>>> ## D syntax:
>> >>>>>
>> >>>>> D supports three different forms of raw strings. The first two are similar, being r"raw text" and `raw text`. Besides the choice of delimiters, they behave identically, in that the raw text may contain anything except for the appropriate quote character. The third syntax is a slightly more complicated form of C++11's syntax, and is called a delimited string. It takes two forms.
>> >>>>>
>> >>>>> The first looks like q"(raw text)" where the ( may be any non-identifier non-whitespace character. If the character is one of [(<{ then it is a "nesting delimiter", and the close delimiter must be the matching ])>} character, otherwise the close delimiter is the same as the open. Furthermore, nesting delimiters do exactly what their name says: they nest. If the nesting delimiter is (), then any ( in the raw text must be balanced with a ) in the raw text. In other words, q"(foo(bar))" evaluates to "foo(bar)", but q"(foo(bar)" and q"(foobar))" are both illegal.
>> >>>>>
>> >>>>> The second uses any identifier as the delimiter. In this case, the identifier must immediately be followed by a newline, and in order to close the string, the close delimiter must be preceded by a newline. This looks like
>> >>>>>
>> >>>>> q"delim
>> >>>>> this is some raw text
>> >>>>> delim"
>> >>>>>
>> >>>>> It's essentially a heredoc. Note that the first newline is not part of the string, but the final newline is, so this evaluates to "this is some raw text\n".
>> >>>>>
>> >>>>> Pros:
>> >>>>> * Flexible
>> >>>>> * Allows for constructing a raw string that contains any desired sequence of characters (representable in the source file's encoding)
>> >>>>>
>> >>>>> Cons:
>> >>>>> * Overly complicated
>> >>>>>
>> >>>>> ## Custom syntax
>> >>>>>
>> >>>>> There's another approach that none of these three languages take, which is to merely allow for doubling up the quote character in order to embed a quote. This would look like R"raw string literal ""with embedded quotes"".", which becomes `raw string literal "with embedded quotes"`.
>> >>>>>
>> >>>>> Pros:
>> >>>>> * Very simple
>> >>>>> * Allows for embedding the close quote character, and therefore, any character (representable in the source file encoding)
>> >>>>>
>> >>>>> Cons:
>> >>>>> * Slightly odd to read
>> >>>>>
>> >>>>> ## Conclusion
>> >>>>>
>> >>>>> Of the three existing syntaxes examined here, I think C++11's is the best. It ties with D's syntax for being the most powerful, but is simpler than D's. The custom syntax is just as powerful though. The benefit of the C++11 syntax over the custom syntax is it's slightly easier to read the C++11 syntax, as the raw text has a 1-to-one mapping with the resulting string. The custom syntax is a bit more confusing to read, especially if you want to add multiple quotes. As a pathological case, let's try representing a Python triple-quoted docstring using both syntaxes:
>> >>>>>
>> >>>>> C++11: R"("""this is a python docstring""")"
>> >>>>> Custom: R"""""""this is a python docstring"""""""
>> >>>>>
>> >>>>> Based on this examination, I'm leaning towards saying Rust should support C++11's raw string literal syntax.
>> >>>>>
>> >>>>> I welcome any comments, criticisms, or suggestions.
>> >>>>>
>> >>>>> -Kevin
>> >>>>> _______________________________________________
>> >>>>> Rust-dev mailing list
>> >>>>> Rust-dev@mozilla.org
>> >>>>> https://mail.mozilla.org/listinfo/rust-dev
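To make the Lua rules summarized at the top of this message concrete, here is a rough Python sketch of how a lexer might read one long-bracket string. It is only an illustration under my own assumptions, not Lua's actual lexer: the function name `read_long_string` and its return shape are made up for this example. It also applies one rule from the Lua manual that the short summary leaves out, namely that a newline immediately after the opening bracket is dropped.

```python
import re


def read_long_string(src, pos=0):
    """Scan a Lua-style long-bracket string starting at src[pos].

    Hypothetical helper for illustration only.
    Returns (value, end_pos), where end_pos is the index just past the
    closing bracket. Raises ValueError on malformed input.
    """
    # Opening delimiter: "[", zero or more "=", then "[".
    opener = re.compile(r"\[(=*)\[").match(src, pos)
    if opener is None:
        raise ValueError("no opening long bracket at position %d" % pos)

    # The closing delimiter must use the same number of "=" signs.
    closer = "]" + opener.group(1) + "]"
    start = opener.end()
    end = src.find(closer, start)
    if end == -1:
        raise ValueError("unterminated long string")

    # No escape processing at all: the body is taken verbatim,
    # except that every end-of-line sequence becomes a single "\n".
    body = re.sub(r"\r\n|\n\r|\r|\n", "\n", src[start:end])

    # Per the Lua manual, a newline directly after the opening
    # bracket is not part of the string.
    if body.startswith("\n"):
        body = body[1:]

    return body, end + len(closer)
```

The nice property for a lexer is that the closing delimiter is fully determined by the opening one, so there is none of the ambiguity that came up with `R{foo()}`: the scanner just searches for one fixed closing sequence.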