… response inline

> On May 2, 2016, at 2:23 PM, John Holdsworth <[email protected]> wrote:
>
>>> I'm having trouble getting the `e` modifier to work as advertised, at least
>>> for the sequence `\\`. For example, `print(e"\\\\")` prints two
>>> backslashes, and `print(e"\\\")` seems to try to escape the string literal.
>>> I'm currently envisioning `e` as disabling *all* backslash escapes, so
>>> these behaviors wouldn't be appropriate. It also looks like interpolation
>>> is still enabled in `e` strings.
>>>
>>> Since other things like `print(e"\w+")` work just fine, I'm guessing this
>>> is a bug in the proposal's sketches (not being clear enough about the
>>> expected behavior), not your code.
>>>
>>> I've written a gist with some tests to show how I expect things to work:
>>>
>>> https://gist.github.com/brentdax/be3c032bc7e0c101d7ba8b72cd1a692e
>>
>> The problem here is that I’ve not implemented unescaped literals fully as it
>> would require changes outside the lexer.
>> This is because the string is first lexed and tokenised by one piece of code,
>> Lexer::lexStringLiteral, but later
>> on in the code generation phase it generates the actual literal in a
>> function Lexer::getEncodedStringSegment.
>> This is passed the same string from the source file but does not know what
>> modifiers should be applied. As a result
>> normal escapes are still processed. All the “e” flag does is silence the
>> error for invalid escapes during tokenising.
>
> Lexer just lays ropes around certain areas to tell what's where. Sometimes
> this is not enough for extra semantics. This is the reason why I went down
> the path of a custom string_multiline_literal token. It looks like you might
> want to consider that path too. If you do, you might consider the merits of
> suggesting that half the work be put in place now, allowing both our
> experimentations (and other more sophisticated ones) to lean on it, as an
> alternative to just directly adding extra conditional code in the default
> lexer code.

Not sure what you mean here. It’s the modifiers that have a greater effect on
lexing, not whether a string is multi-line. IMO it’s probably best to avoid
creating a separate string_multiline_literal token, as that would require
visiting the grammar everywhere a string could occur.

If you want to see what I mean, I’ve committed a change which adds 3 extra
bits to the Token structure to carry the modifiers from the lexing stage
through to code generation, so non-escaping strings can finally be handled
correctly.

https://github.com/apple/swift/pull/2275

New toolchain:
http://johnholdsworth.com/swift-LOCAL-2016-05-04-a-osx.tar.gz

The following now holds:

    assert( e"\w\d+\(author)\n" == "\\w\\d+\\(author)\\n" );
    assert( r"\w\d+\(author)\n" == "\\w\\d+\(author)\n" ); // previous implementation

John
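
For anyone following along, the idea of carrying modifier bits on the token
can be sketched roughly as below. This is a minimal standalone C++
illustration and not the code in the pull request: the `Token` struct, the
`ModNoEscapes` flag, and the simplified `lexStringLiteral` and `encodeSegment`
functions are hypothetical stand-ins for the real lexer entry points named
above, and only the `e` prefix and a handful of escapes are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical flag values. The change described above reserves 3 bits on the
// token; the name and meaning chosen here are assumptions for illustration.
enum StringModifier : unsigned {
  ModNone = 0,
  ModNoEscapes = 1u << 0 // e"...": backslashes are kept literally
};

// Simplified stand-in for a lexer token: the literal's body plus a 3-bit
// field recording which modifiers the tokenising stage saw.
struct Token {
  std::string Text;
  unsigned Modifiers : 3;
};

// Stage 1 (tokenising): note an optional `e` prefix, then strip the quotes.
// Assumes Source is a well-formed literal such as "abc" or e"abc".
Token lexStringLiteral(const std::string &Source) {
  Token Tok;
  Tok.Modifiers = ModNone;
  std::size_t Start = 0;
  if (Source.size() >= 2 && Source[0] == 'e' && Source[1] == '"') {
    Tok.Modifiers = ModNoEscapes;
    Start = 1;
  }
  assert(Source.size() >= Start + 2 && "expects a quoted literal");
  Tok.Text = Source.substr(Start + 1, Source.size() - Start - 2);
  return Tok;
}

// Stage 2 (code generation): because the modifiers travel with the token,
// this stage can decide whether to process escapes at all.
std::string encodeSegment(const Token &Tok) {
  if (Tok.Modifiers & ModNoEscapes)
    return Tok.Text; // every character, including '\', is kept verbatim

  std::string Out;
  for (std::size_t I = 0; I < Tok.Text.size(); ++I) {
    char C = Tok.Text[I];
    if (C == '\\' && I + 1 < Tok.Text.size()) {
      char Next = Tok.Text[++I];
      switch (Next) {
      case 'n': Out += '\n'; break;
      case 't': Out += '\t'; break;
      default: Out += Next; break; // simplified: a real lexer would diagnose
      }
    } else {
      Out += C;
    }
  }
  return Out;
}
```

Without the `Modifiers` field, `encodeSegment` would only see the raw text and
would have to process escapes unconditionally, which is the behaviour
described in the quoted message.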
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
