On Saturday, 16 February 2013 at 20:35:48 UTC, H. S. Teoh wrote:
On Sat, Feb 16, 2013 at 09:22:07PM +0100, MrAppleseed wrote:
Hey all,

I'm currently trying to port a small toy language, which I wrote a while back in Java, to D. However, a main part of my lexical analyzer was regular expression matching, which I've been having issues with in D. The regular expression in question is as follows:

[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]

This works well enough in Java to produce a series of tokens that I could then pass to my parser. But when I tried to port this to D, I almost always got an error when using brackets, braces, or parentheses. I've tried several different combinations, looked through the std.regex library reference, Googled this issue, tested my regular expression in several online regex testers (primarily http://regexpal.com/ and http://regexhelper.com/), and even looked it up in the book "The D Programming Language" (good book, by the way), yet I still can't get it working right.
Here's the code I've been using:

...
auto tempCont = cast(char[])read(location, fileSize);
string contents = cast(string)tempCont;
auto reg = regex("[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]");

The problem is that you're using D's double-quoted string literal, which adds another level of interpretation to the \'s. What you should do is to use the backtick string literal, which does *not* interpret backslashes:

auto reg = regex(`[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]`);

If you have trouble typing `, you can also use r"...", which means the
same thing.
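
To make the difference concrete, here is a minimal sketch using only a short fragment of the pattern. The names doubleQuoted, backticked and rPrefixed are made up for illustration. It shows only how the compiler reads each kind of literal, not whether std.regex will accept the resulting pattern:

```d
import std.stdio : writeln;

// All three literals are typed with the same source characters: \\]
enum doubleQuoted = "\\]";   // escapes are processed: one backslash, then ]
enum backticked   = `\\]`;   // WYSIWYG: both backslashes are kept
enum rPrefixed    = r"\\]";  // WYSIWYG as well, same contents as the backtick form

void main()
{
    // The double-quoted form has already lost one backslash
    // before any regex engine gets to see it.
    assert(doubleQuoted.length == 2);
    assert(backticked.length == 3);
    assert(backticked == rPrefixed);

    writeln(doubleQuoted); // prints \]
    writeln(backticked);   // prints \\]
    writeln(rPrefixed);    // prints \\]
}
```

So with backticks or r"...", what you type is exactly what the regex engine receives.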

Hope this helps.


--T

Thanks for the quick reply!

I replaced the double-quotes with backticks, compiled it with no problems, but on the first run I got a similar error:

std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): invalid escape sequence
Pattern with error: `[ 0-9a-zA-Z.*=+-;()\"` <--HERE-- `\'[]<>,{}^#/\\]`

After removing the invalid escape sequence, I compiled it, once again with no problems, and attempted to run it, but this time I got a different exception:

std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): wrong CodepointSet
Pattern with error: `[ 0-9a-zA-Z.*=+-;()"'[]` <--HERE-- `<>,{}^#/\\]`

(Entire error here: http://pastebin.com/Su9XzbXW)
