On Saturday, 16 February 2013 at 20:35:48 UTC, H. S. Teoh wrote:
On Sat, Feb 16, 2013 at 09:22:07PM +0100, MrAppleseed wrote:
Hey all,
I'm currently trying to port a small toy language, which I invented
a while back in Java, to D. However, a main part of my lexical
analyzer was regular expression matching, which I've been having
issues with in D. The regular expression in question is as follows:
[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]
This works well enough in Java to produce a series of tokens that I
could then pass to my parser. But when I tried to port this into D,
I almost always get an error when using brackets, braces, or
parentheses. I've tried several different combinations, have looked
through the std.regex library reference, have Googled this issue,
have tested my regular expression in several online regex testers
(primarily http://regexpal.com/ and http://regexhelper.com/), and
have even looked it up in the book "The D Programming Language"
(good book, by the way), yet I still can't get it working right.
Here's the code I've been using:
...
auto tempCont = cast(char[])read(location, fileSize);
string contents = cast(string)tempCont;
auto reg = regex("[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]");
The problem is that you're using D's double-quoted string literal,
which adds another level of interpretation to the \'s. What you
should do is to use the backtick string literal, which does *not*
interpret backslashes:
auto reg = regex(`[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]`);
If you have trouble typing `, you can also use r"...", which means
the same thing.
Hope this helps.
--T
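To illustrate the point about string literals made in the reply above, here is a minimal sketch of how D treats backslashes in the three literal forms. The variable names are made up for illustration, and the example deliberately leaves std.regex out so that it only shows what the compiler does to the string before any regex engine sees it.

```d
import std.stdio;

void main()
{
    // Double-quoted literal: the compiler processes backslash escapes,
    // so two backslashes in the source become one character in the string.
    string cooked = "\\";

    // Backtick and r"..." literals are WYSIWYG: backslashes are kept verbatim.
    string rawBacktick = `\\`;
    string rawPrefixed = r"\\";

    writeln(cooked.length);              // 1
    writeln(rawBacktick.length);         // 2
    writeln(rawBacktick == rawPrefixed); // true

    // A backslash followed by [ is not a defined escape in a double-quoted
    // literal, so that form is rejected at compile time. The backtick form
    // simply holds a backslash followed by a bracket.
    string bracket = `\[`;
    writeln(bracket.length);             // 2
}
```

Whatever ends up in the string is what std.regex receives as the pattern, so any escapes the regex engine needs have to survive this step intact.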
Thanks for the quick reply!
I replaced the double-quotes with backticks, compiled it with no
problems, but on the first run I got a similar error:
std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): invalid escape sequence
Pattern with error: `[ 0-9a-zA-Z.*=+-;()\"` <--HERE-- `\'[]<>,{}^#/\\]`
After removing the invalid escape sequence, I compiled it, once
again with no problems, and attempted to run it, but I got a
similar error:
std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): wrong CodepointSet
Pattern with error: `[ 0-9a-zA-Z.*=+-;()"'[]` <--HERE-- `<>,{}^#/\\]`
(Entire error here: http://pastebin.com/Su9XzbXW)