Hey all,

I'm currently trying to port a small toy language, which I implemented in Java a while back, to D. However, a main part of my lexical analyzer is regular expression matching, which I've been having issues with in D. The regular expression in question is as follows:

[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]

This works well enough in Java to produce a series of tokens that I can then pass to my parser. But when I try to port it to D, I almost always get an error when the pattern contains brackets, braces, or parentheses. I've tried several different combinations, looked through the std.regex library reference, Googled the issue, tested my regular expression in several online regex testers (primarily http://regexpal.com/ and http://regexhelper.com/), and even looked it up in the book "The D Programming Language" (good book, by the way), yet I still can't get it working. Here's the code I've been using:

...
auto tempCont = cast(char[])read(location, fileSize);
string contents = cast(string)tempCont;
auto reg = regex("[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]");
auto m = match(contents, reg);
auto token = m.captures;
...

When I try to compile the code above, I get:
parser.d(64): Error: undefined escape sequence \[
parser.d(64): Error: undefined escape sequence \]
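
To narrow this down: these two errors appear to come from D's string-literal lexer rather than from std.regex, since a double-quoted string only accepts a fixed set of escape sequences. Below is a minimal sketch, assuming the standard D string forms (double-quoted, r"..." and backtick WYSIWYG strings), of how the same two characters can be spelled. The constant names are made up for illustration and are not from my parser.

```d
import std.stdio;

// Double-quoted: the compiler interprets escapes, so the backslash must be
// doubled to get a literal backslash followed by '['.
enum doubleQuoted = "\\[";
// r"..." and backtick strings are WYSIWYG: backslashes are kept verbatim.
enum rawQuoted = r"\[";
enum backtickQuoted = `\[`;

void main()
{
    // All three constants hold the same two characters: '\' and '['.
    writeln(doubleQuoted == rawQuoted && rawQuoted == backtickQuoted);
    writeln(doubleQuoted.length);
}
```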

When I remove the escaped brackets (turning my regex into
"[ 0-9a-zA-Z.*=+-;()\"\'[]<>,{}^#/\\]"), it compiles and links without issues. However, on the first run I get the following error (I cut it short; the full error is pasted at http://pastebin.com/vjMhkx4N):

std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): wrong CodepointSet Pattern with error: `[ 0-9a-zA-Z.*=+-;()"'[]` <--HERE-- `<>,{}^#/\]`
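
The `<--HERE--` marker sits right after the unescaped `[]`, which suggests that std.regex treats a bare `[` inside a character class as something other than a literal bracket. Below is a minimal sketch of the variant that keeps the brackets escaped, assuming a backtick (WYSIWYG) string so that `\[`, `\]` and `\\` survive the D lexer. `firstToken` is a hypothetical helper name, not something from my parser, and I have not verified this against every input my lexer sees.

```d
import std.regex;
import std.stdio;

// Hypothetical helper: returns the first character matched by the class,
// or an empty string when nothing in the input matches.
string firstToken(string input)
{
    // Same character class as above, written as a backtick string so the
    // regex escapes reach std.regex unchanged; " and ' need no escaping here.
    auto reg = regex(`[ 0-9a-zA-Z.*=+-;()"'\[\]<>,{}^#/\\]`);
    auto m = match(input, reg);
    return m.empty ? "" : m.hit;
}

void main()
{
    writeln(firstToken("[x]"));
}
```

Note that `+-;` is still read as a range here (from `+` to `;`), exactly as in the original pattern.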

I'm very confused about what to do, and much of the information in the library reference seems to contradict what I'm doing. Any help would be greatly appreciated!

Thanks!
~Mr. Appleseed

Additional information:

OS/Compiler information:
Ubuntu 12.10 x64
DMD64 D Compiler v2.061

Compiled with:
dmd main.d parser.d




