On Friday, 24 April 2015 at 18:28:16 UTC, Guillaume wrote:
Hello, I'm trying to make a regex comparison with D, based off of this article: https://swtch.com/~rsc/regexp/regexp1.html

I've written my code like so:

import std.stdio, std.regex;

void main(string argv[]) {

        string m = argv[1];
auto p = ctRegex!("a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        if (match(m, p)) {
                writeln("match");
        } else {
                writeln("no match");
        }

}

And the compiler goes into swap. Doing it at runtime is no better. I was under the impression that this particular regex was used for showcasing the Thompson NFA which D claims to be using.

The regex "a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" can be simplified to "a{30,60}" (if i counted correctly).

The regex "a{30,60}" works fine.

[Speculation]
I don't have a good understanding of how D's regex engine work but I am guessing that it does not do any simplification of the regex input causing it to generate larger engines for each additional ? symbol. Thus needing more memory. Eventually as in this case the compiler runs out of memory.




Reply via email to