Re: Poor regex performance?
On Thursday, 4 April 2019 at 10:31:43 UTC, Julian wrote:
> On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole wrote:
>> If you need performance, use ldc rather than dmd (which I assume you are using). LLVM's code optimizers are many times better than dmd's.
>
> Thanks! I already had dmd installed from a brief look at D a long time ago, so I missed the details at https://dlang.org/download.html
>
> ldc2 -O3 does a lot better, but the result is still 30x slower without PCRE.

Try:

    ldc2 -O3 -release -flto=thin -defaultlib=phobos2-ldc-lto,druntime-ldc-lto -enable-inlining

This will improve inlining and optimization across the runtime library boundaries, which can help in certain types of code.
Re: Poor regex performance?
On Thu, Apr 04, 2019 at 09:53:06AM +0000, Julian via Digitalmars-d-learn wrote:
[...]
> auto re = ctRegex!(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");
[...]

ctRegex is a crock; use regex() instead and it might actually work better.

T

-- 
Stop staring at me like that! It's offens... no, you'll hurt your eyes!
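For anyone wanting to try the suggestion, here is a minimal sketch of the same pattern compiled at run time with regex() and applied with matchFirst. The helper name extractSenders and the sample log line are made up for illustration, and whether this is actually faster than ctRegex on a given log would have to be measured.

```d
import std.regex : regex, matchFirst;
import std.stdio : writeln;

// Hypothetical helper (not from the thread): collects the sender address
// captured by group 1 for every line that matches.
string[] extractSenders(string[] lines)
{
    // Compiled once at run time, instead of at compile time via ctRegex.
    auto re = regex(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");
    string[] found;
    foreach (line; lines)
    {
        auto m = matchFirst(line, re);
        if (!m.empty)
            found ~= m[1]; // with File.byLine this would need .idup
    }
    return found;
}

void main()
{
    // Made-up line in the general shape of an exim main log entry.
    auto lines = [
        "2019-04-04 09:53:06 1hBxyz-0001Ab-Cd <= alice@example.com H=host",
        "no match here",
    ];
    writeln(extractSenders(lines));
}
```

The only change relative to the original loop is where the pattern gets compiled; the matching code stays the same shape, so swapping between the two for a benchmark is cheap.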
Re: Poor regex performance?
On Thursday, 4 April 2019 at 10:31:43 UTC, Julian wrote:
> On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole wrote:
>> If you need performance, use ldc rather than dmd (which I assume you are using). LLVM's code optimizers are many times better than dmd's.
>
> Thanks! I already had dmd installed from a brief look at D a long time ago, so I missed the details at https://dlang.org/download.html
>
> ldc2 -O3 does a lot better, but the result is still 30x slower without PCRE.

You need to disable the GC, by importing core.memory : GC and calling GC.disable(). The next thing is to avoid the .idup and cast to string instead.
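A minimal sketch of the GC part, assuming a hypothetical helper countSenders and a hand-rolled reusable buffer standing in for the one File.byLine hands out. One caveat worth checking before dropping the .idup: byLine reuses its buffer between iterations, so a plain cast to string would leave the associative array keys aliasing memory that gets overwritten by the next line. The sketch therefore keeps the copy and only pauses collections.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper (not from the thread): counts occurrences of each
// line while automatic garbage collections are paused.
ulong[string] countSenders(string[] lines)
{
    GC.disable();             // allocations still succeed, collections are suppressed
    scope (exit) GC.enable(); // restore normal behaviour when the function returns

    ulong[string] counts;
    char[] buf;               // reused buffer, standing in for byLine's
    foreach (line; lines)
    {
        buf.length = line.length;
        buf[] = line[];       // overwrite the buffer, as byLine would
        ++counts[buf.idup];   // the key needs its own immutable copy
    }
    return counts;
}

void main()
{
    auto counts = countSenders(["a@x", "b@y", "a@x"]);
    writeln(counts["a@x"], " ", counts["b@y"]);
}
```

With collections paused, memory use grows for the whole run, so this trade-off suits a short-lived batch script more than a long-running process.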
Re: Poor regex performance?
On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole wrote:
> If you need performance, use ldc rather than dmd (which I assume you are using). LLVM's code optimizers are many times better than dmd's.

Thanks! I already had dmd installed from a brief look at D a long time ago, so I missed the details at https://dlang.org/download.html

ldc2 -O3 does a lot better, but the result is still 30x slower without PCRE.
Re: Poor regex performance?
On Thursday, 4 April 2019 at 09:53:06 UTC, Julian wrote:
> Relatedly, how can I add custom compiler flags to rdmd, in a D script? For example, -L-lpcre

Use the configuration variable "DFLAGS". On Windows you can specify it in the sc.ini file. On Linux, see https://dlang.org/dmd-linux.html
Re: Poor regex performance?
If you need performance, use ldc rather than dmd (which I assume you are using). LLVM's code optimizers are many times better than dmd's.
Poor regex performance?
The following code, that just runs a regex against a large exim log to report on top senders, is 140 times slower than similar C code using PCRE, when compiled with just -O. With a bunch of other flags I got it down to only 13x slower than C code that's using libc regcomp/regexec.

    import std.stdio, std.string, std.regex, std.array, std.algorithm;

    T min(T)(T a, T b) {
        if (a < b) return a;
        return b;
    }

    void main() {
        ulong[string] emailcounts;
        auto re = ctRegex!(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");

        foreach (line; File("exim_mainlog").byLine()) {
            auto m = line.match(re);
            if (m) {
                ++emailcounts[m.front[1].idup];
            }
        }

        string[] senders = emailcounts.keys;
        sort!((a, b) { return emailcounts[a] > emailcounts[b]; })(senders);
        foreach (i; 0 .. min(senders.length, 5)) {
            writefln("%5s %s", emailcounts[senders[i]], senders[i]);
        }
    }

Other code's available at https://github.com/jrfondren/topsender-bench

I get D down to 1.2x slower with PCRE and getline().

I wrote this part of the way through chapter 1 of "The D Programming Language", so my question is mainly: is this a fair result? std.regex is very slow and I should reach for PCRE if regex speed matters? Or is this code severely flawed somehow? I'm using a random production log; not trying to make things difficult.

Relatedly, how can I add custom compiler flags to rdmd, in a D script? For example, -L-lpcre