Re: Poor regex performance?

2019-04-04 Thread Jon Degenhardt via Digitalmars-d-learn

On Thursday, 4 April 2019 at 10:31:43 UTC, Julian wrote:
> On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole wrote:
>> If you need performance use ldc not dmd (assumed).
>>
>> LLVM's code optimizers are many factors better than dmd's.
>
> Thanks! I already had dmd installed from a brief look at D a long
> time ago, so I missed the details at https://dlang.org/download.html
>
> ldc2 -O3 does a lot better, but the result is still 30x slower
> without PCRE.


Try:
ldc2 -O3 -release -flto=thin -defaultlib=phobos2-ldc-lto,druntime-ldc-lto -enable-inlining

This will improve inlining and optimization across the runtime
library boundaries. This can help in certain types of code.


Re: Poor regex performance?

2019-04-04 Thread H. S. Teoh via Digitalmars-d-learn
On Thu, Apr 04, 2019 at 09:53:06AM +0000, Julian via Digitalmars-d-learn wrote:
[...]
>   auto re = ctRegex!(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");
[...]

ctRegex is a crock; use regex() instead and it might actually work
better.
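
A minimal sketch of what that swap could look like, assuming the same pattern as in the original post. The helper name senderOf and the sample log line are made up for illustration; the only change that matters is building the pattern with regex() at run time instead of ctRegex.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: returns the address captured by group 1,
// or null when the line does not match. `re` can be anything
// std.regex accepts as a compiled pattern.
string senderOf(R)(const(char)[] line, R re)
{
    auto m = matchFirst(line, re);
    return m.empty ? null : m[1].idup;
}

void main()
{
    // Compiled once at run time, rather than at compile time via ctRegex.
    auto re = regex(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");

    // Invented sample line in the general shape of an exim log entry.
    writeln(senderOf("2019-04-04 09:53:06 1hBxyz-0001-AB <= alice@example.com H=host", re));
}
```

Whether this is actually faster than ctRegex for a given log is something to measure rather than assume.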


T

-- 
Stop staring at me like that! It's offens... no, you'll hurt your eyes!


Re: Poor regex performance?

2019-04-04 Thread Stefan Koch via Digitalmars-d-learn

On Thursday, 4 April 2019 at 10:31:43 UTC, Julian wrote:
> On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole wrote:
>> If you need performance use ldc not dmd (assumed).
>>
>> LLVM's code optimizers are many factors better than dmd's.
>
> Thanks! I already had dmd installed from a brief look at D a long
> time ago, so I missed the details at https://dlang.org/download.html
>
> ldc2 -O3 does a lot better, but the result is still 30x slower
> without PCRE.


You need to disable the GC by importing core.memory : GC
and calling GC.disable().

The next thing is to avoid the .idup and cast to string instead.
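
A rough sketch of both suggestions, under some assumptions. GC.disable() and GC.enable() are the real core.memory calls; the helper countSender and the sample addresses are invented for illustration. Note that byLine reuses its buffer, so a plain cast to string would leave the associative-array keys pointing at memory that gets overwritten; the variant below swaps in a different technique that still copies, but only the first time a sender is seen.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper: bump the count for `key`, allocating a
// string copy only when the key is not in the table yet.
void countSender(ref ulong[string] counts, const(char)[] key)
{
    if (auto p = key in counts)
        ++*p;                    // existing key: no allocation
    else
        counts[key.idup] = 1;    // new key: one immutable copy
}

void main()
{
    GC.disable();                // suspend automatic collections
    scope (exit) GC.enable();

    ulong[string] counts;
    // Invented data standing in for the captures taken from each log line.
    foreach (addr; ["a@example.com", "b@example.com", "a@example.com"])
        countSender(counts, addr);

    writeln(counts["a@example.com"], " ", counts["b@example.com"]);
}
```

With the GC disabled, memory use grows for the life of the run, so this trade only makes sense for short-lived tools.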



Re: Poor regex performance?

2019-04-04 Thread Julian via Digitalmars-d-learn

On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole wrote:
> If you need performance use ldc not dmd (assumed).
>
> LLVM's code optimizers are many factors better than dmd's.

Thanks! I already had dmd installed from a brief look at D a long
time ago, so I missed the details at https://dlang.org/download.html


ldc2 -O3 does a lot better, but the result is still 30x slower
without PCRE.


Re: Poor regex performance?

2019-04-04 Thread XavierAP via Digitalmars-d-learn

On Thursday, 4 April 2019 at 09:53:06 UTC, Julian wrote:
> Relatedly, how can I add custom compiler flags to rdmd, in a D
> script?
>
> For example, -L-lpcre

Configuration variable "DFLAGS". On Windows you can specify it in
the sc.ini file. On Linux: https://dlang.org/dmd-linux.html


Re: Poor regex performance?

2019-04-04 Thread rikki cattermole via Digitalmars-d-learn

If you need performance use ldc not dmd (assumed).

LLVM's code optimizers are many factors better than dmd's.


Poor regex performance?

2019-04-04 Thread Julian via Digitalmars-d-learn
The following code, which just runs a regex against a large exim log
to report on top senders, is 140 times slower than similar C code using
PCRE, when compiled with just -O. With a bunch of other flags I got it
down to only 13x slower than C code that's using libc regcomp/regexec.


  import std.stdio, std.string, std.regex, std.array, std.algorithm;

  T min(T)(T a, T b) {
      if (a < b) return a;
      return b;
  }

  void main() {
      ulong[string] emailcounts;
      auto re = ctRegex!(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");

      foreach (line; File("exim_mainlog").byLine()) {
          auto m = line.match(re);
          if (m) {
              ++emailcounts[m.front[1].idup];
          }
      }

      string[] senders = emailcounts.keys;
      sort!((a, b) { return emailcounts[a] > emailcounts[b]; })(senders);

      foreach (i; 0 .. min(senders.length, 5)) {
          writefln("%5s %s", emailcounts[senders[i]], senders[i]);
      }
  }

Other code's available at https://github.com/jrfondren/topsender-bench

I get D down to 1.2x slower with PCRE and getline().

I wrote this part of the way through chapter 1 of "The D Programming
Language", so my question is mainly: is this a fair result? Is std.regex
very slow, and should I reach for PCRE if regex speed matters? Or is this
code severely flawed somehow? I'm using a random production log; not
trying to make things difficult.

Relatedly, how can I add custom compiler flags to rdmd, in a D script?

For example, -L-lpcre


Overloads not returning appropriate info. [Field reflunkory]

2019-04-04 Thread Alex via Digitalmars-d-learn
Trying to get parameter info of an overloaded method, it doesn't 
return the names. The same code works fine for a free function.



Id = foo
TypeName =
FullName = main.foo
ModuleName = main
MangledName = _D4main3fooFiKdfZv
Protection = public
Body =
Uses = []
Attributes = [
		sAttributeReflection("attr!string(\"test\", 432)", "attr!string"),
		sAttributeReflection("4", "int")
]
DType = function
Signature = void(int x, ref double y, float z = 43234.3F)
NumArgs = 3
Linkage = D
ReturnType = void
Parameters = [
		sParameterReflection("x", "int", "void", "void", sStorageClass(false, false, false, false, false, false)),
		sParameterReflection("y", "double", "void", "void", sStorageClass(false, false, false, true, false, false)),
		sParameterReflection("z", "float", "43234.3F", "float", sStorageClass(false, false, false, false, false, false))
]



vs





Methods = [
Id = foo
TypeName = int()
FullName = mModel.cDerived!(int).foo
ModuleName = mModel
MangledName = 6mModel__T8cDerivedTiZQm3foo
Protection = public
Body =
Uses = []
Attributes = [sAttributeReflection("A", "string")]
Signature = int()
NumArgs = 0
Linkage = D
ReturnType = int
Parameters = []
DType = delegate
Overloads = []
,
Id = foo
TypeName = pure nothrow @nogc @safe int(int, inout(float))
FullName = mModel.cDerived!(int).foo
ModuleName = mModel
MangledName = 6mModel__T8cDerivedTiZQm3foo
Protection = public
Body =
Uses = []
Attributes = [sAttributeReflection("A", "string")]
Signature = pure nothrow @nogc @safe int(int, inout(float))
NumArgs = 2
Linkage = D
ReturnType = int
Parameters = [
			sParameterReflection("", "int", "void", "void", sStorageClass(false, false, false, false, false, false)),
			sParameterReflection("", "inout(float)", "void", "void", sStorageClass(false, false, false, false, false, false))
]
DType = delegate
Overloads = []
,




sParameterReflection("z", "float", "43234.3F", "float", sStorageClass(false, false, false, false, false, false))


vs

sParameterReflection("", "int", "void", "void", sStorageClass(false, false, false, false, false, false)),


Notice that some info is filled out, but not all: the parameter names are empty for the method overloads.

The code is here:
https://github.com/IncipientDesigns/Dlang_Reflect/blob/master/mReflect.d

The Base, Aggregate, Function, and Method classes are the
appropriate ones to look at. cFunctionReflection does all the
work, though.
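
For comparison, here is a small sketch of one way parameter names of overloads can be read, assuming each overload is passed along as a symbol (via __traits(getOverloads, ...)) rather than as a function or delegate type, which does not carry the names. The struct Demo and the helper overloadParamNames are hypothetical and are not taken from mReflect.d; whether this matches the cause of the empty names above is a guess.

```d
import std.stdio : writeln;
import std.traits : ParameterIdentifierTuple;

// Hypothetical type with two overloads of foo.
struct Demo
{
    int foo() { return 0; }
    int foo(int count, float scale) { return count; }
}

// Hypothetical helper: parameter names of every overload of T.name.
string[][] overloadParamNames(T, string name)()
{
    string[][] result;
    static foreach (ov; __traits(getOverloads, T, name))
    {{
        // `ov` is the overload's symbol, so its identifiers are still available.
        string[] names = [ParameterIdentifierTuple!ov];
        result ~= names;
    }}
    return result;
}

void main()
{
    writeln(overloadParamNames!(Demo, "foo")());
}
```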


[Field reflunkory]

Also, with cFieldReflection I cannot pass the field as a type on to the
base class to reduce code duplication. Ideally I should be able to
pass a field around as a type so that reflection can occur on it.
This seems like a language design issue, but maybe there is a
trick?
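
One possible trick, sketched under assumptions: a field is a symbol rather than a type, but it can be handed to a template through an alias parameter, and the template can then reflect on it. The names Point, FieldInfo, and fieldInfo are invented for illustration and do not come from mReflect.d.

```d
import std.stdio : writeln;

// Hypothetical aggregate to reflect on.
struct Point
{
    int x;
    double y;
}

// Hypothetical record holding what the reflection found.
struct FieldInfo
{
    string name;
    string typeName;
}

// The field arrives as an alias (a symbol), not as a type; the
// template evaluates entirely at compile time.
template fieldInfo(alias field)
{
    enum fieldInfo = FieldInfo(__traits(identifier, field),
                               typeof(field).stringof);
}

void main()
{
    enum info = fieldInfo!(Point.x);
    writeln(info.name, " : ", info.typeName);
}
```

A base class could take the same alias as a template parameter, so the shared reflection code would only need to be written once.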