On Sunday, 14 October 2018 at 03:26:33 UTC, Adam D. Ruppe wrote:
> On Sunday, 14 October 2018 at 03:07:59 UTC, Chris Katko wrote:
>> For comparison, I just tested and grep uses about 4 MB of RAM
>> to run.
>
> Running and compiling are two entirely different things.
> Running the D regex code should be comparable, but compiling it
> is slow, in great part because of internal templates...
>
> There was an effort to speed up the template code, but it is
> still not complete.

I know that. I figured people would miss my point, though, so I
should have clarified. That's why I said it's likely the
templates/DMD that are exploding--not the actual regex matching.
A simple program takes ~100-150 MB of RAM to compile. Adding a
single regex (not a compiled regex) balloons that to 550 MB and
5 seconds of compile time.
-----------
Anyhow, I wrote my own simple "dgrep" and compared it with grep.
It's very competitive (NOT to be confused with the above RAM
stats for COMPILING):
Command being timed: "sh -c cat dgrep.d | ./dgrep 'write' "
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3192
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 301
Voluntary context switches: 5
Involuntary context switches: 124
Swaps: 0
File system inputs: 8
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

Command being timed: "sh -c cat dgrep.d | grep 'write'"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2224
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 2
Minor (reclaiming a frame) page faults: 282
Voluntary context switches: 10
Involuntary context switches: 0
Swaps: 0
File system inputs: 760
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
So I have to say I'm impressed with the actual performance of the
regular expressions engine--especially considering "grep" is,
IIRC, considered a fine-tuned beast.
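
For anyone who wants to try a similar comparison, here is a rough sketch of what a stdin filter like "dgrep" might look like. This is not the actual program measured above, which was not posted. The helper name `lineMatches`, the usage message, and the grep-style exit codes are assumptions made for illustration. It uses the run-time `regex()` from `std.regex` and reads lines from standard input:

```d
import std.regex : Regex, regex, matchFirst;
import std.stdio : stdin, stderr, writeln;

// Hypothetical helper: true when `line` contains a match for `re`.
bool lineMatches(const(char)[] line, Regex!char re)
{
    return !matchFirst(line, re).empty;
}

int main(string[] args)
{
    if (args.length < 2)
    {
        stderr.writeln("usage: dgrep PATTERN");
        return 2;
    }

    // Run-time regex: the pattern is built when the program starts.
    auto re = regex(args[1]);

    bool found = false;
    // byLine reuses its buffer, so each line is printed right away.
    foreach (line; stdin.byLine)
    {
        if (lineMatches(line, re))
        {
            writeln(line);
            found = true;
        }
    }

    // Assumption: mimic grep's convention (0 = match found, 1 = none).
    return found ? 0 : 1;
}
```

Built with `dmd dgrep.d`, it can be run the same way as in the timings above: `cat dgrep.d | ./dgrep 'write'`. Memory and timing figures for this sketch will not necessarily match the numbers reported in the post.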