@Araq has made [this
benchmark](https://gist.github.com/Araq/e2874e8b218ca8abf6cfc0c1d3bc0f5d), and
I gave it a run to see how `--gc:arc` compares with `-d:useRealtimeGC`.
Here are the results of a slightly modified version of the benchmark, where
each `elapsed` time (in microseconds) was added to a `RunningStat` so we would
have more interesting numbers to look at:

    nim c -d:useRealtimeGC -d:danger foo.nim
    [GC] total memory: 1,504,206,848
    [GC] occupied memory: 1,064,654,568
    [GC] stack scans: 12
    [GC] stack cells: 1
    [GC] cycle collections: 0
    [GC] max threshold: 0
    [GC] zct capacity: 1,008,895
    [GC] max cycle table size: 0
    [GC] max pause time [ms]: 0
    [GC] max stack size: 352
    Stats:
      max: 1069.0
      min: 0.0
      mean: 0.367
      std deviation: 1.543
      skewness: 476.121
    0.444s; 1,293,548 KB


    nim c --gc:arc -d:danger foo.nim
    [GC] total memory: 319,787,008
    [GC] occupied memory: 88
    Stats:
      max: 11.0
      min: 0.0
      mean: 0.187
      std deviation: 0.428
      skewness: 2.818
    0.250s; 271,204 KB

There are a few things to notice:
* `--gc:arc` uses ~5x less memory (271,204 KB vs. 1,293,548 KB).
* `-d:useRealtimeGC` has a very long tail for elapsed time: look at the
skewness (476.121 vs. 2.818), and at the `max` value (1069.0 vs. 11.0
microseconds) to get a better feel for the worst-case scenario.
* In other words, `--gc:arc` is not only faster, but also much more consistent
regarding elapsed time.
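
For anyone who wants to collect the same kind of per-iteration statistics, here is a minimal sketch of the idea, not the actual modified benchmark. It assumes timing with `std/monotimes` and uses a hypothetical `work` proc as a stand-in for one iteration of the real workload; only `RunningStat` and the microsecond unit come from the description above.

```nim
import std/[monotimes, times, stats]

# Hypothetical stand-in for one iteration of the benchmark's workload.
proc work(n: int): int =
  var s = newSeq[int](n)
  for i in 0 ..< n:
    s[i] = i
  result = s.len

var rs: RunningStat
for i in 0 ..< 1000:
  let start = getMonoTime()
  discard work(100)
  let elapsed = getMonoTime() - start       # a Duration
  rs.push(elapsed.inMicroseconds.float)     # one sample, in microseconds

echo "max: ", rs.max
echo "min: ", rs.min
echo "mean: ", rs.mean
echo "std deviation: ", rs.standardDeviation
echo "skewness: ", rs.skewness
```

Compiling it once with `-d:useRealtimeGC` and once with `--gc:arc` should give two sets of numbers that can be compared the same way as the ones above.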
* * *
Also, here is the output of `perf` for each version:

    nim c -d:useRealtimeGC -d:danger foo.nim

    Performance counter stats for './foo' (5 runs):

             451,31 msec task-clock:u              #    0,999 CPUs utilized            ( +-  1,18% )
                  0      context-switches:u        #    0,000 K/sec
                  0      cpu-migrations:u          #    0,000 K/sec
            335.093      page-faults:u             #    0,742 M/sec                    ( +-  0,00% )
        846.263.989      cycles:u                  #    1,875 GHz                      ( +-  0,36% )
          3.051.847      stalled-cycles-frontend:u #    0,36% frontend cycles idle     ( +- 13,64% )
        670.872.510      stalled-cycles-backend:u  #   79,27% backend cycles idle      ( +-  0,58% )
        682.004.145      instructions:u            #    0,81  insn per cycle
                                                   #    0,98  stalled cycles per insn  ( +-  0,01% )
        106.621.455      branches:u                #  236,251 M/sec                    ( +-  0,01% )
            666.133      branch-misses:u           #    0,62% of all branches          ( +-  2,37% )

            0,45163 +- 0,00531 seconds time elapsed  ( +-  1,18% )


    nim c --gc:arc -d:danger foo.nim

    Performance counter stats for './foo' (5 runs):

             249,67 msec task-clock:u              #    0,999 CPUs utilized            ( +-  1,85% )
                  0      context-switches:u        #    0,000 K/sec
                  0      cpu-migrations:u          #    0,000 K/sec
             67.534      page-faults:u             #    0,270 M/sec                    ( +-  0,00% )
        776.738.443      cycles:u                  #    3,111 GHz                      ( +-  0,20% )
          1.165.022      stalled-cycles-frontend:u #    0,15% frontend cycles idle     ( +- 11,06% )
        621.229.154      stalled-cycles-backend:u  #   79,98% backend cycles idle      ( +-  0,36% )
        621.712.808      instructions:u            #    0,80  insn per cycle
                                                   #    1,00  stalled cycles per insn  ( +-  0,01% )
        104.112.444      branches:u                #  416,993 M/sec                    ( +-  0,01% )
             87.528      branch-misses:u           #    0,08% of all branches          ( +-  3,59% )

            0,24992 +- 0,00465 seconds time elapsed  ( +-  1,86% )

* * *
Running the same test with `-d:release` instead of `-d:danger` slows both
versions down, but by very different amounts: `-d:useRealtimeGC` is ~2.5x
slower, while `--gc:arc` is ~8x slower.
The exact numbers are [here](http://ix.io/26Ba) and [here](http://ix.io/26Bf).
If somebody has an idea what is causing such a big slowdown for
`--gc:arc -d:release`, please let us know.