I will preface this by saying that I only picked up Nim 2.0.2 this week, coming
from Python and Go. I started experimenting with nim.cfg to find the settings
that give the fastest performance. This is how it looks now:
cc = gcc
passC = "-flto -march=native -O4"
passL = "-flto -s"
d=danger
d=useMalloc
d=lto
--- the rest of the file is untouched ---
# additional options always passed to the compiler:
--parallel_build: "0"
....
I have benchmarked two scenarios in which the default orc mm is noticeably
slower than markandsweep: about 3.6x slower in the first and about 1.3x slower
in the second.
The first script is a recursive Fibonacci up to n = 50, which looks as follows:
proc fibonacci*(n: int32): int =
  if n <= 1: return n
  fibonacci(n-1) + fibonacci(n-2)

discard fibonacci(50)
Benchmarking it with hyperfine using the default mm:
nim\vsgo> nim c --hints:off --o:fibonacciorc.exe fibonacci.nim | hyperfine fibonacciorc.exe
Benchmark 1: fibonacciorc.exe
  Time (mean ± σ):     56.496 s ± 1.153 s    [User: 56.143 s, System: 0.025 s]
  Range (min … max):   53.893 s … 57.685 s    10 runs
Benchmarking it with markandsweep:
nim\vsgo> nim c --hints:off --mm:markandsweep --o:fibonaccimarkandsweep.exe fibonacci.nim | hyperfine fibonaccimarkandsweep.exe
Benchmark 1: fibonaccimarkandsweep.exe
  Time (mean ± σ):     15.840 s ± 0.141 s    [User: 15.797 s, System: 0.013 s]
  Range (min … max):   15.675 s … 16.074 s    10 runs
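To check that the gap comes from the recursion itself rather than from process startup or teardown, the same function can also be timed inside the program. This is only a minimal sketch: it assumes the standard `std/monotimes` and `std/times` modules, and it uses n = 30 purely to keep the run short, which is not the value benchmarked above.

```nim
import std/[monotimes, times]

proc fibonacci(n: int32): int =
  # Same naive recursion as in the post.
  if n <= 1: return n
  fibonacci(n - 1) + fibonacci(n - 2)

# n = 30 is an assumption chosen for a quick run, not the n = 50 used above.
let start = getMonoTime()
let value = fibonacci(30)
let elapsed = getMonoTime() - start   # a Duration from std/times

echo "fibonacci(30) = ", value
echo "elapsed: ", elapsed.inMilliseconds, " ms"
```

Building this once with the default mm and once with `--mm:markandsweep` should show whether the difference is still there when only the call itself is measured.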
The second benchmark is this year's Advent of Code day 2: read a file, parse
the game moves, and calculate some values from that input. The original input
file was 100 lines long, so I copied its contents until it reached 20,000 lines.
I compiled with the same nim commands as above and ran hyperfine on both exes:
nim\aoc23\day02> hyperfine day2markandsweep.exe day2orc.exe --warmup 10
Benchmark 1: day2markandsweep.exe
  Time (mean ± σ):     134.8 ms ± 5.7 ms    [User: 121.6 ms, System: 13.5 ms]
  Range (min … max):   125.8 ms … 146.5 ms    21 runs
Benchmark 2: day2orc.exe
  Time (mean ± σ):     176.4 ms ± 3.1 ms    [User: 167.1 ms, System: 10.9 ms]
  Range (min … max):   172.0 ms … 182.3 ms    16 runs
Summary
  'day2markandsweep.exe' ran
    1.31 ± 0.06 times faster than 'day2orc.exe'
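For anyone who wants to reproduce the enlarged input, the copying step for the day 2 file (100 lines repeated until there are 20,000) could be scripted roughly as below. This is a sketch under assumptions: the file names `input.txt` and `input_20000.txt` and the helper `repeatLines` are hypothetical, since the post does not say how the file was actually produced.

```nim
import std/[os, strutils]

# Hypothetical file names; the post does not name its input files.
const
  sourcePath = "input.txt"
  targetPath = "input_20000.txt"
  targetLines = 20_000

proc repeatLines(lines: seq[string], total: int): seq[string] =
  ## Cycle through `lines` until exactly `total` lines have been produced.
  ## Returns an empty seq if there is nothing to repeat.
  if lines.len == 0: return
  result = newSeqOfCap[string](total)
  for i in 0 ..< total:
    result.add lines[i mod lines.len]

when isMainModule:
  if fileExists(sourcePath):
    # strip() drops the trailing newline so no empty last line is repeated.
    let original = readFile(sourcePath).strip().splitLines()
    writeFile(targetPath, repeatLines(original, targetLines).join("\n") & "\n")
  else:
    echo "no ", sourcePath, " found; nothing written"
```

Repeating the same games only scales the amount of parsing work; it does not make the input more varied.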
Finally, my questions: when should I choose which mm, and what options can I
add to nim.cfg to further improve performance?