I will preface this by saying I just picked up Nim 2.0.2 this week, coming from
Python and Go. I started experimenting with the settings in nim.cfg to find the
ones that give the fastest performance. This is how it looks now:
    
    
    cc = gcc
    passC = "-flto -march=native -O4"
    passL = "-flto -s"
    d=danger
    d=useMalloc
    d=lto
    
    ---  the rest of the file is untouched ---
    # additional options always passed to the compiler:
    --parallel_build: "0"
    ....
    
    

I have benchmarked 2 scenarios where the default orc mm is clearly slower than
markandsweep: about 3.6 times slower in the first and about 1.3 times slower in
the second.
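
To rule out a mix-up between the two builds, a binary can report how it was compiled. This is only a minimal sketch: the symbol names `gcOrc`, `gcArc` and `gcMarkAndSweep` are the ones I believe the compiler defines for each `--mm` choice, and `memoryManager` and `buildSummary` are hypothetical helper names, not something from my scripts.

```nim
proc memoryManager(): string =
  ## Name of the memory manager selected at compile time.
  ## Assumption: the compiler defines gcOrc / gcArc / gcMarkAndSweep.
  when defined(gcOrc):
    result = "orc"
  elif defined(gcArc):
    result = "arc"
  elif defined(gcMarkAndSweep):
    result = "markAndSweep"
  else:
    result = "other"

proc buildSummary(): string =
  ## One-line summary of the switches that matter for these benchmarks.
  result = "mm=" & memoryManager()
  result.add " danger=" & $defined(danger)
  result.add " release=" & $defined(release)
  result.add " useMalloc=" & $defined(useMalloc)

when isMainModule:
  echo buildSummary()
```

Printing this once at startup of each exe would show whether the `d=danger` and `d=useMalloc` lines in nim.cfg were actually picked up.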

The first script is a recursive Fibonacci up to 50, which looks as follows:
    
    
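    # Naive doubly recursive Fibonacci; takes an int32 and returns an int.
    proc fibonacci*(n: int32): int =
        if n <= 1: return n
        fibonacci(n-1) + fibonacci(n-2)
    # The result is discarded; only the run time matters here.
    discard fibonacci(50)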
    
    
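
As a cross-check on the hyperfine numbers, the call can also be timed from inside the program, which leaves process startup out of the measurement. A minimal sketch, assuming `std/monotimes` for the clock; `fib` and `timeFib` are hypothetical names, and unlike the script above it uses `int` throughout and a smaller argument so it finishes quickly.

```nim
import std/[monotimes, times]

proc fib(n: int): int =
  ## Same naive recursion as the benchmarked script, but int-only.
  if n <= 1: return n
  fib(n - 1) + fib(n - 2)

proc timeFib(n: int): tuple[value: int, elapsed: Duration] =
  ## Runs fib(n) once and returns the result with the elapsed wall time.
  let start = getMonoTime()
  let value = fib(n)
  (value: value, elapsed: getMonoTime() - start)

when isMainModule:
  # 30 keeps the demo short; the benchmark in this post uses 50.
  let (value, elapsed) = timeFib(30)
  echo "fib(30) = ", value, " took ", elapsed.inMilliseconds, " ms"
```

If the in-process time differs between the orc and markandsweep builds as much as the hyperfine time does, the gap is in the recursion itself and not in startup or teardown.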

Benchmarking it with hyperfine using the default mm:
    
    
    nim\vsgo> nim c --hints:off --o:fibonacciorc.exe fibonacci.nim | hyperfine fibonacciorc.exe
    Benchmark 1: fibonacciorc.exe
      Time (mean ± σ):     56.496 s ±  1.153 s    [User: 56.143 s, System: 0.025 s]
      Range (min … max):   53.893 s … 57.685 s    10 runs
    
    

Benchmarking it with markandsweep:
    
    
    nim\vsgo> nim c --hints:off --mm:markandsweep --o:fibonaccimarkandsweep.exe fibonacci.nim | hyperfine fibonaccimarkandsweep.exe
    Benchmark 1: fibonaccimarkandsweep.exe
      Time (mean ± σ):     15.840 s ±  0.141 s    [User: 15.797 s, System: 0.013 s]
      Range (min … max):   15.675 s … 16.074 s    10 runs
    
    

The second benchmark is Advent of Code 2023 day 2: read a file, parse the game
moves, and calculate some parameters from that input. I took the input file,
which was 100 lines long, and copied its contents until it reached 20,000 lines.
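
To give an idea of the kind of work this benchmark does, here is a rough sketch of parsing one input line. It is not my actual solution: the line format `Game <id>: <n> <color>, ...; ...` is assumed from the puzzle, and `Draw` and `parseGame` are hypothetical names.

```nim
import std/strutils

type Draw = tuple[red, green, blue: int]

proc parseGame(line: string): tuple[id: int, draws: seq[Draw]] =
  ## Parses one line of the assumed form
  ## "Game 1: 3 blue, 4 red; 1 red, 2 green".
  let parts = line.split(": ", maxsplit = 1)
  result.id = parseInt(parts[0].split(' ')[1])
  for group in parts[1].split("; "):
    var draw: Draw
    for item in group.split(", "):
      let fields = item.splitWhitespace()
      let count = parseInt(fields[0])
      case fields[1]
      of "red": draw.red = count
      of "green": draw.green = count
      of "blue": draw.blue = count
      else: discard  # unknown colors are ignored in this sketch
    result.draws.add draw

when isMainModule:
  echo parseGame("Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue")
```

Every `split` call here allocates new strings and a new seq, so unlike the Fibonacci script this kind of code does put load on the memory manager.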

I used the same nim commands as above and ran hyperfine on both exes:
    
    
    nim\aoc23\day02> hyperfine day2markandsweep.exe day2orc.exe --warmup 10
    Benchmark 1: day2markandsweep.exe
      Time (mean ± σ):     134.8 ms ±   5.7 ms    [User: 121.6 ms, System: 13.5 ms]
      Range (min … max):   125.8 ms … 146.5 ms    21 runs
    
    Benchmark 2: day2orc.exe
      Time (mean ± σ):     176.4 ms ±   3.1 ms    [User: 167.1 ms, System: 10.9 ms]
      Range (min … max):   172.0 ms … 182.3 ms    16 runs
    
    Summary
      'day2markandsweep.exe' ran
        1.31 ± 0.06 times faster than 'day2orc.exe'
    
    

Finally: when should I choose which mm, and what options can I add to nim.cfg
to further improve performance?
