Yes, your guess that missing inlining is the problem seems to be correct.

We generally use link-time optimization (LTO), which works really well with GCC 10. Try:

    $ nim c -d:release --passC:-flto t.nim
    
    $ ./t
    julia1: 114 ms (sum of pixels: 27677748)
    julia2: 115 ms (sum of pixels: 27677748)
    julia3: 111 ms (sum of pixels: 27677748)
    
    

There are more options to tweak, of course, such as ARC (--gc:arc) or -march=native (passed to the C compiler via --passC).
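
As a rough sketch of how those extra options could be combined on one command line (not measured here, so no timings are given): this assumes a Nim version that supports --gc:arc and a GCC backend. The added --passL:-flto is my own addition, since GCC's documentation says -flto should be given at link time as well as compile time; it may not be needed in your setup.

    $ nim c -d:release --gc:arc --passC:-flto --passL:-flto --passC:-march=native t.nim

Note that -march=native tunes the binary for the machine it is compiled on, so the result may not run on older CPUs.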
