Parallel Fibonacci

shirleyquirk Wed, 28 Dec 2022 15:45:18 -0800

all of the difference is in the speed of the recursive `fib` and how well the C 
compiler can optimize that.

compiler / mm| goto| setjmp  
---|---|---  
gcc, orc| 6s| 3.7s  
gcc, refc| 6s| 1.6s  
clang, orc| 21s| 15s  
clang, refc| 21s| 15s  
  
goto exceptions correlate with slow results for this kind of code, but that 
difference between 3.6s and 1.6s is mysterious.
<https://godbolt.org/z/fjM81nscK> generates the exact same machine code

in reality, gcc unrolls it into a spaghetti monster, and i guess orc code just 
spooks whatever heuristic gcc uses and makes it choose poorly.
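
to make the goto vs setjmp distinction concrete, here is a rough hand-written C sketch of the two shapes a recursive `fib` can take. It is not actual Nim compiler output, and it assumes the thread's `fib` is the usual doubly-recursive definition (the original is not shown here). In the flag-checked shape every call is followed by a test of an error flag and a jump to a common exit, which is extra control flow the optimizer has to reason about. The names `err_flag`, `fib_plain`, `fib_flag_checked` and `fib_selfcheck` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an "exception is pending" flag. */
bool err_flag = false;

/* setjmp-like shape: raising would longjmp away, so the calls
   themselves need no follow-up check. */
int64_t fib_plain(int64_t n) {
    if (n < 2) return n;
    return fib_plain(n - 1) + fib_plain(n - 2);
}

/* goto-like shape: test the flag after every call and jump to a
   common exit if it is set. */
int64_t fib_flag_checked(int64_t n) {
    int64_t a, b, result = 0;
    if (n < 2) { result = n; goto done; }
    a = fib_flag_checked(n - 1);
    if (err_flag) goto done;
    b = fib_flag_checked(n - 2);
    if (err_flag) goto done;
    result = a + b;
done:
    return result;
}

/* Both variants compute the same values; only the control flow differs. */
void fib_selfcheck(void) {
    for (int64_t n = 0; n <= 20; n++)
        assert(fib_plain(n) == fib_flag_checked(n));
}
```

pasting both functions into a compiler explorer at `-O2` or `-O3` is one way to see how differently a given compiler treats the two shapes.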

adding `{.inline.}` makes things worse (setjmp: 5s, goto: 10s for both refc and 
orc)

`-d:danger` (without inline) is great for goto, terrible for setjmp (setjmp: 8s, 
goto: <2s for both)

but here's the table with `fib` declared `{.inline.}` and `-d:danger`

compiler / mm| goto| setjmp  
---|---|---  
gcc, orc| 0.5s| 8s  
gcc, refc| 0.5s| 8s  
clang, orc| 11s| 10s  
clang, refc| 11s| 10s  
  
so somehow gcc can tail-call with goto+inline and you get properly fast results.
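
for anyone wondering what a tail-call can even mean for a function with two recursive calls: neither call in `fib(n - 1) + fib(n - 2)` is strictly in tail position, but a compiler can still turn one of them into a loop by accumulating the other's result. The C sketch below does that transformation by hand. It only illustrates the general technique and is not a claim about what gcc actually emitted for the numbers above; `fib_rec`, `fib_loop` and `fib_loop_selfcheck` are hypothetical names, and the doubly-recursive reference is assumed rather than taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the usual doubly-recursive definition (assumed). */
int64_t fib_rec(int64_t n) {
    if (n < 2) return n;
    return fib_rec(n - 1) + fib_rec(n - 2);
}

/* One recursive call replaced by a loop: instead of calling
   fib(n - 2), add fib(n - 1) to an accumulator and continue
   the loop with n - 2. */
int64_t fib_loop(int64_t n) {
    int64_t acc = 0;
    while (n >= 2) {
        acc += fib_loop(n - 1);
        n -= 2;
    }
    return acc + n;  /* here n < 2, matching the base case */
}

/* The transformed version agrees with the reference. */
void fib_loop_selfcheck(void) {
    for (int64_t n = 0; n <= 25; n++)
        assert(fib_rec(n) == fib_loop(n));
}
```

the loop version still makes a recursive call per iteration, so this alone would not explain a 10x speedup; it just shows the kind of rewrite that becomes available once the function body is simple enough for the optimizer.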
