Oh, sure. There isn't _usually_ a 1.618**4 kind of work reduction in play, though. Araq would know, but I doubt NimMainInner is called through a volatile function pointer in order to trick gcc's optimizer, even if that is a happy side effect.
To me, this was just a performance mystery that I found curious and thought others might appreciate the answer to. Also, more Nim-relevant might be the [automatic memoization thread](https://forum.nim-lang.org/t/1343), which covers a more dramatic optimization that aedt might appreciate. Of course, an in-line array or even just the closed-form formula is faster still for Fibonacci, but presumably the point of this benchmark from xyz32/aedt was to stress test recursion in a programming language. They may want to switch that benchmark to something with less exponential sensitivity to unrolling tricks - unless, of course, the intent is to measure unrolling tricks.
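
For anyone who wants to see the size of those gaps, here is a rough sketch in Python (not Nim, just for brevity) comparing the naive recursion, a memoized version, and the closed form. The names `fib_naive`, `fib_memo`, `fib_closed` and the `calls` counter are made up for illustration and are not taken from the benchmark or the linked thread. The closed form uses floating point, so it is only trustworthy for modest `n` (roughly below 70).

```python
from functools import lru_cache

# Hypothetical call counters, only here to show how much work each version does.
calls = {"naive": 0, "memo": 0}

def fib_naive(n):
    """Plain doubly recursive Fibonacci: call count grows like 1.618**n."""
    calls["naive"] += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recursion, but each distinct n is computed only once."""
    calls["memo"] += 1
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

def fib_closed(n):
    """Binet's closed form, rounded; floating point limits it to small n."""
    sqrt5 = 5 ** 0.5
    phi = (1 + sqrt5) / 2
    return round(phi ** n / sqrt5)
```

Calling `fib_naive(25)` and `fib_memo(25)` once each and then inspecting `calls` shows the difference: the naive count is in the hundreds of thousands, while the memoized count is just one per distinct argument.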
