@dlesnoff

> I believe this article might convince many procedural persons, but I see no functional programming languages in there.

Well, the Nim version could be made to look more functional by declaring more variables immutable with `let` instead of `var` and by doing more work to avoid mutable global state; the result could then be translated fairly easily into a functional language. The question is which functional language to use:

1. F# wouldn't look all that different from the transformed Nim code, other than using recursive functions to represent loops. It would likely run at about the same speed as the C# code without SIMD instructions, except that the "Hot Loop" could perhaps be improved slightly (as it perhaps could be in C# too), with the advantage over C# that it is likely to be more concise (just as the Nim code is). I think F#/C# would need an adjustment if multi-threading were used: the buffers for the "avg_ev" function would need to be pre-allocated per thread on the heap, instead of the stack-based arrays used in C/Nim, and then accessed as indexed by the "threadid".
2. Other somewhat functional languages like Scala are likely to fall into the same speed range as C#/F# above.
3. The only purely functional language that has a chance of coming into the speed range of C/Nim would be Haskell using mutable unboxed arrays, although I wouldn't expect it to be as fast as the C/Nim code, since it doesn't have auto SIMD vectorization and the limited SIMD-based primitives it does have don't seem to fit the use case (no bitwise SIMD operations or SIMD gather operations). The allocation of garbage-collected buffers per call to the "avg_ev" function would also need to be compensated for under multi-threading, as for F#/C# in point 1 above. Also, I'm not sure that its multi-threading runtime would be able to handle multi-threading as well as the low-level C/Nim multi-threading does, given how fine the time slices are: an average of less than a millisecond to process each "chunk".

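
To give an idea of what "mutable unboxed arrays" in Haskell look like, here is a minimal sketch. It is not the benchmark code: `histogram` is a made-up example that only shows the pattern of mutating an unboxed array in place inside `ST` and then handing it back as an immutable `UArray`, using the `array` package that ships with GHC.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- Hypothetical example (not from the article): count how often each
-- value in [0, n) occurs in xs. A value outside that range raises an
-- index error, since readArray/writeArray are bounds-checked.
histogram :: Int -> [Int] -> UArray Int Int
histogram n xs = runSTUArray $ do
  -- Unboxed mutable array of n counters, all starting at 0.
  counts <- newArray (0, n - 1) 0
  forM_ xs $ \x -> do
    c <- readArray counts x
    writeArray counts x (c + 1)
  -- runSTUArray freezes the array without copying it.
  return counts

main :: IO ()
main = print (elems (histogram 4 [0, 1, 1, 3, 3, 3]))
-- prints [1,2,0,3]
```

From the outside `histogram` is still a pure function, which is why this style stays within "purely functional" Haskell even though the inner loop mutates memory.
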
That said, I might take a crack at it to see where one might end up.