It's been quite some time since I've re-visited this project and the video on 
the results from the "Top Fuel" class hasn't been published yet, but [looking 
at the results, especially on the fastest desktop AMD Ryzen 
5950X](https://plummerssoftwarellc.github.io/PrimeView/?rc=100&sc=dt&sd=True), 
it's clear that the speed rankings are as follows: Mike Barber's Rust 
contributions, followed by my Nim contribution, followed by Zig, followed by my 
Haskell contribution, with a bunch of also-rans behind them. All of the top 
contenders get their last boost in speed from a "dense culling" technique 
developed by Mike and me. When the composite number bit culling pattern is 
smaller than one (large) register, this technique culls by large register and 
only stores that register and the next one upon completing the prime value 
culling sequence for that range. For Rust, Nim, and Zig, the compiler then 
auto-vectorizes this code to produce SIMD instructions. The main reason that 
the Haskell code isn't as fast is that GHC Haskell doesn't produce LLVM IR code 
in a form where the LLVM optimizer/compiler can recognize the SIMD patterns, so 
that "best-case" optimization is not done, for a loss of about 15 percent in 
performance on that fast CPU.
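
To give a rough feel for the idea behind "dense culling", here is a minimal 
Rust sketch. It is a simplified illustration under my own assumptions, not the 
code from any of the Race contributions: the function names `cull_sparse` and 
`cull_dense` are hypothetical, a plain 64-bit word stands in for the "large" 
register, and the mask is built with a simple loop rather than with unrolled 
constant patterns.

```rust
/// Baseline: one read-modify-write of memory for every culled bit.
/// (Hypothetical helper, for comparison only.)
fn cull_sparse(words: &mut [u64], start: usize, step: usize) {
    let limit = words.len() * 64;
    let mut i = start;
    while i < limit {
        words[i >> 6] |= 1u64 << (i & 63);
        i += step;
    }
}

/// Dense variant: when `step` is smaller than the word width, every word
/// from `start` onward holds at least one culled bit, so the whole mask for
/// a word is accumulated in a local (register) value and memory is written
/// only once per word.
fn cull_dense(words: &mut [u64], start: usize, step: usize) {
    assert!(step > 0 && step < 64, "dense culling assumes step < word width");
    let mut i = start;
    for (w, word) in words.iter_mut().enumerate().skip(start >> 6) {
        let word_end = (w + 1) * 64;
        let mut mask = 0u64;
        // Collect every culled bit that falls inside this word.
        while i < word_end {
            mask |= 1u64 << (i & 63);
            i += step;
        }
        // Single store for the whole word.
        *word |= mask;
    }
}
```

Both functions set the same bits for any `step` below 64; the difference is 
only in how often memory is touched, which is the part the real 
implementations then scale up to wider registers.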

I could likely match or exceed the performance of the Rust contribution by 
changing the build process to use the Clang/LLVM back end compiler rather than 
GCC and running the same optimizations in the same order as Rust does. This 
almost certainly would work, as I have experimented with it; however, the code 
would then run slower on some CPUs, just as the Rust contribution does. If I 
tuned the build process to use the back end most appropriate to the specific 
CPU, it would likely not pass the Race committee.

An extra two percent or so of performance hardly seems worth the effort, 
especially as there is more to coding than just raw speed when it is this 
close. I encourage you to look at the form of the code in each of the top four 
languages: the Zig code is particularly hard to follow because of its use of 
`comptime` notations in order to avoid the need to have macros available in the 
language, and the Haskell code is concise but perhaps a bit hard to understand 
if one isn't a functional programmer, although I prefer it.

[This is the link to my Nim code for the 
Race](https://github.com/PlummersSoftwareLLC/Primes/tree/drag-race/PrimeNim/solution_3).
