Kernel Matrix Calculation in Nim

doofenstein Wed, 11 Nov 2020 13:00:24 -0800

> For example my SIMD definition for SSE2 and AVX512 matrix multiplication which 
> allows me, in thousands of lines of Nim code, to be as fast as 50x more pure 
> assembly lines in OpenBLAS


Can you please elaborate on that? I looked into your code, and since you're 
using intrinsics you're dependent on the mercy of the compiler to schedule 
everything right. Is the assembler code you're talking about worse than what 
the compiler achieves? Or do you use some faster algorithm to perform the 
matrix multiplication?
