I hope to look at this properly when I get some time. As a preliminary note,
though, merely applying the @inbounds and @simd macros to the main for loop
yields a performance increase of about 15-20% on my machine.
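
To illustrate the kind of change I mean, here is a minimal sketch. The loop body and the function names (sum_squares_plain, sum_squares_fast) are made up for this example and are not the code from this thread. The only difference between the two versions is the macro annotation on the for loop. Note that @simd lets the compiler reorder the floating-point additions, so the result can differ from the plain loop in the last few bits.

```julia
# Hypothetical baseline: a plain reduction loop with bounds checks.
function sum_squares_plain(x::Vector{Float64})
    s = 0.0
    for i in eachindex(x)
        s += x[i] * x[i]
    end
    return s
end

# Same loop with the two macros applied.
# @inbounds skips bounds checking on x[i]; this is only safe because
# eachindex(x) guarantees that every index is valid.
# @simd permits reordering of the reduction so the loop can be vectorized.
function sum_squares_fast(x::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(x)
        s += x[i] * x[i]
    end
    return s
end
```

How much this helps depends on the loop body and the hardware, so it is worth timing both versions on your own code rather than assuming the same 15-20%.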
