Hi all,

I have found some odd performance behaviour when summing and scaling more 
than three complex numbers; see the difference between sum5 and sum5b in 
this gist: https://gist.github.com/jtravs/11368929

Compare:

julia> using testsums
julia> dosums(Complex{Float64}) 
elapsed time: 0.022001424 seconds (28800096 bytes allocated) 
elapsed time: 0.00194736 seconds (96 bytes allocated)

With:

julia> dosums(Float64)
elapsed time: 0.000664517 seconds (96 bytes allocated)
elapsed time: 0.000782516 seconds (96 bytes allocated)

It seems that splitting the sum into groups of at most three operands greatly 
speeds things up for Complex{Float64} (about 11x in the timings above, with 
allocation dropping from roughly 28.8 MB to 96 bytes), whereas it has no 
significant effect for Float64. Does anyone know why? I often have to sum and 
scale 5 or more arrays in my code, and it would be unfortunate to have to 
split them by hand into groups of three, as in sum5b in the gist.
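
For readers without the gist open, the two forms being compared might look 
roughly like the following. This is a minimal sketch and not the gist's code: 
the function names sum5_flat! and sum5_blocked!, the single scale factor k, 
and the in-place loop style are all assumptions made for illustration.

```julia
# Hypothetical sketch: names and signatures are illustrative, not from the gist.

# All five scaled terms in one chained `+` expression.
function sum5_flat!(out, a, b, c, d, e, k)
    for i in eachindex(out)
        out[i] = k*a[i] + k*b[i] + k*c[i] + k*d[i] + k*e[i]
    end
    return out
end

# Same arithmetic, but no single `+` expression has more than three operands.
function sum5_blocked!(out, a, b, c, d, e, k)
    for i in eachindex(out)
        t = k*a[i] + k*b[i] + k*c[i]      # first group of three
        out[i] = t + k*d[i] + k*e[i]      # partial sum plus the remaining two
    end
    return out
end
```

Both functions should give the same result, so any timing difference between 
them (for example with @time after a warm-up call) comes from how the chained 
sum is compiled rather than from the arithmetic itself.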
