Ok, gotcha.

On Monday, April 28, 2014, John Travers <[email protected]> wrote:
> No, I edited the output for clarity. The slowdown is consistent
> regardless of order and amount of warmup. Ivar's fix of grouping into
> threes using parentheses eliminates the problem.
>
> On Monday, April 28, 2014 3:51:00 PM UTC+2, Kevin Squire wrote:
>>
>> Please correct me if I'm wrong, but it looks like your first set of
>> timings includes compilation time, since the amount of memory allocated
>> is so high and you run right after using the file. Perhaps you can run
>> it again with warmup?
>>
>> Kevin
>>
>> On Monday, April 28, 2014, John Travers <[email protected]> wrote:
>>
>>> You just beat me to it! Thanks!
>>>
>>> On Monday, April 28, 2014 3:41:36 PM UTC+2, Ivar Nesje wrote:
>>>>
>>>> Reported issue: https://github.com/JuliaLang/julia/issues/6681
>>>>
>>>> On Monday, April 28, 2014 at 13:56:29 UTC+2, Ivar Nesje wrote:
>>>>>
>>>>> It seems like Jeff was wrong in his statement in
>>>>> 32384010f <https://github.com/JuliaLang/julia/commit/32384010fd689e0b6a77ee93b24613fb0bdb008f>.
>>>>>
>>>>> This discussion belongs in an issue on GitHub. Do you want to post it
>>>>> there?
>>>>>
>>>>> You can also fix the problem a little more neatly by adding () around
>>>>> three of the numbers.
>>>>>
>>>>> Ivar
>>>>>
>>>>> On Monday, April 28, 2014 at 13:38:30 UTC+2, John Travers wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have found some odd performance scaling when summing and scaling
>>>>>> more than three complex numbers; see the difference between sum5 and
>>>>>> sum5b in this gist: https://gist.github.com/jtravs/11368929
>>>>>>
>>>>>> Compare:
>>>>>>
>>>>>> julia> using testsums
>>>>>> julia> dosums(Complex{Float64})
>>>>>> elapsed time: 0.022001424 seconds (28800096 bytes allocated)
>>>>>> elapsed time: 0.00194736 seconds (96 bytes allocated)
>>>>>>
>>>>>> With:
>>>>>>
>>>>>> julia> dosums(Float64)
>>>>>> elapsed time: 0.000664517 seconds (96 bytes allocated)
>>>>>> elapsed time: 0.000782516 seconds (96 bytes allocated)
>>>>>>
>>>>>> It seems that splitting the sum into groups of at most three operands
>>>>>> greatly speeds up performance for Complex{Float64}, whereas it has no
>>>>>> significant effect for Float64. Does anyone know why? I often have to
>>>>>> sum and scale five or more arrays in my code, and it would be
>>>>>> unfortunate to have to hand-block them into sets of three as in sum5b
>>>>>> in the gist.
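
For readers without the gist to hand, here is a minimal sketch of the two forms being compared in the thread. It is not the gist's code: the names sum5! and sum5b!, the array length, and the loop structure are all assumptions made for illustration. It relies only on the fact that Julia parses a chained a + b + c + d + e as one n-ary call +(a, b, c, d, e), so adding parentheses changes how many arguments each + call receives. The slowdown reported in 2014 may well not reproduce on a current Julia version.

```julia
# Hypothetical reconstruction of the two variants discussed in the thread;
# the real sum5/sum5b in the gist may look different.

# Flat form: the right-hand side is a single five-argument call to +.
function sum5!(out, a, b, c, d, e)
    for i in eachindex(out, a, b, c, d, e)
        out[i] = a[i] + b[i] + c[i] + d[i] + e[i]
    end
    return out
end

# Grouped form: the parentheses split the sum so that no single + call
# receives more than three arguments.
function sum5b!(out, a, b, c, d, e)
    for i in eachindex(out, a, b, c, d, e)
        out[i] = (a[i] + b[i] + c[i]) + d[i] + e[i]
    end
    return out
end

# The parse difference can be inspected directly on quoted expressions.
flat_expr    = :(a + b + c + d + e)      # one call: +, then five operands
grouped_expr = :((a + b + c) + d + e)    # outer call: +, then three operands

n = 10_000                               # assumed size, not taken from the gist
T = Complex{Float64}
a, b, c, d, e = (rand(T, n) for _ in 1:5)
out1 = similar(a)
out2 = similar(a)

# Call each function once first so that compilation is not included in the
# timings (the warm-up point raised in the thread).
sum5!(out1, a, b, c, d, e)
sum5b!(out2, a, b, c, d, e)

@time sum5!(out1, a, b, c, d, e)
@time sum5b!(out2, a, b, c, d, e)
```

Both functions compute the same values; only the shape of the + calls differs, which is the one thing the parentheses change.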
