Also, I just added a new section http://docs.julialang.org/en/latest/manual/performance-tips/#tools that advertises the available tools for helping you diagnose performance problems.
Without taking the time to look at your code, I'll just add that whenever I see an orders-of-magnitude discrepancy between C/Fortran and Julia, my first instinct is to suspect a type problem. The fact that a vectorized version is a bit faster than one written with loops might also support this diagnosis.

Best,
--Tim

On Wednesday, December 18, 2013 03:26:11 AM Ivar Nesje wrote:
> My first suggestion to anyone trying to write fast Julia programs is to
> read http://docs.julialang.org/en/latest/manual/performance-tips/; those
> are all good tips that I do not think will become obsolete when Julia
> improves. It seems to me that you already know those points.
>
> I think you get an important hint from the fact that devectorization does
> not matter. To me it seems like the current bottleneck is that you use an
> anonymous function instead of a regular function. When I replace "f(" by
> "gravity(" I get some improvement, and then your devectorization attempts
> make a significant difference. Further, you might want to try to reduce the
> amount of memory allocated, but that seems to complicate your code quite
> a lot.
>
> My improvements reduce the timing as follows for 1000 iterations:
> ivar@Ivar-ubuntu:~/tmp$ julia doptest.jl
> elapsed time: 0.878398771 seconds (513399840 bytes allocated)
> ivar@Ivar-ubuntu:~/tmp$ julia dopitest.jl
> elapsed time: 0.16916126 seconds (122423840 bytes allocated)
>
> At 11:07:30 UTC+1 on Wednesday, December 18, 2013, Helge Eichhorn wrote:
> > Hi,
> >
> > I spent the last few days porting the well-known
> > DOP853 <http://www.unige.ch/~hairer/software.html> integrator to Julia. The
> > process was quite smooth and I have implemented the core functionality.
> > However, when I run my reference case, a numerical solution of the two-body
> > problem, I get the following timings:
> >
> > - *Fortran* (gfortran 4.8.2, no optimizations): *~1.7e-5s*
> > - *Julia* (master, looped): *~1.3e-3s*
> > - *Julia* (master, vectorized): *~1e-3s (!)*
> >
> > I have posted the Julia code and the Fortran reference in this
> > Gist <https://gist.github.com/helgee/8019521>. The computationally
> > expensive part seems to be contained in the *dopcore* or *dopcorevec*
> > function, respectively. What I really do not understand is why the
> > vectorized expressions seem to run faster, or rather what I am doing wrong
> > here.
> >
> > Any ideas or suggestions? Many thanks in advance!
> > Helge
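
To make the "type problem" mentioned above a bit more concrete, here is a minimal sketch of one common form of it: reading a non-const global inside a hot loop. The names `scale`, `sum_global` and `sum_arg` are made up for illustration and are not taken from the gist, and the sketch assumes a reasonably recent Julia version. It is only one possible cause, not a diagnosis of the DOP853 port.

```julia
# Hypothetical example, not taken from the DOP853 gist.

# A non-const global: the compiler cannot assume its type stays Float64.
scale = 0.5

# Reads the global inside the hot loop, so the type of `scale * x` cannot
# be inferred and each iteration pays for dynamic dispatch and boxing.
function sum_global(xs)
    acc = 0.0
    for x in xs
        acc += scale * x
    end
    return acc
end

# Same computation, but the factor is passed as an argument, so the
# compiler can specialize the loop on its concrete type.
function sum_arg(xs, s)
    acc = 0.0
    for x in xs
        acc += s * x
    end
    return acc
end

xs = rand(10^6)

# Call each function once first so that compilation is not timed.
sum_global(xs)
sum_arg(xs, scale)

# Compare elapsed time and bytes allocated for the two versions.
@time sum_global(xs)
@time sum_arg(xs, scale)
```

A large "bytes allocated" figure from a loop that should only touch scalars is a typical symptom of this kind of problem. In current Julia versions, `@code_warntype sum_global(xs)` should also flag the variables whose types could not be inferred.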
