I just tried it. You are right. Using slice() makes the code 9x slower for me.
On 20 September 2015 at 23:14, Kristoffer Carlsson <[email protected]> wrote:
> Oh, if you are on 0.3 I am not sure if slice exists. If it does, it is
> really slow.
>
> On Sunday, September 20, 2015 at 11:13:21 PM UTC+2, Kristoffer Carlsson
> wrote:
>>
>> For me, your latest changes made the time go from 0.13 -> 0.11. It is
>> strange that we see such different performance, but then again 0.3 and 0.4
>> are different beasts.
>>
>> Adding some calls to slice and another loop gained some perf for me. Can
>> you try:
>>
>> https://gist.github.com/KristofferC/8a8ff33cb186183eea8d
>>
>> On Sunday, September 20, 2015 at 9:36:20 PM UTC+2, Daniel Carrera wrote:
>>>
>>> Just another note:
>>>
>>> I suspect that the `reshape()` might be the guilty party. I am just
>>> guessing here, but I suspect that the reshape() forces a memory copy, while
>>> a regular slice just creates a kind of symlink to the original data.
>>> Furthermore, I suspect that the memory copy would mean that when you try to
>>> read from the newly created variable, you have to fetch it from RAM,
>>> despite the fact that the CPU cache already has a perfectly good copy of
>>> the same data.
>>>
>>> Cheers,
>>> Daniel.
>>>
>>>
>>> On 20 September 2015 at 21:25, Daniel Carrera <[email protected]> wrote:
>>>
>>>> Whoo hoo! It looks like I got another ~6x or ~7x improvement. Using
>>>> Profile.print() I found that the hottest parts of the code appeared to be
>>>> the if-conditions, such as:
>>>>
>>>>     if o < b_hist[j,3]
>>>>
>>>> It occurred to me that this could be due to cache misses, so I rewrote
>>>> the code to store the data more compactly:
>>>>
>>>> - b_hist = reshape(sim.b[d, 1:t, i, :], t, 3)
>>>> + b_hist_1 = sim.b[d, 1:t, i, 1]
>>>> + b_hist_3 = sim.b[d, 1:t, i, 3]
>>>> ...
>>>> - if o < b_hist[j,3]
>>>> + if o < b_hist_3[j]
>>>>
>>>>
>>>> So, instead of a 3xN array, I store two 1xN arrays with the data I
>>>> actually want.
>>>> I suspect that the biggest improvement is not that there is
>>>> 1/3 less data, but that the data just gets managed differently. The upshot
>>>> is that now the program runs 208 times faster for me than it did initially.
>>>> For me the execution time went from 45s to 0.2s.
>>>>
>>>> As always, the code is updated on Github:
>>>>
>>>> https://github.com/dcarrera/sim
>>>>
>>>> Cheers,
>>>> Daniel.
>>>>
>>>>
>>>>
>>>> On 20 September 2015 at 20:51, Seth <[email protected]> wrote:
>>>>
>>>>> As an interim step, you can also get text profiling information using
>>>>> Profile.print() if the graphics aren't working.
>>>>>
>>>>> On Sunday, September 20, 2015 at 11:35:35 AM UTC-7, Daniel Carrera
>>>>> wrote:
>>>>>>
>>>>>> Hmm... ProfileView gives me an error:
>>>>>>
>>>>>> ERROR: panzoom not defined
>>>>>>  in view at /home/daniel/.julia/v0.3/ProfileView/src/ProfileViewGtk.jl:32
>>>>>>  in view at /home/daniel/.julia/v0.3/ProfileView/src/ProfileView.jl:51
>>>>>>  in include at ./boot.jl:245
>>>>>>  in include_from_node1 at ./loading.jl:128
>>>>>> while loading /home/daniel/Projects/optimization/run_sim.jl, in
>>>>>> expression starting on line 55
>>>>>>
>>>>>> Do I need to update something?
>>>>>>
>>>>>> Cheers,
>>>>>> Daniel.
>>>>>>
>>>>>> On 20 September 2015 at 20:28, Kristoffer Carlsson
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> https://github.com/timholy/ProfileView.jl is invaluable for
>>>>>>> performance tweaking.
>>>>>>>
>>>>>>> Are you on 0.4?
>>>>>>>
>>>>>>> On Sunday, September 20, 2015 at 8:26:08 PM UTC+2, Milan
>>>>>>> Bouchet-Valat wrote:
>>>>>>>>
>>>>>>>> On Sunday, 20 September 2015 at 20:22 +0200, Daniel Carrera wrote:
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On 20 September 2015 at 19:43, Kristoffer Carlsson
>>>>>>>> > <[email protected]> wrote:
>>>>>>>> > > Did you run the code twice to not time the JIT compiler?
>>>>>>>> > >
>>>>>>>> > > For me, my version runs in 0.24 and Daniel's in 0.34.
>>>>>>>> > >
>>>>>>>> > > Anyway, adding this to Daniel's version:
>>>>>>>> > > https://gist.github.com/KristofferC/c19c0ccd867fe44700bd makes it
>>>>>>>> > > run in 0.13 seconds for me.
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> >
>>>>>>>> > Interesting. For me that change only makes a 10-20% improvement. On
>>>>>>>> > my laptop the program takes about 1.5s, which is similar to Adam's. So
>>>>>>>> > I guess we are running on similar hardware and you are probably using
>>>>>>>> > a faster desktop. In any case, I added the change and updated the
>>>>>>>> > repository:
>>>>>>>> >
>>>>>>>> > https://github.com/dcarrera/sim
>>>>>>>> >
>>>>>>>> > Is there a good way to profile Julia code? So far I have been
>>>>>>>> > profiling by inserting tic() and toc() lines everywhere. On my
>>>>>>>> > computer @profile seems to do the same thing as @time, so it's kind
>>>>>>>> > of useless if I want to find the hot spots in a program.
>>>>>>>> Sure:
>>>>>>>> http://julia.readthedocs.org/en/latest/manual/profile/
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
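
P.S. For anyone reading this thread later, the copy-versus-view question discussed above can be sketched roughly as follows. This is only a minimal sketch, and it assumes a current Julia (1.x) rather than the 0.3/0.4 releases used in the thread; as far as I know, `slice()` was later superseded by `view()`. The array `b`, its sizes, and the helper `count_below` are hypothetical stand-ins and are not taken from the sim repository. In current Julia it is the range indexing that makes a copy, while `reshape()` shares memory with its argument.

```julia
# Hypothetical stand-in for sim.b, laid out as (d, time, i, component).
b = rand(4, 1000, 5, 3)
d, t, i = 2, 1000, 3

# Range indexing copies the selected elements into a new contiguous Vector.
b_hist_3_copy = b[d, 1:t, i, 3]

# view() returns a SubArray that refers to b's memory; consecutive elements
# sit size(b, 1) slots apart in b because Julia arrays are column-major.
b_hist_3_view = view(b, d, 1:t, i, 3)

# reshape() does not copy either: it shares memory with its argument.
tmp = b[d, 1:t, i, :]          # t-by-3 copy made by the indexing
b_hist = reshape(tmp, t, 3)    # same data as tmp

# Hypothetical hot loop mirroring the `if o < b_hist_3[j]` test.
function count_below(o, v)
    n = 0
    for j in eachindex(v)
        if o < v[j]
            n += 1
        end
    end
    return n
end

o = 0.5
count_below(o, b_hist_3_copy); count_below(o, b_hist_3_view)  # warm up the JIT
@time count_below(o, b_hist_3_copy)
@time count_below(o, b_hist_3_view)
```

Which of the two is faster will depend on the Julia version, the array sizes, and the hardware, so it is worth timing both on your own machine, as the differing results in this thread suggest.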
