Adding your loop increased performance by another 20%. Good stuff!
Changes uploaded to Github.

Cheers,
Daniel.

On 20 September 2015 at 23:13, Kristoffer Carlsson <[email protected]>
wrote:

> For me, your latest changes made the time go from 0.13 -> 0.11. It is
> strange that we see such different performance, but then again 0.3 and 0.4
> are different beasts.
>
> Adding some calls to slice and another loop gained some perf for me. Can
> you try:
>
> https://gist.github.com/KristofferC/8a8ff33cb186183eea8d
>
> On Sunday, September 20, 2015 at 9:36:20 PM UTC+2, Daniel Carrera wrote:
>>
>> Just another note:
>>
>> I suspect that the `reshape()` might be the guilty party. I am just
>> guessing here, but I suspect that the reshape() forces a memory copy, while
>> a regular slice just creates a kind of symlink to the original data.
>> Furthermore, I suspect that the memory copy would mean that when you try to
>> read from the newly created variable, you have to fetch it from RAM,
>> despite the fact that the CPU cache already has a perfectly good copy of
>> the same data.
>>
>> Cheers,
>> Daniel.
>>
>>
>> On 20 September 2015 at 21:25, Daniel Carrera <[email protected]> wrote:
>>
>>> Whoo hoo! It looks like I got another ~6x or ~7x improvement. Using
>>> Profile.print() I found that the hottest parts of the code appeared to be
>>> the if-conditions, such as:
>>>
>>>         if o < b_hist[j,3]
>>>
>>> It occurred to me that this could be due to cache misses, so I rewrote
>>> the code to store the data more compactly:
>>>
>>> -  b_hist = reshape(sim.b[d, 1:t, i, :], t, 3)
>>> +  b_hist_1 = sim.b[d, 1:t, i, 1]
>>> +  b_hist_3 = sim.b[d, 1:t, i, 3]
>>> ...
>>> -        if o < b_hist[j,3]
>>> +        if o < b_hist_3[j]
>>>
>>>
>>> So, instead of an Nx3 array, I store two 1xN arrays with the data I
>>> actually want. I suspect that the biggest improvement is not that there is
>>> 1/3 less data, but that the data just gets managed differently. The upshot
>>> is that now the program runs 208 times faster for me than it did initially.
>>> For me the execution time went from 45s to 0.2s.
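
A minimal sketch of the layout difference described above, written for a
current Julia (1.x) rather than the 0.3/0.4 releases used in this thread.
The array `b` and its sizes are hypothetical stand-ins for sim.b. It only
shows that a range-indexed copy is contiguous while a view of the same
elements is strided; it does not measure cache behaviour.

```julia
# Hypothetical stand-in for sim.b; the sizes are assumptions.
b = rand(4, 1000, 5, 3)
d, t, i = 2, 1000, 3

# Range indexing copies the selected elements into a new Vector,
# so the values sit next to each other in memory.
b_hist_3 = b[d, 1:t, i, 3]

# A view shares memory with `b`. Julia arrays are column-major, so
# stepping along dimension 2 jumps size(b, 1) elements each time.
b_view_3 = view(b, d, 1:t, i, 3)

println(strides(b_hist_3))   # memory stride of the copy
println(strides(b_view_3))   # memory stride of the view
```

Both variables hold the same values; only the distance between
consecutive elements in memory differs.
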
>>>
>>> As always, the code is updated on Github:
>>>
>>> https://github.com/dcarrera/sim
>>>
>>> Cheers,
>>> Daniel.
>>>
>>>
>>>
>>> On 20 September 2015 at 20:51, Seth <[email protected]> wrote:
>>>
>>>> As an interim step, you can also get text profiling information using
>>>> Profile.print() if the graphics aren't working.
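
A minimal sketch of that text-only workflow, assuming a current Julia
(1.x), where `using Profile` is required first. The function `work` is a
hypothetical stand-in for the simulation. `@profile` only collects
samples; the report is printed when `Profile.print()` is called.

```julia
using Profile   # standard library on Julia 0.7 and later

# Hypothetical hot function standing in for the real simulation.
function work(n)
    s = 0.0
    for k in 1:n
        s += sqrt(k)
    end
    return s
end

work(10)              # run once so compilation is not profiled
Profile.clear()       # drop samples from any earlier run
@profile work(10^8)   # collect samples while the code runs
Profile.print()       # text report of where the samples landed
```
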
>>>>
>>>> On Sunday, September 20, 2015 at 11:35:35 AM UTC-7, Daniel Carrera
>>>> wrote:
>>>>>
>>>>> Hmm... ProfileView gives me an error:
>>>>>
>>>>> ERROR: panzoom not defined
>>>>>  in view at
>>>>> /home/daniel/.julia/v0.3/ProfileView/src/ProfileViewGtk.jl:32
>>>>>  in view at /home/daniel/.julia/v0.3/ProfileView/src/ProfileView.jl:51
>>>>>  in include at ./boot.jl:245
>>>>>  in include_from_node1 at ./loading.jl:128
>>>>> while loading /home/daniel/Projects/optimization/run_sim.jl, in
>>>>> expression starting on line 55
>>>>>
>>>>> Do I need to update something?
>>>>>
>>>>> Cheers,
>>>>> Daniel.
>>>>>
>>>>> On 20 September 2015 at 20:28, Kristoffer Carlsson <[email protected]> wrote:
>>>>>
>>>>>> https://github.com/timholy/ProfileView.jl is invaluable for
>>>>>> performance tweaking.
>>>>>>
>>>>>> Are you on 0.4?
>>>>>>
>>>>>> On Sunday, September 20, 2015 at 8:26:08 PM UTC+2, Milan
>>>>>> Bouchet-Valat wrote:
>>>>>>>
>>>>>>> On Sunday, 20 September 2015 at 20:22 +0200, Daniel Carrera wrote:
>>>>>>> >
>>>>>>> >
>>>>>>> > On 20 September 2015 at 19:43, Kristoffer Carlsson <
>>>>>>> > [email protected]> wrote:
>>>>>>> > > Did you run the code twice to not time the JIT compiler?
>>>>>>> > >
>>>>>>> > > For me, my version runs in 0.24 and Daniel's in 0.34.
>>>>>>> > >
>>>>>>> > > Anyway, adding this to Daniel's version:
>>>>>>> > > https://gist.github.com/KristofferC/c19c0ccd867fe44700bd
>>>>>>> > > makes it run in 0.13 seconds for me.
>>>>>>> > >
>>>>>>> > >
>>>>>>> >
>>>>>>> > Interesting. For me that change only makes a 10-20% improvement.
>>>>>>> > On my laptop the program takes about 1.5s, which is similar to
>>>>>>> > Adam's. So I guess we are running on similar hardware and you are
>>>>>>> > probably using a faster desktop. In any case, I added the change
>>>>>>> > and updated the repository:
>>>>>>> >
>>>>>>> > https://github.com/dcarrera/sim
>>>>>>> >
>>>>>>> > Is there a good way to profile Julia code? So far I have been
>>>>>>> > profiling by inserting tic() and toc() lines everywhere. On my
>>>>>>> > computer @profile seems to do the same thing as @time, so it's
>>>>>>> > kind of useless if I want to find the hot spots in a program.
>>>>>>> Sure:
>>>>>>> http://julia.readthedocs.org/en/latest/manual/profile/
>>>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>
>>>>>
>>>
>>
