I just tried it. You are right. Using slice() makes the code 9x slower for
me.
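
For anyone reading this later: as far as I know, slice() and sub() were
merged into view() in Julia 0.5, so on a current version the comparison
looks like the sketch below. The array `b` and its dimensions are made-up
stand-ins for sim.b, not the real simulation data.

```julia
# Hypothetical stand-in for sim.b; the dimensions are invented for illustration.
b = rand(4, 1000, 5, 3)
d, t, i = 2, 800, 3

# Indexing with a range allocates a new array and copies the data into it.
copied = b[d, 1:t, i, 3]

# view() (the successor of slice()) returns a SubArray that refers to b's memory.
viewed = view(b, d, 1:t, i, 3)

# Both hold the same values at this point...
@assert copied == viewed

# ...but only the view sees later changes to b.
b[d, 1, i, 3] = -1.0
@assert viewed[1] == -1.0
@assert copied[1] != -1.0
```

The view avoids the allocation, but here it walks through b with a stride,
while the copy is one dense vector. Which of the two wins in a hot loop
depends on the Julia version and the access pattern, so it is worth timing
both.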

On 20 September 2015 at 23:14, Kristoffer Carlsson <[email protected]>
wrote:

> Oh, if you are on 0.3 I am not sure if slice exists. If it does, it is
> really slow.
>
> On Sunday, September 20, 2015 at 11:13:21 PM UTC+2, Kristoffer Carlsson
> wrote:
>>
>> For me, your latest changes made the time go from 0.13 -> 0.11. It is
>> strange that our timings differ so much, but then again 0.3 and 0.4 are
>> different beasts.
>>
>> Adding some calls to slice and another loop gained some perf for me. Can
>> you try:
>>
>> https://gist.github.com/KristofferC/8a8ff33cb186183eea8d
>>
>> On Sunday, September 20, 2015 at 9:36:20 PM UTC+2, Daniel Carrera wrote:
>>>
>>> Just another note:
>>>
>>> I suspect that the `reshape()` might be the guilty party. I am just
>>> guessing here, but I suspect that the reshape() forces a memory copy, while
>>> a regular slice just creates a kind of symlink to the original data.
>>> Furthermore, I suspect that the memory copy would mean that when you try to
>>> read from the newly created variable, you have to fetch it from RAM,
>>> despite the fact that the CPU cache already has a perfectly good copy of
>>> the same data.
>>>
>>> Cheers,
>>> Daniel.
>>>
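
One way to check the guess above: as far as I know, in Julia it is the
range indexing (as in sim.b[d, 1:t, i, :]) that allocates and copies, while
reshape() of an Array reuses the same memory. A small sketch with a made-up
vector rather than the simulation data:

```julia
a = collect(1.0:6.0)        # made-up data, 6 elements

r = reshape(a, 2, 3)        # same memory, presented as a 2x3 matrix
r[1, 1] = 100.0
@assert a[1] == 100.0       # the write through r is visible in a

s = a[1:3]                  # range indexing makes an independent copy
s[1] = -1.0
@assert a[1] == 100.0       # a is unaffected by the write to s
```

If that holds on your version too, the copy in the original line would come
from the indexing rather than from reshape() itself.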
>>>
>>> On 20 September 2015 at 21:25, Daniel Carrera <[email protected]> wrote:
>>>
>>>> Whoo hoo! It looks like I got another ~6x or ~7x improvement. Using
>>>> Profile.print() I found that the hottest parts of the code appeared to be
>>>> the if-conditions, such as:
>>>>
>>>>         if o < b_hist[j,3]
>>>>
>>>> It occurred to me that this could be due to cache misses, so I rewrote
>>>> the code to store the data more compactly:
>>>>
>>>> -  b_hist = reshape(sim.b[d, 1:t, i, :], t, 3)
>>>> +  b_hist_1 = sim.b[d, 1:t, i, 1]
>>>> +  b_hist_3 = sim.b[d, 1:t, i, 3]
>>>> ...
>>>> -        if o < b_hist[j,3]
>>>> +        if o < b_hist_3[j]
>>>>
>>>>
>>>> So, instead of a 3xN array, I store two 1xN arrays with the data I
>>>> actually want. I suspect that the biggest improvement is not that there is
>>>> 1/3 less data, but that the data just gets managed differently. The upshot
>>>> is that now the program runs 208 times faster for me than it did initially.
>>>> For me the execution time went from 45s to 0.2s.
>>>>
>>>> As always, the code is updated on Github:
>>>>
>>>> https://github.com/dcarrera/sim
>>>>
>>>> Cheers,
>>>> Daniel.
>>>>
>>>>
>>>>
>>>> On 20 September 2015 at 20:51, Seth <[email protected]> wrote:
>>>>
>>>>> As an interim step, you can also get text profiling information using
>>>>> Profile.print() if the graphics aren't working.
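
A minimal sketch of that text workflow, written for a current Julia (where
Profile is a standard library that has to be loaded). The function `work`
is a hypothetical stand-in for the simulation's hot loop. Note that
@profile on its own only runs the expression and records samples, so it can
look just like @time; the report comes from Profile.print().

```julia
using Profile   # needed on Julia 0.7 and later; on 0.3/0.4 @profile is in Base

# Hypothetical stand-in for the simulation's hot loop.
function work(n)
    s = 0.0
    for k in 1:n
        s += sqrt(k)
    end
    return s
end

work(10)                 # run once first so compilation is not profiled
Profile.clear()          # drop any samples from earlier runs
@profile work(10^8)      # collect samples while the call runs
Profile.print()          # text report, no graphics needed
```

The lines with the highest sample counts in the report are the hot spots.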
>>>>>
>>>>> On Sunday, September 20, 2015 at 11:35:35 AM UTC-7, Daniel Carrera
>>>>> wrote:
>>>>>>
>>>>>> Hmm... ProfileView gives me an error:
>>>>>>
>>>>>> ERROR: panzoom not defined
>>>>>>  in view at /home/daniel/.julia/v0.3/ProfileView/src/ProfileViewGtk.jl:32
>>>>>>  in view at /home/daniel/.julia/v0.3/ProfileView/src/ProfileView.jl:51
>>>>>>  in include at ./boot.jl:245
>>>>>>  in include_from_node1 at ./loading.jl:128
>>>>>> while loading /home/daniel/Projects/optimization/run_sim.jl, in expression starting on line 55
>>>>>>
>>>>>> Do I need to update something?
>>>>>>
>>>>>> Cheers,
>>>>>> Daniel.
>>>>>>
>>>>>> On 20 September 2015 at 20:28, Kristoffer Carlsson <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> https://github.com/timholy/ProfileView.jl is invaluable for
>>>>>>> performance tweaking.
>>>>>>>
>>>>>>> Are you on 0.4?
>>>>>>>
>>>>>>> On Sunday, September 20, 2015 at 8:26:08 PM UTC+2, Milan
>>>>>>> Bouchet-Valat wrote:
>>>>>>>>
>>>>>>>> On Sunday, 20 September 2015 at 20:22 +0200, Daniel Carrera wrote:
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On 20 September 2015 at 19:43, Kristoffer Carlsson <
>>>>>>>> > [email protected]> wrote:
>>>>>>>> > > Did you run the code twice to not time the JIT compiler?
>>>>>>>> > >
>>>>>>>> > > For me, my version runs in 0.24 and Daniel's in 0.34.
>>>>>>>> > >
>>>>>>>> > > Anyway, adding this to Daniel's version:
>>>>>>>> > > https://gist.github.com/KristofferC/c19c0ccd867fe44700bd makes it
>>>>>>>> > > run in 0.13 seconds for me.
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> >
>>>>>>>> > Interesting. For me that change only makes a 10-20% improvement. On
>>>>>>>> > my laptop the program takes about 1.5s which is similar to Adam's. So
>>>>>>>> > I guess we are running on similar hardware and you are probably using
>>>>>>>> > a faster desktop. In any case, I added the change and updated the
>>>>>>>> > repository:
>>>>>>>> >
>>>>>>>> > https://github.com/dcarrera/sim
>>>>>>>> >
>>>>>>>> > Is there a good way to profile Julia code? So far I have been profiling
>>>>>>>> > by inserting tic() and toc() lines everywhere. On my computer
>>>>>>>> > @profile seems to do the same thing as @time, so it's kind of useless
>>>>>>>> > if I want to find the hot spots in a program.
>>>>>>>> Sure: http://julia.readthedocs.org/en/latest/manual/profile/
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
