Oh, if you are on 0.3, I am not sure `slice` exists there. If it does, it is 
really slow.
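
In case it helps, here is a rough sketch of the copy-versus-view difference 
being discussed. It is written for current Julia (1.x), where `view` took 
over the role that `slice` had in 0.4. The array `b` and its dimensions are 
made up for illustration and are not the real `sim.b`.

```julia
# Hypothetical stand-in for sim.b; the sizes are assumptions, not the real data.
b = rand(4, 1000, 5, 3)
d, t, i = 1, 500, 2

col_copy = b[d, 1:t, i, 3]        # indexing with a range allocates a new array
col_view = view(b, d, 1:t, i, 3)  # a view shares memory with b, no copy

b[d, 1, i, 3] = -1.0              # write to the parent array
@assert col_view[1] == -1.0       # the view sees the write
@assert col_copy[1] != -1.0       # the copy does not (rand is in [0, 1))

# reshape of a plain Array also shares memory with its argument
r = reshape(col_copy, t, 1)
r[2, 1] = 7.0
@assert col_copy[2] == 7.0
```

So in this sketch the allocation comes from the range indexing, and the 
`reshape` on top of it does not add another copy.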

On Sunday, September 20, 2015 at 11:13:21 PM UTC+2, Kristoffer Carlsson 
wrote:
>
> For me, your latest changes made the time go from 0.13 s to 0.11 s. It is 
> strange that our timings are so different, but then again 0.3 and 0.4 are 
> different beasts.
>
> Adding some calls to slice and another loop gained some perf for me. Can 
> you try:
>
> https://gist.github.com/KristofferC/8a8ff33cb186183eea8d
>
> On Sunday, September 20, 2015 at 9:36:20 PM UTC+2, Daniel Carrera wrote:
>>
>> Just another note:
>>
>> I suspect that the `reshape()` might be the guilty party. I am just 
>> guessing here, but I suspect that the `reshape()` forces a memory copy, 
>> while a regular slice just creates a kind of symlink to the original data. 
>> Furthermore, I suspect that the memory copy means that when you try to 
>> read from the newly created variable, you have to fetch it from RAM, 
>> despite the fact that the CPU cache already has a perfectly good copy of 
>> the same data.
>>
>> Cheers,
>> Daniel.
>>
>>
>> On 20 September 2015 at 21:25, Daniel Carrera <[email protected]> wrote:
>>
>>> Whoo hoo! It looks like I got another ~6x or ~7x improvement. Using 
>>> Profile.print() I found that the hottest parts of the code appeared to be 
>>> the if-conditions, such as:
>>>
>>>         if o < b_hist[j,3]
>>>
>>> It occurred to me that this could be due to cache misses, so I rewrote 
>>> the code to store the data more compactly:
>>>
>>> -  b_hist = reshape(sim.b[d, 1:t, i, :], t, 3)
>>> +  b_hist_1 = sim.b[d, 1:t, i, 1]
>>> +  b_hist_3 = sim.b[d, 1:t, i, 3]
>>> ...
>>> -        if o < b_hist[j,3]
>>> +        if o < b_hist_3[j]
>>>
>>>
>>> So, instead of an Nx3 array, I store two 1xN arrays with the data I 
>>> actually want. I suspect that the biggest improvement is not that there is 
>>> 1/3 less data, but that the data just gets managed differently. The upshot 
>>> is that the program now runs 208 times faster for me than it did initially. 
>>> For me, execution time went from 45s to 0.2s.
>>>
>>> As always, the code is updated on Github:
>>>
>>> https://github.com/dcarrera/sim
>>>
>>> Cheers,
>>> Daniel.
>>>
>>>
>>>
>>> On 20 September 2015 at 20:51, Seth <[email protected]> wrote:
>>>
>>>> As an interim step, you can also get text profiling information using 
>>>> Profile.print() if the graphics aren't working.
>>>>
>>>> On Sunday, September 20, 2015 at 11:35:35 AM UTC-7, Daniel Carrera 
>>>> wrote:
>>>>>
>>>>> Hmm... ProfileView gives me an error:
>>>>>
>>>>> ERROR: panzoom not defined
>>>>>  in view at 
>>>>> /home/daniel/.julia/v0.3/ProfileView/src/ProfileViewGtk.jl:32
>>>>>  in view at /home/daniel/.julia/v0.3/ProfileView/src/ProfileView.jl:51
>>>>>  in include at ./boot.jl:245
>>>>>  in include_from_node1 at ./loading.jl:128
>>>>> while loading /home/daniel/Projects/optimization/run_sim.jl, in 
>>>>> expression starting on line 55
>>>>>
>>>>> Do I need to update something?
>>>>>
>>>>> Cheers,
>>>>> Daniel.
>>>>>
>>>>> On 20 September 2015 at 20:28, Kristoffer Carlsson <[email protected]> wrote:
>>>>>
>>>>>> https://github.com/timholy/ProfileView.jl is invaluable for 
>>>>>> performance tweaking.
>>>>>>
>>>>>> Are you on 0.4?
>>>>>>
>>>>>> On Sunday, September 20, 2015 at 8:26:08 PM UTC+2, Milan 
>>>>>> Bouchet-Valat wrote:
>>>>>>>
>>>>>>> On Sunday, 20 September 2015 at 20:22 +0200, Daniel Carrera wrote: 
>>>>>>> > 
>>>>>>> > 
>>>>>>> > On 20 September 2015 at 19:43, Kristoffer Carlsson <[email protected]> wrote: 
>>>>>>> > > Did you run the code twice so that you are not timing the JIT compiler? 
>>>>>>> > > 
>>>>>>> > > For me, my version runs in 0.24 seconds and Daniel's in 0.34. 
>>>>>>> > > 
>>>>>>> > > Anyway, adding this to Daniel's version: 
>>>>>>> > > https://gist.github.com/KristofferC/c19c0ccd867fe44700bd 
>>>>>>> > > makes it run in 0.13 seconds for me. 
>>>>>>> > > 
>>>>>>> > > 
>>>>>>> > 
>>>>>>> > Interesting. For me that change only makes a 10-20% improvement. 
>>>>>>> > On my laptop the program takes about 1.5s, which is similar to 
>>>>>>> > Adam's. So I guess we are running on similar hardware and you are 
>>>>>>> > probably using a faster desktop. In any case, I added the change 
>>>>>>> > and updated the repository: 
>>>>>>> > 
>>>>>>> > https://github.com/dcarrera/sim 
>>>>>>> > 
>>>>>>> > Is there a good way to profile Julia code? So far I have been 
>>>>>>> > profiling by inserting tic() and toc() lines everywhere. On my 
>>>>>>> > computer @profile seems to do the same thing as @time, so it's 
>>>>>>> > kind of useless if I want to find the hot spots in a program. 
>>>>>>> Sure: 
>>>>>>> http://julia.readthedocs.org/en/latest/manual/profile/ 
>>>>>>>
>>>>>>>
>>>>>>> Regards 
>>>>>>>
>>>>>>
>>>>>
>>>
>>
