# Re: [julia-users] Re: Any 0.5 performance tips?

`Good detective work, please post your code in the issue!`
```
On Fri, 2016-10-14 at 19:51, Andrew <owen...@gmail.com> wrote:
> I've found the main problem. I have a function which repeatedly accesses a
> 6-dimensional array in a loop. This function took no time in 0.4, but is
> actually very slow in 0.5. My problem looks very similar to 18774
> <https://github.com/JuliaLang/julia/issues/18774>.
> Here's an example:
>
> A3 = rand(10, 10, 10);
> function test3(A, nx1, nx2, nx3)
>   for i = 1:10_000_000
>     A[nx1, nx2, nx3]
>   end
> end
>
> A5 = rand(10, 10, 10, 10, 10);
> function test5(A, nx1, nx2, nx3, nx4, nx5)
>   for i = 1:10_000_000
>     A[nx1, nx2, nx3, nx4, nx5]
>   end
> end
>
> A6 = rand(10, 10, 10, 10, 10, 10);
> function test6(A, nx1, nx2, nx3, nx4, nx5, nx6)
>   for i = 1:10_000_000
>     A[nx1, nx2, nx3, nx4, nx5, nx6]
>   end
> end
> function test6_fast(A, nx1, nx2, nx3, nx4, nx5, nx6)
>   Asize = size(A)
>   for i = 1:10_000_000
>     A[sub2ind(Asize, nx1, nx2, nx3, nx4, nx5, nx6 )]
>   end
> end
> @time test3(A3, 1, 1, 1)
> @time test5(A5, 1, 1, 1, 1, 1)
> @time test6(A6, 1, 1, 1, 1, 1, 1)
> @time test6_fast(A6, 1, 1, 1, 1, 1, 1)
>
>
> test6 takes 0.01s in 0.4 and takes 15s in 0.5. Using a linear index fixes
> the problem.
>
> On Friday, September 30, 2016 at 2:30:02 AM UTC-4, Mauro wrote:
>>
>> On Fri, 2016-09-30 at 03:45, Andrew <owe...@gmail.com <javascript:>>
>> wrote:
>> > I checked, and my objective function is evaluated exactly as many times
>> > under 0.4 as it is under 0.5. The number of iterations must be the same.
>> >
>> > I also looked at the times more precisely. For one particular function
>> call
>> > in the code, I have:
>> >
>> > 0.4 with old code: 6.7s 18.5M allocations
>> > 0.4 with 0.5 style code(regular anonymous functions) 11.6s, 141M
>> > allocations
>> > 0.5: 36.2s, 189M allocations
>> >
>> > Surprisingly, 0.4 is still much faster even without the fast anonymous
>> > functions trick. It doesn't look like 0.5 is generating many more
>> > allocations than 0.4 on the same code, the time is just a lot slower.
>>
>> Sounds like your not far off a minimal, working example.  Post it and
>> I'm sure it will be dissected in no time. (And an issue can be filed).
>>
>> > On Thursday, September 29, 2016 at 3:36:46 PM UTC-4, Tim Holy wrote:
>> >>
>> >> No real clue about what's happening, but my immediate thought was that
>> if
>> >> your algorithm is iterative and uses some kind of threshold to decide
>> >> convergence, then it seems possible that a change in the accuracy of
>> some
>> >> computation might lead to it getting "stuck" occasionally due to
>> roundoff
>> >> error. That's probably more likely to happen because of some kind of
>> >> worsening rather than some improvement, but either is conceivable.
>> >>
>> >> If that's even a possible explanation, I'd check for unusually-large
>> >> numbers of iterations and then print some kind of convergence info.
>> >>
>> >> Best,
>> >> --Tim
>> >>
>> >> On Thu, Sep 29, 2016 at 1:21 PM, Andrew <owe...@gmail.com
>> <javascript:>>
>> >> wrote:
>> >>
>> >>> In the 0.4 version the above times are pretty consistent. I never
>> observe
>> >>> any several thousand allocation calls. I wonder if compilation is
>> occurring
>> >>> repeatedly.
>> >>>
>> >>> This isn't terribly pressing for me since I'm not currently working on
>> >>> this project, but if there's an easy fix it would be useful for future
>> work.
>> >>>
>> >>> (sorry I didn't mean to post twice. For some reason hitting spacebar
>> was
>> >>> interpreted as the post command?)
>> >>>
>> >>>
>> >>> On Thursday, September 29, 2016 at 2:15:35 PM UTC-4, Andrew wrote:
>> >>>>
>> >>>> I've used @code_warntype everywhere I can think to and I've only
>> found
>> >>>> one Core.box. The @code_warntype looks like this
>> >>>>
>> >>>> Variables:
>> >>>>   #self#::#innerloop#3133{#bellman_obj}
>> >>>>   state::State{IdioState,AggState}
>> >>>>   EVspline::Dierckx.Spline1D
>> >>>>   model::Model{CRRA_Family,AggState}
>> >>>>   policy::PolicyFunctions{Array{Float64,6},Array{Int64,6}}
>> >>>>   OO::NW
>> >>>>
>> >>>>
>> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>>
>> >>>>
>> >>>> Body:
>> >>>>   begin
>> >>>>
>> >>>>
>> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>>
>> >>>> = \$(Expr(:new,
>> >>>>
>> ##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj},
>>
>> >>>> :(state), :(EVspline), :(model), :(policy), :(OO),
>> >>>> :((Core.getfield)(#self#,:bellman_obj)::#bellman_obj)))
>> >>>>       SSAValue(0) =
>> >>>>
>> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>>
>> >>>>
>> >>>>
>> (Core.setfield!)((Core.getfield)(#self#::#innerloop#3133{#bellman_obj},:obj)::CORE.BOX,:contents,SSAValue(0))::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>>
>> >>>>       return SSAValue(0)
>> >>>>
>> >>>>
>> end::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>>
>> >>>>
>> >>>>
>> >>>> I put the CORE.BOX in all caps near the bottom.
>> >>>>
>> >>>> I have no idea if this is actually a problem. The return type is
>> stable.
>> >>>> Also, this function appears in an outer loop.
>> >>>>
>> >>>> What I noticed putting a @time in places is that in 0.5, occasionally
>> >>>> calls to my nonlinear equation solver take a really long time, like
>> here:
>> >>>>
>> >>>>   0.069224 seconds (9.62 k allocations: 487.873 KB)
>> >>>>   0.000007 seconds (39 allocations: 1.922 KB)
>> >>>>   0.000006 seconds (29 allocations: 1.391 KB)
>> >>>>   0.000011 seconds (74 allocations: 3.781 KB)
>> >>>>   0.000009 seconds (54 allocations: 2.719 KB)
>> >>>>   0.000008 seconds (54 allocations: 2.719 KB)
>> >>>>   0.000008 seconds (49 allocations: 2.453 KB)
>> >>>>   0.000007 seconds (44 allocations: 2.188 KB)
>> >>>>   0.000007 seconds (44 allocations: 2.188 KB)
>> >>>>   0.000006 seconds (39 allocations: 1.922 KB)
>> >>>>   0.000007 seconds (39 allocations: 1.922 KB)
>> >>>>   0.000006 seconds (39 allocations: 1.922 KB)
>> >>>>   0.000005 seconds (34 allocations: 1.656 KB)
>> >>>>   0.000005 seconds (34 allocations: 1.656 KB)
>> >>>>   0.000004 seconds (29 allocations: 1.391 KB)
>> >>>>   0.000004 seconds (24 allocations: 1.125 KB)
>> >>>>   0.007399 seconds (248 allocations: 15.453 KB)
>> >>>>   0.000009 seconds (30 allocations: 1.594 KB)
>> >>>>   0.000004 seconds (25 allocations: 1.328 KB)
>> >>>>   0.000004 seconds (25 allocations: 1.328 KB)
>> >>>>
>> >>>>   0.000010 seconds (70 allocations: 3.719 KB)
>> >>>>   0.072703 seconds (41.74 k allocations: 1.615 MB)
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Thursday, September 29, 2016 at 1:37:18 AM UTC-4, Kristoffer
>> Carlsson
>> >>>> wrote:
>> >>>>>
>> >>>>> Look for Core.Box in @code_warntype. See
>> >>>>> https://github.com/JuliaLang/julia/issues/15276
>> >>>>
>> >>>>
>> >>
>>
```