Hi, I've got a performance issue that I really can't seem to understand.
This is the inside of my main loop, where I'm trying to fill up my alpha
variables (there are some outer loops over thp, Ap, Bp, tp1):
tmp_1 = zero(Float64)
tmp_2 = zero(Float64)
tmp_3 = zero(Float64)
tmp_4 = zero(Float64)
for thp = 1:50
    tmp_3 = 0.0
    tmp_4 = rand()
    for B = 1:40
        tmp_2 = 0.0
        for A = 1:40
            tmp_1 = 0.0
            @simd for th = 1:50
                @inbounds tmp_1 += alpha_t[th,A,B] * tr_1[th,thp]
            end
            @inbounds tmp_2 += tmp_1 * tr_2[A,Ap]
        end
        @inbounds tmp_3 += tmp_2 * tr_3[B,Bp]
    end
    @inbounds alpha[thp,Ap,Bp,tp1] = em_1[thp,Ap,Bp,obs[tp1]] * tmp_3
end
This is pretty slow: it takes 3.8s on my computer. The profiler tells me
that the last line (the assignment to alpha) takes a lot of time.
The weird thing is that if I replace tmp_3 by tmp_4 or rand(), the time gets
down to 0.004s.
I really don't get what's going on. All my arrays are annotated as Float64.
Any ideas? I'm on version 0.3.2.
Thanks!
