Hi, I've got a performance issue that I really can't seem to understand.


This is the inside of my main loop. I'm trying to fill up my alpha variables
(there are some outer loops over thp, Ap, Bp, tp1):


    tmp_1 = zero(Float64)
    tmp_2 = zero(Float64)
    tmp_3 = zero(Float64)
    tmp_4 = zero(Float64)

    for thp = 1:50
        tmp_3 = 0.0
        tmp_4 = rand()   # only used for the timing comparison below
        for B = 1:40
            tmp_2 = 0.0
            for A = 1:40
                tmp_1 = 0.0
                # innermost sum over th
                @simd for th = 1:50
                    @inbounds tmp_1 += alpha_t[th,A,B] * tr_1[th,thp]
                end
                @inbounds tmp_2 += tmp_1 * tr_2[A,Ap]
            end
            @inbounds tmp_3 += tmp_2 * tr_3[B,Bp]
        end
        # the line the profiler points at
        @inbounds alpha[thp,Ap,Bp,tp1] = em_1[thp,Ap,Bp,obs[tp1]] * tmp_3
    end


This is pretty slow: it takes 3.8s on my computer. The profiler tells me
that the last line (the assignment to alpha) takes a lot of time.


    
The weird thing is that if I replace tmp_3 in that line by tmp_4 or rand(),
the time goes down to 0.004s.

I really don't get what's going on. All my arrays are annotated as Float64.
Any ideas? I'm on version 0.3.2.
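
In case it helps to reproduce this, here is a minimal self-contained sketch of the loop above. The function name fill_alpha!, its argument list, and all array sizes are assumptions picked to match the loop bounds (50 x 40 x 40); they are not the real code or data.

```julia
# Hypothetical wrapper around the loop from the post; the name and the
# argument order are made up for this sketch.
function fill_alpha!(alpha, alpha_t, tr_1, tr_2, tr_3, em_1, obs, Ap, Bp, tp1)
    for thp = 1:size(alpha, 1)
        tmp_3 = 0.0
        for B = 1:size(alpha_t, 3)
            tmp_2 = 0.0
            for A = 1:size(alpha_t, 2)
                tmp_1 = 0.0
                @simd for th = 1:size(alpha_t, 1)
                    @inbounds tmp_1 += alpha_t[th, A, B] * tr_1[th, thp]
                end
                @inbounds tmp_2 += tmp_1 * tr_2[A, Ap]
            end
            @inbounds tmp_3 += tmp_2 * tr_3[B, Bp]
        end
        @inbounds alpha[thp, Ap, Bp, tp1] = em_1[thp, Ap, Bp, obs[tp1]] * tmp_3
    end
    return alpha
end

# Made-up test data; only the shapes follow the loop bounds in the post.
alpha_t = rand(50, 40, 40)
tr_1 = rand(50, 50)
tr_2 = rand(40, 40)
tr_3 = rand(40, 40)
em_1 = rand(50, 40, 40, 3)
obs = rand(1:3, 5)
alpha = zeros(50, 40, 40, 5)

# First call compiles the function; the second call is the one to time.
fill_alpha!(alpha, alpha_t, tr_1, tr_2, tr_3, em_1, obs, 1, 1, 1)
@time fill_alpha!(alpha, alpha_t, tr_1, tr_2, tr_3, em_1, obs, 1, 1, 1)
```

Wrapping the loop in a function and passing the arrays as arguments keeps non-constant globals out of the timing, so the tmp_3 and tmp_4 variants can be compared on equal footing by editing only the last assignment.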

Thanks!
