Hi Tim,

On Tuesday, April 28, 2015 at 2:53:45 PM UTC+1, Tim Holy wrote:
>
> Before deciding that the compiler is the answer...profile. Where is the 
> bottleneck? 
>
>
Well, the code now runs quite fast (about twice the run time of my Fortran 
version) after following the suggestions made in this thread. Basically 
there is only one function in the code, so the bottleneck has to be there 
:-), but I'm not sure I can do anything else to improve its performance. 
The relevant part of the code is:

const T = zeros(Float64,NX,NY,NZ)
const RHS = zeros(Float64,NX,NY,NZ)

[...]

function main_loop()
    for n = 0:NT-1
        @inbounds for k=2:NZ-1, j=2:NY-1, i=2:NX-1
            RHS[i,j,k] = dt*A*( (T[i-1,j,k]-2*T[i,j,k]+T[i+1,j,k])/dx2  +
                                (T[i,j-1,k]-2*T[i,j,k]+T[i,j+1,k])/dy2  +
                                (T[i,j,k-1]-2*T[i,j,k]+T[i,j,k+1])/dz2 )
        end

        @inbounds for k=2:NZ-1, j=2:NY-1, i=2:NX-1
            T[i,j,k] = T[i,j,k] + RHS[i,j,k]
        end
    end
end
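
In case it helps anyone reproduce this, here is a minimal self-contained 
sketch of the same update. The grid sizes, the constants, the function name 
step! and the single-hot-cell initial condition are placeholders made up 
for illustration, not the values from my real code. It also passes the 
arrays as arguments instead of relying on the const globals, which is only 
one possible way to organise it.

```julia
# Placeholder sizes and constants (assumptions, not the real values).
const NX = 16
const NY = 16
const NZ = 16
const NT = 10
const A = 0.1
const dt = 0.01
const dx2 = 1.0
const dy2 = 1.0
const dz2 = 1.0

# Advance T by nt explicit time steps; RHS is scratch storage of the same size.
# Boundary cells are never written, so they keep their initial values.
function step!(T, RHS, nt)
    nx, ny, nz = size(T)
    for n = 1:nt
        # Second-order central differences in each direction.
        @inbounds for k = 2:nz-1, j = 2:ny-1, i = 2:nx-1
            RHS[i,j,k] = dt*A*( (T[i-1,j,k] - 2*T[i,j,k] + T[i+1,j,k])/dx2 +
                                (T[i,j-1,k] - 2*T[i,j,k] + T[i,j+1,k])/dy2 +
                                (T[i,j,k-1] - 2*T[i,j,k] + T[i,j,k+1])/dz2 )
        end
        # Apply the increment only after the whole RHS has been computed.
        @inbounds for k = 2:nz-1, j = 2:ny-1, i = 2:nx-1
            T[i,j,k] += RHS[i,j,k]
        end
    end
    return T
end

T = zeros(Float64, NX, NY, NZ)
RHS = zeros(Float64, NX, NY, NZ)
T[8,8,8] = 1.0          # hypothetical initial condition: one hot cell
step!(T, RHS, NT)
```

Keeping RHS as a separate array means every cell is updated from the old 
values of its neighbours, which is why the two loops are not merged.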

Trying to get Julia compiled with the Intel compilers was just to see if I 
could squeeze a bit more performance out of it, but certainly I would also 
appreciate any suggestions on how to speed up my existing Julia code.

Thanks,
Ángel de Vicente
