It is definitely the slicing that is killing performance. Right now,
slicing is expensive because every slice expression allocates a new array
and copies the data into it.
Putting const in front of T and RHS and replacing the sliced expressions
with an explicit loop like this (there may be a mistake in it, but the
principle is what matters) makes the code 10x faster:
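
    # Compute the update from the old T (i innermost: arrays are column-major).
    @inbounds for k=2:NZ-1, j=2:NY-1, i=2:NX-1
        RHS[i,j,k] = dt*A*( (T[i-1,j,k]-2*T[i,j,k]+T[i+1,j,k])/dx2 +
                            (T[i,j-1,k]-2*T[i,j,k]+T[i,j+1,k])/dy2 +
                            (T[i,j,k-1]-2*T[i,j,k]+T[i,j,k+1])/dz2 )
    end
    # Apply it in a second pass so no stencil reads an already-updated neighbour.
    @inbounds for k=2:NZ-1, j=2:NY-1, i=2:NX-1
        T[i,j,k] += RHS[i,j,k]
    end
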
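
For anyone who wants to try this outside the original program, here is a minimal self-contained sketch of the same idea. It assumes a recent Julia (1.x syntax, not 0.3), and the names heat_step!, heat_step_sliced, the grid size and the constants are made up for illustration; they are not from the attached code. The loop is wrapped in a function that takes T and RHS as arguments, which is the other usual way (besides const) to keep non-constant globals out of the hot loop. The sliced version is only there as a reference to check that both give the same result.

```julia
# Illustrative sketch: function names, grid size and constants are assumptions.

# One explicit step on the interior points, using plain loops (no slicing).
function heat_step!(T, RHS, dt, A, dx2, dy2, dz2)
    NX, NY, NZ = size(T)
    # First pass: build the update from the old T. The last range in a
    # multi-range `for` is the innermost loop, so i varies fastest.
    @inbounds for k in 2:NZ-1, j in 2:NY-1, i in 2:NX-1
        RHS[i,j,k] = dt*A*( (T[i-1,j,k] - 2*T[i,j,k] + T[i+1,j,k])/dx2 +
                            (T[i,j-1,k] - 2*T[i,j,k] + T[i,j+1,k])/dy2 +
                            (T[i,j,k-1] - 2*T[i,j,k] + T[i,j,k+1])/dz2 )
    end
    # Second pass: apply the update in place.
    @inbounds for k in 2:NZ-1, j in 2:NY-1, i in 2:NX-1
        T[i,j,k] += RHS[i,j,k]
    end
    return T
end

# Reference version with slices: each T[a:b, c:d, e:f] allocates a copy.
function heat_step_sliced(T, dt, A, dx2, dy2, dz2)
    c = T[2:end-1, 2:end-1, 2:end-1]
    rhs = dt*A*( (T[1:end-2, 2:end-1, 2:end-1] .- 2 .* c .+ T[3:end, 2:end-1, 2:end-1]) ./ dx2 .+
                 (T[2:end-1, 1:end-2, 2:end-1] .- 2 .* c .+ T[2:end-1, 3:end, 2:end-1]) ./ dy2 .+
                 (T[2:end-1, 2:end-1, 1:end-2] .- 2 .* c .+ T[2:end-1, 2:end-1, 3:end]) ./ dz2 )
    Tnew = copy(T)
    Tnew[2:end-1, 2:end-1, 2:end-1] .+= rhs
    return Tnew
end

# Small deterministic test problem (values are arbitrary).
n = 16
T0 = [sin(i) + cos(j) * k for i in 1:n, j in 1:n, k in 1:n]
dt, A, dx2, dy2, dz2 = 1e-4, 1.0, 0.01, 0.01, 0.01

T_ref  = heat_step_sliced(T0, dt, A, dx2, dy2, dz2)
T_loop = heat_step!(copy(T0), zeros(size(T0)), dt, A, dx2, dy2, dz2)
```

Wrapping each call in @time (after a first warm-up call) should show the difference in allocations between the two versions; I have not included numbers since they depend on the machine and Julia version.
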
On Saturday, April 25, 2015 at 7:57:01 PM UTC+2, Johan Sigfrids wrote:
>
> I think it is all the slicing that is killing the performance. Maybe
> something like arrayviews or the new sub stuff on 0.4 would help.
> Alternatively devectorizing into a bunch of nested loops.
>
> On Saturday, April 25, 2015 at 8:42:09 PM UTC+3, Stefan Karpinski wrote:
>>
>> Stick const in front of T and RHS.
>>
>> On Sat, Apr 25, 2015 at 11:32 AM, Tim Holy wrote:
>>
>>> Did you read through
>>> http://docs.julialang.org/en/release-0.3/manual/performance-tips/?
>>> You should memorize :-) the sections up through the Tools section; the
>>> rest you can consult as you discover you need them.
>>>
>>> --Tim
>>>
>>> On Saturday, April 25, 2015 01:03:38 AM Ángel de Vicente wrote:
>>> > Hi,
>>> >
>>> > a complete Julia newbie here... I spent a couple of days learning the
>>> > syntax and main aspects of Julia, and since I heard many good things
>>> > about it, I decided to try a little program to see how it compares
>>> > against the other ones I regularly use: Fortran and Python.
>>> >
>>> > I wrote a minimal program to solve the 3D heat equation in a cube of
>>> > 100x100x100 points in the three languages and the time it takes to run
>>> > in each one is:
>>> >
>>> > Fortran: ~7s
>>> > Python: ~33s
>>> > Julia: ~80s
>>> >
>>> > The code runs for 1000 iterations, and I'm being nice to Julia, since
>>> > the programs in Fortran and Python write 100 HDF5 files with the
>>> > complete 100^3 data (every 10 iterations).
>>> >
>>> > I attach the code (and you can also get it at:
>>> > http://pastebin.com/y5HnbWQ1)
>>> >
>>> > Am I doing something obviously wrong? Any suggestions on how to
>>> > improve its speed?
>>> >
>>> > Thanks a lot,
>>> > Ángel de Vicente
>>>
>>>
>>