Yuuki,

thanks a lot, this was what I was missing!

On Tuesday, April 28, 2015 at 3:49:58 PM UTC+1, Yuuki Soho wrote:
>
> The code allocates only 432 bytes on my computer once I removed all global 
> variables, and it's pretty fast.
>
> Multiplying by the inverse of dx2 ... instead of dividing also makes quite 
> a difference, 2-3x.
>
> http://pastebin.com/PSZyLXJX
>

Looking at your code, I realized that the trick was to annotate the 
functions. Without using the "inverse of dx2" trick (since I'm not using it 
in Fortran) I get these timings:

Fortran version: 6.7s (compiled with gfortran -O3)
Julia version:   7.013900449 seconds (3282172 bytes allocated)

This is getting VERY close to my baseline. For reference, I created 
pastebins for both versions:

Fortran code: http://pastebin.com/nHn44fBa
Julia code:   http://pastebin.com/Q8uc0maL
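
For anyone reading along without opening the pastebins, here is a minimal 
sketch of the two ideas from Yuuki's message: passing everything as function 
arguments instead of reading global variables, and multiplying by a 
precomputed inverse of dx2 instead of dividing. This is NOT the code from 
either pastebin. The function names and the 1D second-difference kernel are 
made up for illustration, and the type annotations are only there to document 
the expected inputs.

```julia
# Hypothetical 1D second-difference kernel (illustration only, not the
# pastebin code). All inputs arrive as arguments, so the loop does not
# touch any non-constant global variable.
function second_diff_div!(out::Vector{Float64}, u::Vector{Float64}, dx2::Float64)
    n = length(u)
    for i in 2:n-1
        out[i] = (u[i-1] - 2.0 * u[i] + u[i+1]) / dx2   # one division per point
    end
    return out
end

# Same kernel, but the reciprocal of dx2 is computed once before the loop
# and each point is multiplied by it.
function second_diff_inv!(out::Vector{Float64}, u::Vector{Float64}, dx2::Float64)
    inv_dx2 = 1.0 / dx2
    n = length(u)
    for i in 2:n-1
        out[i] = (u[i-1] - 2.0 * u[i] + u[i+1]) * inv_dx2
    end
    return out
end

# u(x) = x^2 sampled on a unit grid, so the second difference is 2 at
# every interior point; the two boundary entries are left at zero.
u = [Float64(i)^2 for i in 1:8]
a = second_diff_div!(zeros(8), u, 1.0)
b = second_diff_inv!(zeros(8), u, 1.0)
```

With dx2 = 1.0 both versions give identical results. For a general dx2 the 
two can differ in the last bits, because multiplying by a rounded reciprocal 
is not exactly the same operation as dividing.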

Thanks a lot. After this, the next goal is to get Julia's parallel 
performance close to that of Fortran with MPI...

Cheers,
Ángel de Vicente
