Mike's modification made the code almost twice as fast on my machine. It 
worked the fastest of all the things I tried. 

I also noticed that the code generally runs faster when a function does not 
refer to a non-constant variable defined outside its scope. For example 
rootof2=sqrt(2)
crandn() = complex128(randn() ,randn() /rootof2)
runs slower than 
const rootof2=sqrt(2)
crandn() = complex128(randn() ,randn() /rootof2)
or
crandn(rootof2) = complex128(randn() ,randn() /rootof2)
.
.
.
        u = F * u + sig * crandn(sqrt(2))

I initially thought that precalculating sqrt(2) would save a few 
compute cycles by not having to call the sqrt function at each iteration. It 
seems that the compiler does some pretty neat optimization that makes such 
hand-tuning attempts futile. 
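
For anyone who wants to reproduce the comparison, here is a minimal sketch 
of the three variants side by side. It assumes a recent Julia, where 
complex(a, b) is the spelling of what I wrote above as complex128(a, b). 
The names crandn_global, crandn_const, crandn_arg, crandn_sqrt and 
sum_draws are made up for this test and are not from the original code. 
Timings will vary by machine and Julia version, so treat the numbers as 
indicative only.

```julia
# Sketch only: compares a non-constant global, a constant global, and an
# argument. All names below are hypothetical, chosen for this comparison.

rootof2_global = sqrt(2)      # non-constant global: its type may change later
const ROOTOF2 = sqrt(2)       # constant global: its type is fixed

crandn_global() = complex(randn(), randn() / rootof2_global)
crandn_const()  = complex(randn(), randn() / ROOTOF2)
crandn_arg(r)   = complex(randn(), randn() / r)
crandn_sqrt()   = crandn_arg(sqrt(2))   # calls sqrt on every draw

# Accumulate n draws so the calls sit inside a loop.
function sum_draws(f, n)
    s = complex(0.0, 0.0)
    for _ in 1:n
        s += f()
    end
    return s
end

# Warm up first so compilation time is not included in the timings.
for f in (crandn_global, crandn_const, crandn_sqrt)
    sum_draws(f, 10)
end

n = 10^6
t_global = @elapsed sum_draws(crandn_global, n)
t_const  = @elapsed sum_draws(crandn_const, n)
t_sqrt   = @elapsed sum_draws(crandn_sqrt, n)

println("non-const global: ", t_global, " s")
println("const global:     ", t_const, " s")
println("argument:         ", t_sqrt, " s")

# What type inference concludes for each zero-argument variant.
println(Base.return_types(crandn_global, Tuple{}))
println(Base.return_types(crandn_const, Tuple{}))
```

The last two lines show the inferred return type of each variant, which is 
the same kind of information @code_typed gives in more detail. 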

On Sunday, September 14, 2014 12:06:43 AM UTC+3, Mike Nolta wrote:
>
> `crandn() = complex128(randn(),randn())/sqrt(2.)` should get you even 
> closer to fortran. 
>
> -Mike 
>
>
>
> On Sat, Sep 13, 2014 at 5:01 PM, Leah Hanson wrote: 
> > Lint.jl is also good for checking that, depending on how much time you 
> > want to spend learning to read the output of code_typed. 
> > 
> > On Sat, Sep 13, 2014 at 3:27 PM, Elliot Saba wrote: 
> >> 
> >> A good way to track down performance issues like this is to use 
> >> @code_typed to output the typed code in your function and look for 
> >> places where type inference doesn't know what to do; e.g. large type 
> >> unions, Any types, etc....  This is often caused by a variable taking 
> >> on multiple separate types over its lifetime within the function and 
> >> can cause slowdowns inside inner loops. 
> >> -E 
> >> 
> >> On Sat, Sep 13, 2014 at 1:13 PM, Noah Brenowitz wrote: 
> >>> 
> >>> now i am pretty impressed. 
> >>> 
> >>> On Saturday, September 13, 2014 4:12:07 PM UTC-4, Noah Brenowitz wrote: 
> >>>> 
> >>>> I just replaced u = u0, with u = complex128(u0) in the julia code. 
> >>>> Now it is only 2x as slow as fortran. 
> >> 
> >> 
> > 
>
