One thing I noticed after a quick glance: the ordering of your nested loops is very cache-unfriendly. Julia stores arrays in column-major order (same as Fortran), so nested loops should be arranged so that the first subscript of a multidimensional array varies most rapidly, i.e. it belongs in the innermost loop.
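To illustrate the loop ordering, here is a minimal sketch. The function names and the matrix size are made up for this example and are not taken from your FE code. Both functions compute the same sum; only the nesting order differs.

```julia
# First subscript in the innermost loop: consecutive iterations touch
# adjacent memory locations in Julia's column-major layout.
function sum_column_major(A::Matrix{Float64})
    s = 0.0
    for j in 1:size(A, 2)      # second subscript: outer loop
        for i in 1:size(A, 1)  # first subscript: inner loop, varies fastest
            s += A[i, j]
        end
    end
    return s
end

# Same computation with the loops swapped: each inner step jumps
# size(A, 1) elements ahead in memory, the cache-unfriendly pattern.
function sum_row_major(A::Matrix{Float64})
    s = 0.0
    for i in 1:size(A, 1)
        for j in 1:size(A, 2)
            s += A[i, j]
        end
    end
    return s
end

A = rand(2000, 2000)                    # hypothetical test matrix
sum_column_major(A); sum_row_major(A)   # warm up (compile) before timing
@time sum_column_major(A)
@time sum_row_major(A)
```

How large the difference is depends on the matrix size and the machine, so it is worth timing both orderings on your actual arrays.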
--Peter

On Thursday, December 11, 2014 9:47:33 AM UTC-8, Petr Krysl wrote:
> One more note: I conjectured that perhaps the compiler was not able to
> infer correctly the type of the matrices, so I hardwired (in the actual FE
> code)
>
> Jac = 1.0; gradN = gradNparams[j]/(J); # get rid of Rm for the moment
>
> About 10% less memory used, runtime about the same. So, no effect really.
> Loops are still slower than the vectorized code by a factor of two.
>
> Petr
