> The way you have written the rank-one update on line 14 is very expensive.
> Unfortunately, we are not making all of BLAS available through our generic
> mat-vec interface right now, but the whole of BLAS is wrapped in the BLAS
> module, so you can use the ger! function there, i.e.
>
> function mgs2(X)
>     # mgs speed project
>
>     nobs, nvars = size(X)
>     R = eye(nvars)
>
>     for l = 1:nvars-1
>         v_i = view(X, :, l+1:nvars)                           # --- line 10
>         s = view(X, :, l)
>
>         R[l, l+1:nvars] = (v_i'*s)/sumabs2(s)                 # --- line 13
>         BLAS.ger!(-1.0, s, vec(view(R, l, l+1:nvars)), v_i)   # --- line 14
>     end
>
>     X, R
> end
>
> which avoids a lot of allocation. I don't know how Octave handles such
> expressions, so I can't say why it is faster when not calling into BLAS
> directly.
>
> Kind regards
>
> Andreas Noack
>
Thanks for pointing out BLAS.ger!; I was not aware of this function
because it is not listed in the Julia manual. Using ger! brought the
computing time down to about 1 second, not to mention the huge savings
in memory allocation.
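
If I understand it right, ger! performs the in-place rank-one update
A <- A + alpha*x*y'. A minimal sketch of the equivalence, with random
data just for illustration (on current Julia the BLAS module sits under
LinearAlgebra; in older versions it is reachable without the import):

    using LinearAlgebra

    A = rand(4, 3); x = rand(4); y = rand(3)

    B = A - x * y'             # allocating version: builds the outer product x*y'
    BLAS.ger!(-1.0, x, y, A)   # in-place version: updates A with no temporaries

    A ≈ B                      # true: both compute the same rank-one update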
I will also experiment further with line #13, as it can probably be
turned into a direct BLAS call as well.
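
For example, a sketch of a drop-in replacement for line 13 inside mgs2,
using that gemv('T', alpha, A, x) computes alpha * A' * x in a single
BLAS call:

    # the division by sumabs2(s) folds into the BLAS scale factor:
    R[l, l+1:nvars] = BLAS.gemv('T', 1/sumabs2(s), v_i, s)
    # a gemv! variant could instead write into a preallocated buffer r
    # of length nvars - l, avoiding even the small result vector:
    # BLAS.gemv!('T', 1/sumabs2(s), v_i, s, 0.0, r)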
I assume some mat-vec (or mat-mat) operations like (v_i' * s) are
translated directly into a BLAS call, e.g. BLAS.gemv('T', v_i, s),
right? Is there a way to find out what is being called under the hood?
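
For instance, are the reflection macros the intended route? A sketch of
what I mean (on current Julia they live in InteractiveUtils):

    using InteractiveUtils, LinearAlgebra

    A = rand(5, 3); x = rand(5)
    @which A' * x   # prints which Julia method handles the call
    @less  A' * x   # shows its source, where a BLAS.gemv! call would be visible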
Thanks,
Jan