> The way you have written the rank-one update in line 14 is very expensive. 
> Unfortunately, we are not making all of BLAS available through our generic 
> mat-vec interface right now, but the whole of BLAS is wrapped in the BLAS 
> module, so you can use the ger! function there, i.e.
>
>     function mgs2(X)
>         # modified Gram-Schmidt without column normalization; X is
>         # overwritten in place with the orthogonalized columns
>         nobs, nvars = size(X)
>         R = eye(nvars)
>
>         for l = 1:nvars-1
>             v_i = view(X, :, l+1:nvars)   # remaining columns  # --- line 10
>             s = view(X, :, l)             # current column
>
>             # projection coefficients of the remaining columns onto s
>             R[l, l+1:nvars] = (v_i'*s)/sumabs2(s)  # --- line 13
>             # in-place rank-one update: X[:,l+1:end] -= s*R[l,l+1:end]'
>             BLAS.ger!(-1.0, s, vec(view(R, l, l+1:nvars)), v_i)  # --- line 14
>         end
>
>         X, R
>     end
>
> which avoids a lot of allocation. I don't know how Octave handles such 
> expressions, so I can't say why it is faster there without calling into 
> BLAS directly.
>
> Kind regards,
>
> Andreas Noack
>

Thanks for pointing out BLAS.ger!; I was not aware of this function 
because it is not listed in the Julia manual. Using ger! brought the 
computing time down to about 1 second, not to mention the huge saving in 
memory allocation.
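
For reference, a minimal sketch of how such a comparison can be made with 
@time; mgs1 is a hypothetical name standing in for my original version with 
the expensive rank-one update, and the problem size is made up:

    # hypothetical comparison; mgs1 = original version, mgs2 = ger! version
    X = randn(10_000, 100)   # made-up problem size
    @time mgs1(copy(X))      # @time reports elapsed time and total allocation
    @time mgs2(copy(X))      # copy(X) because both functions mutate X in place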
I will also play further with line 13, since that mat-vec product can 
probably be turned into a direct BLAS call as well; a sketch of what I have 
in mind is below.
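
Assuming the loop variables from mgs2 above (and the same Julia version, 
where sumabs2 is still available), gemv! computes y := a*A'*x + b*y in 
place, so the coefficients could go straight into the row of R:

    # sketch: line 13 rewritten as an in-place BLAS call inside the loop
    r = vec(view(R, l, l+1:nvars))                 # strided view of R's row
    BLAS.gemv!('T', 1/sumabs2(s), v_i, s, 0.0, r)  # r := v_i'*s / sumabs2(s)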

I assume some mat-vec (or mat-mat) operations like (v_i' * s) are 
translated directly into a BLAS call, e.g. BLAS.gemv('T', v_i, s), right? 
Is there a way to find out what is being called under the hood?
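
Partially answering my own question: the @which macro seems to report the 
method that dispatch selects for a call, so maybe an expression can be 
traced down to the BLAS wrapper like this (a rough sketch):

    # sketch: inspect which method a call dispatches to
    A = randn(100, 5)
    x = randn(100)
    @which A' * x                # the generic method handling A'*b
    @which BLAS.gemv('T', A, x)  # the BLAS wrapper itself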

Thanks,
Jan
