Dňa štvrtok, 23. októbra 2014 16:21:11 UTC+2 Andreas Noack napísal(-a):
>
> I missed that you were already using the ArrayViews package. Sorry for 
> that. 
>
> The way you have written the rank one update in line 14 is very expensive. 
> Unfortunately, we are not making all of BLAS available through our generic 
> mat-vec interface right now, but the whole of BLAS is wrapped in the BLAS 
> module, so you can use the ger! function there, i.e.
>
>     function mgs2(X)
>     # mgs speed project
>     
>         nobs, nvars = size(X)
>         R = eye(nvars)
>         
>         for l=1:nvars-1
>             v_i = view(X,:,l+1:nvars) # --- line 10
>             s = view(X,:,l);
>           
>             R[l,l+1:nvars] = (v_i'*s)/sumabs2(s) # --- line 13
>             BLAS.ger!(-1.0, s, vec(view(R,l,l+1:nvars)), v_i) # --- line 14
>         end
>         
>         X, R
>     
>     end
>
> which avoids a lot of allocation. I don't know how Octave handles such 
> expression, so I don't know why they are faster when not calling into BLAS 
> directly.
>
> Med venlig hilsen
>
> Andreas Noack
>

Indeed line #14 is very expensive. I see it now :). Thanks for the BLAS 
tip. I am just getting familiar with the BLAS and concepts of de-vectorized 
code. Last five years I used  to use Matlab/Octave and vectorization and it 
turned into a rather strong habit :). I may also try to completely 
de-vectorize the code.

Thanks,
Jan 

Reply via email to