Dňa štvrtok, 23. októbra 2014 16:21:11 UTC+2 Andreas Noack napísal(-a): > > I missed that you were already using the ArrayViews package. Sorry for > that. > > The way you have written the rank one update in line 14 is very expensive. > Unfortunately, we are not making all of BLAS available through our generic > mat-vec interface right now, but the whole of BLAS is wrapped in the BLAS > module, so you can use the ger! function there, i.e. > > function mgs2(X) > # mgs speed project > > nobs, nvars = size(X) > R = eye(nvars) > > for l=1:nvars-1 > v_i = view(X,:,l+1:nvars) # --- line 10 > s = view(X,:,l); > > R[l,l+1:nvars] = (v_i'*s)/sumabs2(s) # --- line 13 > BLAS.ger!(-1.0, s, vec(view(R,l,l+1:nvars)), v_i) # --- line 14 > end > > X, R > > end > > which avoids a lot of allocation. I don't know how Octave handles such > expression, so I don't know why they are faster when not calling into BLAS > directly. > > Med venlig hilsen > > Andreas Noack >
Indeed line #14 is very expensive. I see it now :). Thanks for the BLAS tip. I am just getting familiar with the BLAS and concepts of de-vectorized code. Last five years I used to use Matlab/Octave and vectorization and it turned into a rather strong habit :). I may also try to completely de-vectorize the code. Thanks, Jan
