You can squeeze a tiny bit extra out of the implementation by using the
inplace gemv!, i.e.
function mgs2(X)
# mgs speed project
nobs, nvars = size(X)
R = eye(nvars)
for l=1:nvars-1
v_i = view(X,:,l+1:nvars)
s = view(X,:,l);
r = vec(view(R,l,l+1:nvars))
BLAS.gemv!('T', 1.0/sumabs2(s), v_i, s, 0.0, r) # --- line 13
BLAS.ger!(-1.0, s, r, v_i) # --- line 14
end
X, R
end
Med venlig hilsen
Andreas Noack
2014-10-24 8:15 GMT-04:00 Ján Dolinský <[email protected]>:
> Writting line #13 as a BLAS call reduced further memory allocation. Thanks
> for all the tips (BLAS, NumericExtensions, InplaceOps, etc.).
>
> using ArrayViews
>
> function mgs(X)
> # mgs speed project
>
> nobs, nvars = size(X)
> R = eye(nvars)
>
> for l=1:nvars-1
> v_i = view(X,:,l+1:nvars)
> s = view(X,:,l);
>
> R[l,l+1:nvars]=BLAS.gemv('T', 1.0/sumabs2(s), v_i, s) # --- line 13
> BLAS.ger!(-1.0, s, vec(view(R,l,l+1:nvars)), v_i) # --- line 14
> end
>
> X, R
>
> end
>