Completely devectorisation can sometimes be useful, but I have come to
prefer writing linear algebra code close to the style of LAPACK where you
organise the calculations in BLAS calls. Not only for speed, but also for
making the code easier to read and infer the structure of the algorithm.

Med venlig hilsen

Andreas Noack

2014-10-23 11:05 GMT-04:00 Ján Dolinský <[email protected]>:

> Dňa štvrtok, 23. októbra 2014 16:21:11 UTC+2 Andreas Noack napísal(-a):
>>
>> I missed that you were already using the ArrayViews package. Sorry for
>> that.
>>
>> The way you have written the rank one update in line 14 is very
>> expensive. Unfortunately, we are not making all of BLAS available through
>> our generic mat-vec interface right now, but the whole of BLAS is wrapped
>> in the BLAS module, so you can use the ger! function there, i.e.
>>
>>     function mgs2(X)
>>     # mgs speed project
>>
>>         nobs, nvars = size(X)
>>         R = eye(nvars)
>>
>>         for l=1:nvars-1
>>             v_i = view(X,:,l+1:nvars) # --- line 10
>>             s = view(X,:,l);
>>
>>             R[l,l+1:nvars] = (v_i'*s)/sumabs2(s) # --- line 13
>>             BLAS.ger!(-1.0, s, vec(view(R,l,l+1:nvars)), v_i) # --- line
>> 14
>>         end
>>
>>         X, R
>>
>>     end
>>
>> which avoids a lot of allocation. I don't know how Octave handles such
>> expression, so I don't know why they are faster when not calling into BLAS
>> directly.
>>
>> Med venlig hilsen
>>
>> Andreas Noack
>>
>
> Indeed line #14 is very expensive. I see it now :). Thanks for the BLAS
> tip. I am just getting familiar with the BLAS and concepts of de-vectorized
> code. Last five years I used  to use Matlab/Octave and vectorization and it
> turned into a rather strong habit :). I may also try to completely
> de-vectorize the code.
>
> Thanks,
> Jan
>

Reply via email to