I believe I mentioned A_mul_B! and friends. For the subtraction, check out NumericExtensions, specifically the subtract! function.
--Tim

On Thursday, October 23, 2014 06:43:39 AM Ján Dolinský wrote:
> > I'd like to approach this speed at the end.
> >
> > I don't think it is possible in Julia right now without using dirty
> > tricks such as passing pointers around. You'd like to get the speed from
> > BLAS by operating on panels of your matrix, but you'd like to avoid the
> > copying and reallocation of arrays. If you devectorised your code
> > completely, there'd be no allocation, but you wouldn't have the fast
> > multiplications in BLAS. You can get some of the way by using one of the
> > ArrayViews packages, but the garbage collector will charge a fee for
> > each view you produce, so LAPACK speed is not attainable.
> >
> > Best regards
> >
> > Andreas Noack
>
> Indeed, achieving BLAS speed is rather difficult for the good reasons you
> mentioned above. Nevertheless, I'd like to match (or beat :)) the speed of
> my MGS routine written in Octave. It takes about 4 seconds in Octave and
> about 6.6 seconds in Julia. In Octave I could write it in a very
> straightforward fashion (the Octave code is available in the very first
> post of this thread).
>
> Apparently most of the computing time of my Julia MGS routine is consumed
> by line 14. When I commented out line 14, the computing time (and memory
> allocation) dropped sharply from 6.6 s to 0.37 s.
>
> This indicates that line 13 is already fast in its vectorized form, which
> is very nice since it is a very compact piece of code.
>
> I'll try to devectorize line 14 (or find a better semi-vectorized
> representation). Any tips here are appreciated.
>
> Jan
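For what it's worth, the in-place update being discussed can be sketched as follows. Jan's code (and its line 14) appears in an earlier post not shown here, so the names `A`, `Q`, `R` and the exact form of the update are assumptions; the sketch assumes a typical MGS step, `A[:, j] -= R[k, j] * Q[:, k]` for `j > k`. Written as explicit loops it allocates no temporaries, which is the alternative to Tim's suggestion of `A_mul_B!`-ing the outer product into a preallocated buffer and then `subtract!`-ing it from `A`:

```julia
# Hedged sketch: assumed MGS rank-1 update, devectorized so that no
# temporary arrays are allocated (the suspected cost of "line 14").
function mgs_rank1_update!(A, Q, R, k)
    m, n = size(A)
    for j in k+1:n            # columns to the right of the current pivot
        rkj = R[k, j]
        for i in 1:m          # column-major inner loop over rows
            A[i, j] -= rkj * Q[i, k]
        end
    end
    return A
end
```

Because Julia arrays are column-major, keeping the row index `i` in the innermost loop walks memory contiguously, which is usually what lets a devectorized loop compete with (or beat) the vectorized form that allocates a fresh matrix on every call.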
