I believe I mentioned A_mul_B! and friends. For the subtraction, check out 
NumericExtensions, specifically the subtract! function.
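A minimal sketch of one projection step written with those in-place calls (illustrative array names; assumes Julia 0.3-era APIs and that the NumericExtensions package is installed):

```julia
using NumericExtensions      # provides in-place subtract! (2014-era package)

m, n = 1000, 200
A = rand(m, n)
Q = A[:, 1:1]                # first column, kept as an m-by-1 matrix
scale!(Q, 1 / norm(Q))       # normalize in place

R = Array(Float64, 1, n)     # preallocate one row of R
At_mul_B!(R, Q, A)           # R = Q' * A via BLAS, written into R directly
subtract!(A, Q * R)          # A .-= Q*R in place (Q*R itself still allocates)
```

After the `subtract!` call the whole panel is orthogonal to `Q`, so `Q' * A` is numerically zero.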

--Tim

On Thursday, October 23, 2014 06:43:39 AM Ján Dolinský wrote:
> > I'd like to approach this speed at the end.
> > 
> > 
> > I don't think it is possible in Julia right now without using dirty tricks
> > such as passing pointers around. You'd like to get the speed from BLAS by
> > operating on panels of your matrix, but you'd like to avoid the copying
> > and reallocation of arrays. If you devectorised your code completely,
> > there'd be no allocation, but you wouldn't have the fast multiplications
> > in BLAS. You can get some of the way by using one of the ArrayViews
> > packages, but the garbage collector will charge a fee for each view you
> > produce, so LAPACK speed is not attainable.
> > 
> > Med venlig hilsen
> > 
> > Andreas Noack
> 
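A tiny sketch of the view approach described above (illustrative names; on Julia 0.3 `view` comes from the ArrayViews package, while later Julia versions ship an equivalent `view` in Base):

```julia
# using ArrayViews          # needed on Julia 0.3; later Julia has view() in Base

A = rand(1000, 200)
q = view(A, :, 1)           # no copy of the column, but each view is a small
                            # heap-allocated object the GC must track (the "fee")
s = dot(q, view(A, :, 2))   # contiguous column views work with BLAS-style ops
```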
> Indeed, achieving BLAS speed is rather difficult, for the good reasons you
> mention above. Nevertheless, I'd like to match (or beat :)) the speed of
> my MGS routine written in Octave. It takes about 4 seconds in Octave and
> about 6.6 seconds in Julia. In Octave I could write it in a very
> straightforward fashion (the Octave code is available in the very first
> post of this thread).
> 
> Apparently most of the computing time of my Julia MGS routine is consumed
> by line 14. When I commented line 14 out, the computing time (and the
> memory allocation) dropped sharply from 6.6 s to 0.37 s.
> 
> This indicates that line 13 is already fast in its vectorized form. This
> is very nice, since it is a very compact piece of code.
> 
> I'll try to devectorize line 14 (or find a better semi-vectorized
> representation). Any tips here are appreciated.
> 
> Jan
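One way that subtraction could be devectorized (a sketch with illustrative names, not Jan's actual line 14): apply the rank-1 update `A .-= q * r` column by column, so no temporary matrix is ever allocated:

```julia
# In-place rank-1 update: A[:, j] -= r[j] * q for each column j in cols,
# avoiding the temporary matrix that the vectorized A -= q*r would allocate.
function rank1_subtract!(A::Matrix{Float64}, q::Vector{Float64},
                         r::Vector{Float64}, cols)
    m = size(A, 1)
    for j in cols
        rj = r[j]
        @inbounds for i in 1:m
            A[i, j] -= rj * q[i]
        end
    end
    return A
end
```

Called once per MGS step with `cols = k+1:n`, this does the same arithmetic as the vectorized subtraction but with zero allocation.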
