I think there are two major reasons. First of all the slicing makes copies of the arrays. Hence you end up with a lot of temporary arrays. Try @time lufact!(B1) vs @time crout(B1) and see if the allocation difference.
The second reason is that lufact calls LAPACK which uses a blocked algorithm that can benefit from BLAS 3 calls. Your code is doing matrix-vector multiplications where LAPACK tries to make as many matrix-matrix multiplications as possible. Med venlig hilsen Andreas Noack 2014-09-25 12:52 GMT-04:00 Staro Pickle <[email protected]>: > Hi. > > I am doing some matrix factorizations like LU, LDL'. I find that the > julia's lufact() is much faster than my code. I wonder why my code is slow > and why lufact is fast(is this one question or two?). I use > vectorization(is that vectorization?). > > I have to mention that I use *Crout* method which is a little different > from lufact(). I do no pivoting because the leading principal minors of my > matrix here are nonsingular(see below). And my U is a *unit upper* > triangular while lufact() makes L a *unit **lower *triangular. But I > don't think these differences are responsible for the huge disparity > between their speed. > > Also, I overwrite A to save space. Will this save time or take more? > > n=5, a=10 > Matrix to be decomposed: > 1 0 0 0 a > 0 1 0 0 a > 0 0 1 0 a > 0 0 0 1 a > a a a a 1 > > My code: > > function create_B1(n,a) > B = hcat(eye(n-1), a*ones(n-1,1)) > B = vcat(B, hcat(a*ones(1,n-1), 1)) > return(B::Array{Float64,2}) > end > > function crout(A::Array{Float64,2}) > n = size(A,1) > A[1,2:n] = A[1,2:n] / A[1,1] > for j = 2:n > A[j:n,j] -= A[j:n,1:j-1] * A[1:j-1,j] > A[j,j+1:n] = (A[j,j+1:n] - A[j,1:j-1]*A[1:j-1,j+1:n]) / A[j,j] > end > return(A::Array{Float64,2}) > end > > B1 = create_B1(1000,10) > > @printf "My Crout: %0.5fs\n" @elapsed(crout(B1)) > @printf "lufact(): %0.5fs\n" @elapsed(lufact(B1)) > > > The result is: > > My Crout: 2.63706s > lufact(): 0.08283s > > > > This is the environment: > > julia> versioninfo() > Julia Version 0.3.0 > Commit 7681878* (2014-08-20 20:43 UTC) > Platform Info: > System: Darwin (x86_64-apple-darwin13.3.0) > CPU: Intel(R) Core(TM) i5-4288U CPU @ 2.60GHz > WORD_SIZE: 64 > BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell) > LAPACK: libopenblas > LIBM: libopenlibm > LLVM: libLLVM-3.3 > > > > Can anyone help me with all above? Thanks a lot! >
