I am writing a Julia function which implements some of the functionality of
the LAPACK function TPQRT, but my version stops before computing the block
reflectors. In my function, I make several calls to gemv! and ger! from
Base.LinAlg.BLAS. Each of these calls seems to generate extra overhead of
440 bytes according to the output of --track-allocation, and since each of
those functions gets called several times within a function that also gets
called many times, the extra allocations blow up and I end up spending a
large chunk of time in garbage collection. Is it normal for each function
call to dynamically allocate so much space? Here is the function in
question:
function tpqrf!( A::StridedMatrix{Float64}, B::StridedMatrix{Float64},
tau::StridedVector{Float64}, work::Vector{Float64} )
m,n = size(B)
for j = 1:n
# Construct Householder reflector
A[j,j], tau[j] = larfg!(A[j,j], sub(B, :, j), tau[j])
# Apply reflector to remainder
for i = j+1:n
work[i] = A[j,i]
end
Base.LinAlg.BLAS.gemv!('T', 1.0, sub(B,:,j+1:n), sub(B,:,j), 1.0,
sub(work,j+1:n))
for k = j+1:n
A[j,k] = A[j,k] - tau[j]*work[k]
end
Base.LinAlg.BLAS.ger!(-tau[j], sub(B,:,j), sub(work,j+1:n),
sub(B,:,j+1:n))
end
end
Thanks,
Aaron