I'm a new user, so have mercy in your responses.
I've written a method that takes a matrix and a vector as input and fills
column icol of the matrix with the vector's values, shifted upward by
ishift indices with periodic boundary conditions (values shifted past the
top wrap around to the bottom). To make this clear, given the matrix
W = [1 2
     3 4
     5 6]
the vector w = [7 8 9], icol = 2 and ishift = 1, the new value of W is
given by
W = [1 8
     3 9
     5 7]
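To pin down the intended behavior, here is a minimal reference sketch of the operation using wrap-around indexing with mod1. The name shifted_fill_reference! is hypothetical and is not one of the three methods timed below; it is only meant to reproduce the small example above, not to be fast.

```julia
# Hypothetical reference helper (not one of the methods in this post):
# row j of column icol receives w[mod1(j + ishift, n)], so values that are
# shifted past the top wrap around to the bottom.
function shifted_fill_reference!(W::Matrix, icol::Int, w::Vector, ishift::Int)
    n = length(w)
    @assert size(W, 1) == n "Dimension mismatch between W and w"
    for j in 1:n
        W[j, icol] = w[mod1(j + ishift, n)]
    end
    return W
end

W = [1 2; 3 4; 5 6]
w = [7, 8, 9]          # written with commas so that w is a Vector
shifted_fill_reference!(W, 2, w, 1)
println(W)
```

With icol = 2 and ishift = 1 this should leave the second column of W equal to 8, 9, 7, matching the example.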
I need a fast way of doing this for large matrices. I wrote three methods
that should (in my naive mind) give the same performance, but @time
reports otherwise. The method definitions and the performance results are
given below. Can someone teach me why the results are so different? The
method fill_W! is too wordy for my tastes, but the more compact notation in
fill_W1! and fill_W2! performs worse. And why do these latter two
methods allocate so much memory when the whole point of these methods is to
use already-allocated memory?
Michael
### Definitions
function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::Int)
    @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
    W[1:(end-ishift),icol] = w[(ishift+1):end]
    W[(end-(ishift-1)):end,icol] = w[1:ishift]
    return
end

function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::Int)
    @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
    [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)]
    [W[end-ishift+i,icol] = w[i] for i in 1:ishift]
    return
end

function fill_W!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::Int)
    @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
    n = length(w)
    for j in 1:(n-ishift)
        W[j,icol] = w[j+ishift]
    end
    for j in (n-(ishift-1)):n
        W[j,icol] = w[j-(n-ishift)]
    end
end
### Performance Results
julia>
W = rand(1000000,2)
w = rand(1000000)
println("fill_W!:")
println(@time fill_W!(W, 2, w, 2))
println("fill_W1!:")
println(@time fill_W1!(W, 2, w, 2))
println("fill_W2!:")
println(@time fill_W2!(W, 2, w, 2))
Out>
fill_W!:
0.002801 seconds (4 allocations: 160 bytes)
nothing
fill_W1!:
0.007427 seconds (9 allocations: 7.630 MB)
[0.152463397611579,0.6314166578356002]
fill_W2!:
0.005587 seconds (7 allocations: 7.630 MB)
[0.152463397611579,0.6314166578356002]
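For scale, the 7.630 MB reported for fill_W1! and fill_W2! is close to the size of one full column of Float64 values. The arithmetic below is only a rough size estimate, assuming 8-byte Float64 elements and assuming that @time counts one MB as 2^20 bytes; it does not by itself show where in the methods the allocation happens.

```julia
# Rough size estimate only: memory needed for one column of n Float64 values.
# Assumption: "MB" in the @time output means 2^20 bytes.
n = 1_000_000
column_bytes = n * sizeof(Float64)   # 8 bytes per Float64
column_mb = column_bytes / 2^20
println(column_mb)
```

This comes out at roughly 7.63, which suggests that each of the two compact methods allocates about one extra column's worth of memory per call.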