Re: [julia-users] Re: Strange performance issue in filling in a matrix column

2016-07-23 Thread Jeffrey Sarnoff
Michael, 
I have been in that position -- let me file the performance issue on your 
behalf.   
Then you can follow this link 
 to see what I did to make 
it an issue. 

   Welcome,
 Jeffrey

On Saturday, July 23, 2016 at 11:36:13 AM UTC-4, Stefan Karpinski wrote:
>
> Go here: https://github.com/JuliaLang/julia/issues/new; describe the 
> issue (much as you did here) and submit. Thank you!
>
> On Fri, Jul 22, 2016 at 8:58 PM, Michael Prange  > wrote:
>
>> I'm new to Julia and do not know how to file a performance issue, but I 
>> am happy to do it. Can you point me to the right place?
>>
>> Sent from my phone
>>
>> On Jul 22, 2016, at 18:09, Stefan Karpinski > > wrote:
>>
>> Can you file a performance issue? The built-in circshift should not have 
>> these performance issues.
>>
>> On Fri, Jul 22, 2016 at 4:18 PM, Michael Prange > > wrote:
>>
>>> I just discovered that Julia already has a function for circularly 
>>> shifting the data in an array: circshift(A, shifts). However, its 
>>> performance is worst of all. Using this new method,
>>>
>>> function fill_W4!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::
>>> Int)
>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w"
>>> )
>>> W[:,icol] = circshift(w,[ishift,])
>>> return
>>> end
>>>
>>>
>>> the resulting timings are given by (with new random numbers)
>>>
>>> fill_W!: 0.002918 seconds (4 allocations: 160 bytes) 
>>> fill_W1!: 0.006440 seconds (10 allocations: 7.630 MB) 
>>> fill_W2!: 0.009244 seconds (8 allocations: 7.630 MB, 21.61% gc time) 
>>> fill_W3!: 0.002014 seconds (8 allocations: 352 bytes) 
>>> fill_W4!: 0.049601 seconds (19 allocations: 30.518 MB, 3.63% gc time)
>>>
>>>
>>> I would have expected the built-in method circshift to achieve the best 
>>> results, but it is worst in all categories: time, allocations and memory.
>>>
>>> Michael
>>>
>>> On Friday, July 22, 2016 at 2:23:16 PM UTC-4, Michael Prange wrote:

 Gunnar,

 Thank you for your explanation of the extra allocations and the tip 
 about sub. I implemented a version with sub as fill_W3!:

 function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
 ishift::Int)
 @assert(size(W,1) == length(w), "Dimension mismatch between W and 
 w")
 W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift))
 W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w))
 return
 end

 Is this what you had in mind? I reran the tests above (with new random 
 numbers) and had the following results:

 fill_W!: 0.003234 seconds (4 allocations: 160 bytes) 
 fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB) 
 fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB) 
 fill_W3!: 0.002347 seconds (8 allocations: 352 bytes)

 Using sub consistently achieves better times that fill_W!, even through it 
 uses twice the number of allocations than fill_W!. This seems to be the 
 way to go.


 Michael


 On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote:
>
> fill_W1! allocates memory because it makes copies when constructing 
> the right hand sides. fill_W2 allocates memory in order to construct the 
> comprehensions (that you then discard). In both cases memory allocation 
> could plausibly be avoided by a sufficiently smart compiler, but until 
> Julia becomes that smart, have a look at the sub function to provide 
> views 
> instead of copies for the right hand sides of fill_W1!.
>
> On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:
>>
>> I'm a new user, so have mercy in your responses. 
>>
>> I've written a method that takes a matrix and vector as input and 
>> then fills in column icol of that matrix with the vector of given values 
>> that have been shifted upward by ishift indices with periodic boundary 
>> conditions. To make this clear, given the matrix
>>
>> W = [1  2
>> 3  4
>> 5  6]
>>
>> the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W 
>> is given by
>>
>> W = [1  8
>> 3  9
>> 5  7]
>>
>> I need a fast way of doing this for large matrices. I wrote three 
>> methods that should (In my naive mind) give the same performance 
>> results, 
>> but @time reports otherwise.  The method definitions and the performance 
>> results are given below. Can someone teach me why the results are so 
>> different? The method fill_W! is too wordy for my tastes, but the more 
>> compact notation in fill_W1! and fill_W2! achieve poorer results. Any 
>> why 
>> do these latter two methods allocate so much memory when the whole point 
>> of 
>> these methods is to use 

Re: [julia-users] Re: Strange performance issue in filling in a matrix column

2016-07-23 Thread Stefan Karpinski
Go here: https://github.com/JuliaLang/julia/issues/new; describe the issue
(much as you did here) and submit. Thank you!

On Fri, Jul 22, 2016 at 8:58 PM, Michael Prange  wrote:

> I'm new to Julia and do not know how to file a performance issue, but I am
> happy to do it. Can you point me to the right place?
>
> Sent from my phone
>
> On Jul 22, 2016, at 18:09, Stefan Karpinski  wrote:
>
> Can you file a performance issue? The built-in circshift should not have
> these performance issues.
>
> On Fri, Jul 22, 2016 at 4:18 PM, Michael Prange 
> wrote:
>
>> I just discovered that Julia already has a function for circularly
>> shifting the data in an array: circshift(A, shifts). However, its
>> performance is worst of all. Using this new method,
>>
>> function fill_W4!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::
>> Int)
>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>> W[:,icol] = circshift(w,[ishift,])
>> return
>> end
>>
>>
>> the resulting timings are given by (with new random numbers)
>>
>> fill_W!: 0.002918 seconds (4 allocations: 160 bytes)
>> fill_W1!: 0.006440 seconds (10 allocations: 7.630 MB)
>> fill_W2!: 0.009244 seconds (8 allocations: 7.630 MB, 21.61% gc time)
>> fill_W3!: 0.002014 seconds (8 allocations: 352 bytes)
>> fill_W4!: 0.049601 seconds (19 allocations: 30.518 MB, 3.63% gc time)
>>
>>
>> I would have expected the built-in method circshift to achieve the best
>> results, but it is worst in all categories: time, allocations and memory.
>>
>> Michael
>>
>> On Friday, July 22, 2016 at 2:23:16 PM UTC-4, Michael Prange wrote:
>>>
>>> Gunnar,
>>>
>>> Thank you for your explanation of the extra allocations and the tip
>>> about sub. I implemented a version with sub as fill_W3!:
>>>
>>> function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF},
>>> ishift::Int)
>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w"
>>> )
>>> W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift))
>>> W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w))
>>> return
>>> end
>>>
>>> Is this what you had in mind? I reran the tests above (with new random
>>> numbers) and had the following results:
>>>
>>> fill_W!: 0.003234 seconds (4 allocations: 160 bytes)
>>> fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB)
>>> fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB)
>>> fill_W3!: 0.002347 seconds (8 allocations: 352 bytes)
>>>
>>> Using sub consistently achieves better times that fill_W!, even through it 
>>> uses twice the number of allocations than fill_W!. This seems to be the way 
>>> to go.
>>>
>>>
>>> Michael
>>>
>>>
>>> On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote:

 fill_W1! allocates memory because it makes copies when constructing the
 right hand sides. fill_W2 allocates memory in order to construct the
 comprehensions (that you then discard). In both cases memory allocation
 could plausibly be avoided by a sufficiently smart compiler, but until
 Julia becomes that smart, have a look at the sub function to provide views
 instead of copies for the right hand sides of fill_W1!.

 On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:
>
> I'm a new user, so have mercy in your responses.
>
> I've written a method that takes a matrix and vector as input and then
> fills in column icol of that matrix with the vector of given values that
> have been shifted upward by ishift indices with periodic boundary
> conditions. To make this clear, given the matrix
>
> W = [1  2
> 3  4
> 5  6]
>
> the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W
> is given by
>
> W = [1  8
> 3  9
> 5  7]
>
> I need a fast way of doing this for large matrices. I wrote three
> methods that should (In my naive mind) give the same performance results,
> but @time reports otherwise.  The method definitions and the performance
> results are given below. Can someone teach me why the results are so
> different? The method fill_W! is too wordy for my tastes, but the more
> compact notation in fill_W1! and fill_W2! achieve poorer results. Any why
> do these latter two methods allocate so much memory when the whole point 
> of
> these methods is to use already-allocated memory.
>
> Michael
>
> ### Definitions
>
>
> function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF},
> ishift::Int)
> @assert(size(W,1) == length(w), "Dimension mismatch between W and
> w")
> W[1:(end-ishift),icol] = w[(ishift+1):end]
> W[(end-(ishift-1)):end,icol] = w[1:ishift]
> return
> end
>
>
> function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF},
> ishift::Int)
> 

Re: [julia-users] Re: Strange performance issue in filling in a matrix column

2016-07-22 Thread Michael Prange
I'm new to Julia and do not know how to file a performance issue, but I am 
happy to do it. Can you point me to the right place?

Sent from my phone

> On Jul 22, 2016, at 18:09, Stefan Karpinski  wrote:
> 
> Can you file a performance issue? The built-in circshift should not have 
> these performance issues.
> 
>> On Fri, Jul 22, 2016 at 4:18 PM, Michael Prange  wrote:
>> I just discovered that Julia already has a function for circularly shifting 
>> the data in an array: circshift(A, shifts). However, its performance is 
>> worst of all. Using this new method,
>> 
>> function fill_W4!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::Int)
>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>> W[:,icol] = circshift(w,[ishift,])
>> return
>> end
>> 
>> 
>> the resulting timings are given by (with new random numbers)
>> 
>> fill_W!: 0.002918 seconds (4 allocations: 160 bytes) 
>> fill_W1!: 0.006440 seconds (10 allocations: 7.630 MB) 
>> fill_W2!: 0.009244 seconds (8 allocations: 7.630 MB, 21.61% gc time) 
>> fill_W3!: 0.002014 seconds (8 allocations: 352 bytes) 
>> fill_W4!: 0.049601 seconds (19 allocations: 30.518 MB, 3.63% gc time)
>> 
>> I would have expected the built-in method circshift to achieve the best 
>> results, but it is worst in all categories: time, allocations and memory.
>> 
>> Michael
>> 
>>> On Friday, July 22, 2016 at 2:23:16 PM UTC-4, Michael Prange wrote:
>>> Gunnar,
>>> 
>>> Thank you for your explanation of the extra allocations and the tip about 
>>> sub. I implemented a version with sub as fill_W3!:
>>> 
>>> function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>> ishift::Int)
>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>>> W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift))
>>> W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w))
>>> return
>>> end
>>> 
>>> Is this what you had in mind? I reran the tests above (with new random 
>>> numbers) and had the following results:
>>> 
>>> fill_W!: 0.003234 seconds (4 allocations: 160 bytes) 
>>> fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB) 
>>> fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB) 
>>> fill_W3!: 0.002347 seconds (8 allocations: 352 bytes)
>>> 
>>> Using sub consistently achieves better times that fill_W!, even through it 
>>> uses twice the number of allocations than fill_W!. This seems to be the way 
>>> to go.
>>> 
>>> Michael
>>> 
 On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote:
 fill_W1! allocates memory because it makes copies when constructing the 
 right hand sides. fill_W2 allocates memory in order to construct the 
 comprehensions (that you then discard). In both cases memory allocation 
 could plausibly be avoided by a sufficiently smart compiler, but until 
 Julia becomes that smart, have a look at the sub function to provide views 
 instead of copies for the right hand sides of fill_W1!.
 
> On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:
> I'm a new user, so have mercy in your responses. 
> 
> I've written a method that takes a matrix and vector as input and then 
> fills in column icol of that matrix with the vector of given values that 
> have been shifted upward by ishift indices with periodic boundary 
> conditions. To make this clear, given the matrix
> 
> W = [1  2
> 3  4
> 5  6]
> 
> the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W is 
> given by
> 
> W = [1  8
> 3  9
> 5  7]
> 
> I need a fast way of doing this for large matrices. I wrote three methods 
> that should (In my naive mind) give the same performance results, but 
> @time reports otherwise.  The method definitions and the performance 
> results are given below. Can someone teach me why the results are so 
> different? The method fill_W! is too wordy for my tastes, but the more 
> compact notation in fill_W1! and fill_W2! achieve poorer results. Any why 
> do these latter two methods allocate so much memory when the whole point 
> of these methods is to use already-allocated memory.
> 
> Michael
> 
> ### Definitions
> 
> 
> function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
> ishift::Int)
> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
> W[1:(end-ishift),icol] = w[(ishift+1):end]
> W[(end-(ishift-1)):end,icol] = w[1:ishift]
> return
> end
> 
> 
> function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
> ishift::Int)
> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
> [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)]
> [W[end-ishift+i,icol] = w[i] for i in 1:ishift]
> 

Re: [julia-users] Re: Strange performance issue in filling in a matrix column

2016-07-22 Thread Stefan Karpinski
Can you file a performance issue? The built-in circshift should not have
these performance issues.

On Fri, Jul 22, 2016 at 4:18 PM, Michael Prange  wrote:

> I just discovered that Julia already has a function for circularly
> shifting the data in an array: circshift(A, shifts). However, its
> performance is worst of all. Using this new method,
>
> function fill_W4!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::Int
> )
> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
> W[:,icol] = circshift(w,[ishift,])
> return
> end
>
>
> the resulting timings are given by (with new random numbers)
>
> fill_W!: 0.002918 seconds (4 allocations: 160 bytes)
> fill_W1!: 0.006440 seconds (10 allocations: 7.630 MB)
> fill_W2!: 0.009244 seconds (8 allocations: 7.630 MB, 21.61% gc time)
> fill_W3!: 0.002014 seconds (8 allocations: 352 bytes)
> fill_W4!: 0.049601 seconds (19 allocations: 30.518 MB, 3.63% gc time)
>
>
> I would have expected the built-in method circshift to achieve the best
> results, but it is worst in all categories: time, allocations and memory.
>
> Michael
>
> On Friday, July 22, 2016 at 2:23:16 PM UTC-4, Michael Prange wrote:
>>
>> Gunnar,
>>
>> Thank you for your explanation of the extra allocations and the tip about
>> sub. I implemented a version with sub as fill_W3!:
>>
>> function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF},
>> ishift::Int)
>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>> W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift))
>> W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w))
>> return
>> end
>>
>> Is this what you had in mind? I reran the tests above (with new random
>> numbers) and had the following results:
>>
>> fill_W!: 0.003234 seconds (4 allocations: 160 bytes)
>> fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB)
>> fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB)
>> fill_W3!: 0.002347 seconds (8 allocations: 352 bytes)
>>
>> Using sub consistently achieves better times that fill_W!, even through it 
>> uses twice the number of allocations than fill_W!. This seems to be the way 
>> to go.
>>
>>
>> Michael
>>
>>
>> On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote:
>>>
>>> fill_W1! allocates memory because it makes copies when constructing the
>>> right hand sides. fill_W2 allocates memory in order to construct the
>>> comprehensions (that you then discard). In both cases memory allocation
>>> could plausibly be avoided by a sufficiently smart compiler, but until
>>> Julia becomes that smart, have a look at the sub function to provide views
>>> instead of copies for the right hand sides of fill_W1!.
>>>
>>> On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:

 I'm a new user, so have mercy in your responses.

 I've written a method that takes a matrix and vector as input and then
 fills in column icol of that matrix with the vector of given values that
 have been shifted upward by ishift indices with periodic boundary
 conditions. To make this clear, given the matrix

 W = [1  2
 3  4
 5  6]

 the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W
 is given by

 W = [1  8
 3  9
 5  7]

 I need a fast way of doing this for large matrices. I wrote three
 methods that should (In my naive mind) give the same performance results,
 but @time reports otherwise.  The method definitions and the performance
 results are given below. Can someone teach me why the results are so
 different? The method fill_W! is too wordy for my tastes, but the more
 compact notation in fill_W1! and fill_W2! achieve poorer results. Any why
 do these latter two methods allocate so much memory when the whole point of
 these methods is to use already-allocated memory.

 Michael

 ### Definitions


 function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF},
 ishift::Int)
 @assert(size(W,1) == length(w), "Dimension mismatch between W and
 w")
 W[1:(end-ishift),icol] = w[(ishift+1):end]
 W[(end-(ishift-1)):end,icol] = w[1:ishift]
 return
 end


 function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF},
 ishift::Int)
 @assert(size(W,1) == length(w), "Dimension mismatch between W and
 w")
 [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)]
 [W[end-ishift+i,icol] = w[i] for i in 1:ishift]
 return
 end


 function fill_W!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF},
 ishift::Int)
 @assert(size(W,1) == length(w), "Dimension mismatch between W and
 w")
 n = length(w)
 for j in 1:(n-ishift)
 W[j,icol] = w[j+ishift]
 end
 for j in (n-(ishift-1)):n
 W[j,icol] = 

[julia-users] Re: Strange performance issue in filling in a matrix column

2016-07-22 Thread Michael Prange
I just discovered that Julia already has a function for circularly shifting 
the data in an array: circshift(A, shifts). However, its performance is 
worst of all. Using this new method,

function fill_W4!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::Int)
@assert(size(W,1) == length(w), "Dimension mismatch between W and w")
W[:,icol] = circshift(w,[ishift,])
return
end


the resulting timings are given by (with new random numbers)

fill_W!: 0.002918 seconds (4 allocations: 160 bytes) 
fill_W1!: 0.006440 seconds (10 allocations: 7.630 MB) 
fill_W2!: 0.009244 seconds (8 allocations: 7.630 MB, 21.61% gc time) 
fill_W3!: 0.002014 seconds (8 allocations: 352 bytes) 
fill_W4!: 0.049601 seconds (19 allocations: 30.518 MB, 3.63% gc time)


I would have expected the built-in method circshift to achieve the best 
results, but it is worst in all categories: time, allocations and memory.

Michael

On Friday, July 22, 2016 at 2:23:16 PM UTC-4, Michael Prange wrote:
>
> Gunnar,
>
> Thank you for your explanation of the extra allocations and the tip about 
> sub. I implemented a version with sub as fill_W3!:
>
> function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
> ishift::Int)
> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
> W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift))
> W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w))
> return
> end
>
> Is this what you had in mind? I reran the tests above (with new random 
> numbers) and had the following results:
>
> fill_W!: 0.003234 seconds (4 allocations: 160 bytes) 
> fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB) 
> fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB) 
> fill_W3!: 0.002347 seconds (8 allocations: 352 bytes)
>
> Using sub consistently achieves better times that fill_W!, even through it 
> uses twice the number of allocations than fill_W!. This seems to be the way 
> to go.
>
>
> Michael
>
>
> On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote:
>>
>> fill_W1! allocates memory because it makes copies when constructing the 
>> right hand sides. fill_W2 allocates memory in order to construct the 
>> comprehensions (that you then discard). In both cases memory allocation 
>> could plausibly be avoided by a sufficiently smart compiler, but until 
>> Julia becomes that smart, have a look at the sub function to provide views 
>> instead of copies for the right hand sides of fill_W1!.
>>
>> On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:
>>>
>>> I'm a new user, so have mercy in your responses. 
>>>
>>> I've written a method that takes a matrix and vector as input and then 
>>> fills in column icol of that matrix with the vector of given values that 
>>> have been shifted upward by ishift indices with periodic boundary 
>>> conditions. To make this clear, given the matrix
>>>
>>> W = [1  2
>>> 3  4
>>> 5  6]
>>>
>>> the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W is 
>>> given by
>>>
>>> W = [1  8
>>> 3  9
>>> 5  7]
>>>
>>> I need a fast way of doing this for large matrices. I wrote three 
>>> methods that should (In my naive mind) give the same performance results, 
>>> but @time reports otherwise.  The method definitions and the performance 
>>> results are given below. Can someone teach me why the results are so 
>>> different? The method fill_W! is too wordy for my tastes, but the more 
>>> compact notation in fill_W1! and fill_W2! achieve poorer results. Any why 
>>> do these latter two methods allocate so much memory when the whole point of 
>>> these methods is to use already-allocated memory.
>>>
>>> Michael
>>>
>>> ### Definitions
>>>
>>>
>>> function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>> ishift::Int)
>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w"
>>> )
>>> W[1:(end-ishift),icol] = w[(ishift+1):end]
>>> W[(end-(ishift-1)):end,icol] = w[1:ishift]
>>> return
>>> end
>>>
>>>
>>> function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>> ishift::Int)
>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w"
>>> )
>>> [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)]
>>> [W[end-ishift+i,icol] = w[i] for i in 1:ishift]
>>> return
>>> end
>>>
>>>
>>> function fill_W!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>> ishift::Int)
>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w"
>>> )
>>> n = length(w)
>>> for j in 1:(n-ishift)
>>> W[j,icol] = w[j+ishift]
>>> end
>>> for j in (n-(ishift-1)):n
>>> W[j,icol] = w[j-(n-ishift)]
>>> end
>>> end
>>>
>>>
>>> # Performance Results
>>> julia>
>>> W = rand(100,2)
>>> w = rand(100)
>>> println("fill_W!:")
>>> println(@time fill_W!(W, 2, w, 2))
>>> println("fill_W1!:")
>>> println(@time fill_W1!(W, 2, w, 2))
>>> println("fill_W2!:")
>>> println(@time 

[julia-users] Re: Strange performance issue in filling in a matrix column

2016-07-22 Thread Michael Prange
Gunnar,

Thank you for your explanation of the extra allocations and the tip about 
sub. I implemented a version with sub as fill_W3!:

function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
ishift::Int)
@assert(size(W,1) == length(w), "Dimension mismatch between W and w")
W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift))
W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w))
return
end

Is this what you had in mind? I reran the tests above (with new random 
numbers) and had the following results:

fill_W!: 0.003234 seconds (4 allocations: 160 bytes) 
fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB) 
fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB) 
fill_W3!: 0.002347 seconds (8 allocations: 352 bytes)

Using sub consistently achieves better times that fill_W!, even through it uses 
twice the number of allocations than fill_W!. This seems to be the way to go.


Michael


On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote:
>
> fill_W1! allocates memory because it makes copies when constructing the 
> right hand sides. fill_W2 allocates memory in order to construct the 
> comprehensions (that you then discard). In both cases memory allocation 
> could plausibly be avoided by a sufficiently smart compiler, but until 
> Julia becomes that smart, have a look at the sub function to provide views 
> instead of copies for the right hand sides of fill_W1!.
>
> On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:
>>
>> I'm a new user, so have mercy in your responses. 
>>
>> I've written a method that takes a matrix and vector as input and then 
>> fills in column icol of that matrix with the vector of given values that 
>> have been shifted upward by ishift indices with periodic boundary 
>> conditions. To make this clear, given the matrix
>>
>> W = [1  2
>> 3  4
>> 5  6]
>>
>> the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W is 
>> given by
>>
>> W = [1  8
>> 3  9
>> 5  7]
>>
>> I need a fast way of doing this for large matrices. I wrote three methods 
>> that should (In my naive mind) give the same performance results, but @time 
>> reports otherwise.  The method definitions and the performance results are 
>> given below. Can someone teach me why the results are so different? The 
>> method fill_W! is too wordy for my tastes, but the more compact notation in 
>> fill_W1! and fill_W2! achieve poorer results. Any why do these latter two 
>> methods allocate so much memory when the whole point of these methods is to 
>> use already-allocated memory.
>>
>> Michael
>>
>> ### Definitions
>>
>>
>> function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>> ishift::Int)
>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>> W[1:(end-ishift),icol] = w[(ishift+1):end]
>> W[(end-(ishift-1)):end,icol] = w[1:ishift]
>> return
>> end
>>
>>
>> function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>> ishift::Int)
>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>> [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)]
>> [W[end-ishift+i,icol] = w[i] for i in 1:ishift]
>> return
>> end
>>
>>
>> function fill_W!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>> ishift::Int)
>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>> n = length(w)
>> for j in 1:(n-ishift)
>> W[j,icol] = w[j+ishift]
>> end
>> for j in (n-(ishift-1)):n
>> W[j,icol] = w[j-(n-ishift)]
>> end
>> end
>>
>>
>> # Performance Results
>> julia>
>> W = rand(100,2)
>> w = rand(100)
>> println("fill_W!:")
>> println(@time fill_W!(W, 2, w, 2))
>> println("fill_W1!:")
>> println(@time fill_W1!(W, 2, w, 2))
>> println("fill_W2!:")
>> println(@time fill_W2!(W, 2, w, 2))
>>
>>
>> Out>
>> fill_W!:
>>  0.002801 seconds (4 allocations: 160 bytes)
>> nothing
>> fill_W1!:
>>  0.007427 seconds (9 allocations: 7.630 MB)
>> [0.152463397611579,0.6314166578356002]
>> fill_W2!:
>>  0.005587 seconds (7 allocations: 7.630 MB)
>> [0.152463397611579,0.6314166578356002]
>>
>>
>>

[julia-users] Re: Strange performance issue in filling in a matrix column

2016-07-21 Thread Gunnar Farnebäck
fill_W1! allocates memory because it makes copies when constructing the 
right hand sides. fill_W2 allocates memory in order to construct the 
comprehensions (that you then discard). In both cases memory allocation 
could plausibly be avoided by a sufficiently smart compiler, but until 
Julia becomes that smart, have a look at the sub function to provide views 
instead of copies for the right hand sides of fill_W1!.

On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:
>
> I'm a new user, so have mercy in your responses. 
>
> I've written a method that takes a matrix and vector as input and then 
> fills in column icol of that matrix with the vector of given values that 
> have been shifted upward by ishift indices with periodic boundary 
> conditions. To make this clear, given the matrix
>
> W = [1  2
> 3  4
> 5  6]
>
> the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W is 
> given by
>
> W = [1  8
> 3  9
> 5  7]
>
> I need a fast way of doing this for large matrices. I wrote three methods 
> that should (In my naive mind) give the same performance results, but @time 
> reports otherwise.  The method definitions and the performance results are 
> given below. Can someone teach me why the results are so different? The 
> method fill_W! is too wordy for my tastes, but the more compact notation in 
> fill_W1! and fill_W2! achieve poorer results. Any why do these latter two 
> methods allocate so much memory when the whole point of these methods is to 
> use already-allocated memory.
>
> Michael
>
> ### Definitions
>
>
> function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
> ishift::Int)
> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
> W[1:(end-ishift),icol] = w[(ishift+1):end]
> W[(end-(ishift-1)):end,icol] = w[1:ishift]
> return
> end
>
>
> function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
> ishift::Int)
> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
> [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)]
> [W[end-ishift+i,icol] = w[i] for i in 1:ishift]
> return
> end
>
>
> function fill_W!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
> ishift::Int)
> @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
> n = length(w)
> for j in 1:(n-ishift)
> W[j,icol] = w[j+ishift]
> end
> for j in (n-(ishift-1)):n
> W[j,icol] = w[j-(n-ishift)]
> end
> end
>
>
> # Performance Results
> julia>
> W = rand(100,2)
> w = rand(100)
> println("fill_W!:")
> println(@time fill_W!(W, 2, w, 2))
> println("fill_W1!:")
> println(@time fill_W1!(W, 2, w, 2))
> println("fill_W2!:")
> println(@time fill_W2!(W, 2, w, 2))
>
>
> Out>
> fill_W!:
>  0.002801 seconds (4 allocations: 160 bytes)
> nothing
> fill_W1!:
>  0.007427 seconds (9 allocations: 7.630 MB)
> [0.152463397611579,0.6314166578356002]
> fill_W2!:
>  0.005587 seconds (7 allocations: 7.630 MB)
> [0.152463397611579,0.6314166578356002]
>
>
>