I'm new to Julia and do not know how to file a performance issue, but I am 
happy to do it. Can you point me to the right place?

Sent from my phone

> On Jul 22, 2016, at 18:09, Stefan Karpinski <[email protected]> wrote:
> 
> Can you file a performance issue? The built-in circshift should not have 
> these performance issues.
> 
>> On Fri, Jul 22, 2016 at 4:18 PM, Michael Prange <[email protected]> wrote:
>> I just discovered that Julia already has a function for circularly shifting 
>> the data in an array: circshift(A, shifts). However, its performance is 
>> worst of all. Using this new method,
>> 
>> function fill_W4!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift::Int)
>>     @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>>     W[:,icol] = circshift(w,[ishift,])
>>     return
>> end
>> 
>> 
>> the resulting timings are given by (with new random numbers)
>> 
>> fill_W!: 0.002918 seconds (4 allocations: 160 bytes) 
>> fill_W1!: 0.006440 seconds (10 allocations: 7.630 MB) 
>> fill_W2!: 0.009244 seconds (8 allocations: 7.630 MB, 21.61% gc time) 
>> fill_W3!: 0.002014 seconds (8 allocations: 352 bytes) 
>> fill_W4!: 0.049601 seconds (19 allocations: 30.518 MB, 3.63% gc time)
>> 
>> I would have expected the built-in method circshift to achieve the best 
>> results, but it is worst in all categories: time, allocations and memory.
>> 
>> Michael
>> 
>>> On Friday, July 22, 2016 at 2:23:16 PM UTC-4, Michael Prange wrote:
>>> Gunnar,
>>> 
>>> Thank you for your explanation of the extra allocations and the tip about 
>>> sub. I implemented a version with sub as fill_W3!:
>>> 
>>> function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>>     ishift::Int)
>>>     @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>>>     W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift))
>>>     W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w))
>>>     return
>>> end
>>> 
>>> Is this what you had in mind? I reran the tests above (with new random 
>>> numbers) and had the following results:
>>> 
>>> fill_W!: 0.003234 seconds (4 allocations: 160 bytes) 
>>> fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB) 
>>> fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB) 
>>> fill_W3!: 0.002347 seconds (8 allocations: 352 bytes)
>>> 
>>> Using sub consistently achieves better times that fill_W!, even through it 
>>> uses twice the number of allocations than fill_W!. This seems to be the way 
>>> to go.
>>> 
>>> Michael
>>> 
>>>> On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote:
>>>> fill_W1! allocates memory because it makes copies when constructing the 
>>>> right hand sides. fill_W2 allocates memory in order to construct the 
>>>> comprehensions (that you then discard). In both cases memory allocation 
>>>> could plausibly be avoided by a sufficiently smart compiler, but until 
>>>> Julia becomes that smart, have a look at the sub function to provide views 
>>>> instead of copies for the right hand sides of fill_W1!.
>>>> 
>>>>> On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote:
>>>>> I'm a new user, so have mercy in your responses. 
>>>>> 
>>>>> I've written a method that takes a matrix and vector as input and then 
>>>>> fills in column icol of that matrix with the vector of given values that 
>>>>> have been shifted upward by ishift indices with periodic boundary 
>>>>> conditions. To make this clear, given the matrix
>>>>> 
>>>>> W = [1  2
>>>>>         3  4
>>>>>         5  6]
>>>>> 
>>>>> the vector w = [7  8  9], icol = 2 and ishift = 1, the new value of W is 
>>>>> given by
>>>>> 
>>>>> W = [1  8
>>>>>         3  9
>>>>>         5  7]
>>>>> 
>>>>> I need a fast way of doing this for large matrices. I wrote three methods 
>>>>> that should (In my naive mind) give the same performance results, but 
>>>>> @time reports otherwise.  The method definitions and the performance 
>>>>> results are given below. Can someone teach me why the results are so 
>>>>> different? The method fill_W! is too wordy for my tastes, but the more 
>>>>> compact notation in fill_W1! and fill_W2! achieve poorer results. Any why 
>>>>> do these latter two methods allocate so much memory when the whole point 
>>>>> of these methods is to use already-allocated memory.
>>>>> 
>>>>> Michael
>>>>> 
>>>>> ### Definitions
>>>>> 
>>>>> 
>>>>> function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>>>>     ishift::Int)
>>>>>     @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>>>>>     W[1:(end-ishift),icol] = w[(ishift+1):end]
>>>>>     W[(end-(ishift-1)):end,icol] = w[1:ishift]
>>>>>     return
>>>>> end
>>>>> 
>>>>> 
>>>>> function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>>>>     ishift::Int)
>>>>>     @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>>>>>     [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)]
>>>>>     [W[end-ishift+i,icol] = w[i] for i in 1:ishift]
>>>>>     return
>>>>> end
>>>>> 
>>>>> 
>>>>> function fill_W!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, 
>>>>>     ishift::Int)
>>>>>     @assert(size(W,1) == length(w), "Dimension mismatch between W and w")
>>>>>     n = length(w)
>>>>>     for j in 1:(n-ishift)
>>>>>         W[j,icol] = w[j+ishift]
>>>>>     end
>>>>>     for j in (n-(ishift-1)):n
>>>>>         W[j,icol] = w[j-(n-ishift)]
>>>>>     end
>>>>> end
>>>>> 
>>>>> 
>>>>> # Performance Results
>>>>> julia>
>>>>> W = rand(1000000,2)
>>>>> w = rand(1000000)
>>>>> println("fill_W!:")
>>>>> println(@time fill_W!(W, 2, w, 2))
>>>>> println("fill_W1!:")
>>>>> println(@time fill_W1!(W, 2, w, 2))
>>>>> println("fill_W2!:")
>>>>> println(@time fill_W2!(W, 2, w, 2))
>>>>> 
>>>>> 
>>>>> Out>
>>>>> fill_W!:
>>>>>  0.002801 seconds (4 allocations: 160 bytes)
>>>>> nothing
>>>>> fill_W1!:
>>>>>  0.007427 seconds (9 allocations: 7.630 MB)
>>>>> [0.152463397611579,0.6314166578356002]
>>>>> fill_W2!:
>>>>>  0.005587 seconds (7 allocations: 7.630 MB)
>>>>> [0.152463397611579,0.6314166578356002]
> 

Reply via email to