Hi Yichao, what a nice idea :)

But even if I write in the C++ way,  @time sqrt(1) yields 5 allocations of 176 
bytes, and in inner loops this could be a bottleneck.

Is this an inevitable overhead of using ccall, or is it just a bogus that I can 
ignore?

Jong Wook

> On Sep 2, 2016, at 7:14 AM, Yichao Yu <[email protected]> wrote:
> 
> 
> 
> On Fri, Sep 2, 2016 at 7:03 AM, Jong Wook Kim <[email protected] 
> <mailto:[email protected]>> wrote:
> Hi,
> 
> I'm using Julia 0.4.6 on OSX El Capitan, and was trying to normalize each 
> column of matrix, so that the norm of each column becomes 1. Below is a 
> isolated and simplified version of what I'm doing:
> 
> function foo1()
>     local a = rand(1000, 10000)
>     @time for i in 1:size(a, 2)
>         a[:, i] /= norm(a[:, i])
>     end
> end
> 
> foo1()
> 0.165662 seconds (117.44 k allocations: 232.505 MB, 37.08% gc time)
> 
> I thought maybe the array copying is the problem, but this didn't help much:
> 
> function foo2()
>     local a = rand(1000, 10000)
>     @time for i in 1:size(a, 2)
>         a[:, i] /= norm(slice(a, :, i))
>     end
> end
> 
> foo2()
> 0.131377 seconds (98.47 k allocations: 155.921 MB, 36.66% gc time)
> 
> and then I figured that this ugly one runs the fastest:
> 
> function foo3()
>     local a = rand(1000, 10000)
>     @time for i in 1:size(a, 2)
>         setindex!(a, norm(slice(a, :, i)), :, i)
>     end
> end
> 
> foo3()
> 0.013814 seconds (49.49 k allocations: 1.365 MB, 4.86% gc time)
> 
> So I overheard a few times that plain for-loops are faster than vectorized 
> code in Julia, and it seems it's allocating slightly less memory, but it's 
> slower than the above.
> 
> function foo4()
>     local a = rand(1000, 10000)
>     @time @inbounds for i in 1:size(a, 2)
>         n = norm(slice(a, :, i))
>         @inbounds for j in 1:size(a, 1)
>             a[j, i] /= n
>         end
>     end
> end
> 
> foo4()
> 0.055522 seconds (30.00 k allocations: 1.068 MB, 15.14% gc time)
> 
> Is there a solution that is faster and less uglier than foo3() and foo4()?
> 
> Thinking of an equivalent implementation in C/C++, I should be able to write 
> this logic without any heap allocation. Is it possible in Julia?
> 
> You can write it in the way you'd write it in c++ and just don't use `norm`.
>  
> 
> Thanks,
> Jong Wook
> 

Reply via email to