> This works for me:
>
> julia> function mindists_sq(pos, dists_min, Acp)
>            for i in 1:size(pos, 2)
>                dists_min[i] = Inf
>                for j in 1:size(Acp, 2)
>                    t = 0.0
>                    for k=1:size(pos,1)
>                        t += (pos[k, i]-Acp[k, j])^2
>                    end
>                    if t < dists_min[i]
>                        dists_min[i] = t
>                    end
>                end
>            end
>            return dists_min
>        end
> mindists_sq (generic function with 1 method)
>
> julia> function test()
>            const pos = rand(3, 64)
>            const Acp = rand(3, 1200)
>            const dists_min = zeros(64)
>            const tmp = zeros(typeof(Acp[1]), 1)
>            @time mindists_sq(pos, dists_min, Acp)
>        end
> test (generic function with 1 method)
>
> julia> test();
> elapsed time: 0.001279041 seconds (0 bytes allocated)
>
> Is this how you unrolled the innermost loop too?
No, I went even further:

    t += pos[k, i]
    t -= Acp[k, j]
    t ^= 2

But it does not change anything (no allocations :D). Thank you.

I guess t in this case is allocated on the stack. I'm surprised that allocating t on the heap incurs such a large performance penalty (a 10-fold slowdown). I'm also surprised that t[:] += pos[k, i] allocates memory (and incurs an even larger performance penalty). I really thought this was the recommended way to perform in-place operations on arrays.

So, is there any tool like the Julia profiler, but for memory allocation?
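
For reference, here is a minimal sketch of how the allocation difference could be measured with `@allocated`. The names `add_slice!`, `add_loop!` and `measure` are hypothetical and not part of the code above. The right-hand side here is a vector rather than a scalar, only so that the snippet also runs on recent Julia versions. As far as I understand, `t[:] += v` is rewritten to `t[:] = t[:] + v`, so the right-hand side builds temporary arrays before anything is written back into `t`.

```julia
# Hypothetical helpers for illustration; not from the code quoted above.

# `t[:] += v` is shorthand for `t[:] = t[:] + v`: the slice `t[:]` on the
# right-hand side is a copy, and the sum is another temporary array.
function add_slice!(t, v)
    t[:] += v
    return t
end

# An explicit loop updates each element of the existing array directly.
function add_loop!(t, v)
    for i in 1:length(t)
        t[i] += v[i]
    end
    return t
end

function measure()
    t = zeros(1000)
    v = ones(1000)
    # Call both once first so that compilation is not counted.
    add_slice!(t, v)
    add_loop!(t, v)
    bytes_slice = @allocated add_slice!(t, v)
    bytes_loop = @allocated add_loop!(t, v)
    return bytes_slice, bytes_loop
end

bytes_slice, bytes_loop = measure()
println("t[:] += v allocated ", bytes_slice, " bytes")
println("loop update allocated ", bytes_loop, " bytes")
```

`@allocated` reports the bytes allocated while evaluating one expression, which should be enough to compare the two forms side by side.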
