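The code in Distances.jl is heavily optimized and uses BLAS calls where possible, which is the case for the Euclidean metric. Your code allocates a lot: `x = x'` makes a transposed copy, and every `norm(x[:,i] - x[:,j])` copies two column slices and then allocates a third vector for their difference.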
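To illustrate the allocation point, here is a rough sketch of the same upper-triangle computation written with explicit loops, so that no temporary vectors are created inside the inner loop. The name `dist_noalloc` is made up for this example, the syntax assumes a current Julia 1.x release rather than the version in use at the time of the post, and this is not how Distances.jl implements `pairwise` (it relies on BLAS instead).

```julia
# Sketch: Euclidean distance between every pair of rows of x,
# keeping only the upper triangle, without per-pair temporaries.
function dist_noalloc(x::AbstractMatrix{T}) where {T<:Real}
    xt = permutedims(x)            # one copy, so each observation is a contiguous column
    p, n = size(xt)
    F = float(T)                   # result element type, e.g. Float64 for Int input
    d = zeros(F, div(n * (n - 1), 2))
    ij = 0
    @inbounds for i in 1:n-1
        for j in i+1:n
            s = zero(F)
            for k in 1:p
                s += abs2(xt[k, i] - xt[k, j])   # scalar arithmetic only
            end
            ij += 1
            d[ij] = sqrt(s)
        end
    end
    return d
end

x = rand(100, 2)
d = dist_noalloc(x)
length(d)   # 4950, i.e. 100 * 99 / 2
```

Running it under `@time` (after a first warm-up call) should show only a handful of allocations, for `xt` and `d`, rather than several per pair.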
On Wednesday, September 7, 2016 at 1:43:11 PM UTC+2, Weicheng Zhu wrote:
>
> Hi there,
> I write a function to calculate the distance for each row of a two
> dimensional array and I compared it with the `pairwise` function in the
> Distance module.
> Does anyone can help me to find out the reason why my function is slower
> than the pairwise function? I only keep the triangle elements of the
> distance matrix which I thought should be faster. Thanks in advance for any
> help:)
>
> Here is the code:
>
> Module Tmp
>
> import DataFrames: DataFrame
>
> function dist(x::Matrix)
>     x = x'
>     n = size(x, 2)
>     ij::UInt = 0
>     d = zeros(convert(Int, (n-1)*n/2))
>     for i in 1:n
>         for j in (i+1):n
>             ij += 1
>             d[ij] = norm(x[:,i] - x[:,j])
>         end
>     end
>     return d
> end
>
> function dist(x::DataFrame)
>     dist(convert(Array, dat))
> end
>
> export dist
>
> end
>
> using Tmp
> using Distances
> x = rand(100,2)
> @time dist(x)
> # 0.001581 seconds (29.71 k allocations: 1.399 MB)
> @time pairwise(Euclidean(), x')
> # 0.000318 seconds (310 allocations: 91.984 KB)
