I think the distance matrix version below is about as good
as it gets with these basic strategies.
fwiw,
Alan Isaac
def dist(A,B):
    rowsA, rowsB = A.shape[0], B.shape[0]
    distanceAB = empty( [rowsA,rowsB] , dtype=float)
    if rowsA <= rowsB:
        temp = empty_like(B)
        for i in
I just ran Alan's script and I don't get consistent results for 100
repetitions. I boosted it to 1000 and ran it several times. Which
version came out faster varied a lot, but the two stayed within about
±1.5% of each other.
When it comes to scaling, for my problem (fuzzy clustering), N is the
size of the dataset, which
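
Run-to-run variation like the one described above is common when timing short functions. A minimal sketch of one way to reduce it, assuming the standard-library timeit module and a stand-in function dist_loop (the names dist_loop and best_time, the array sizes and the repeat counts are all hypothetical, not taken from Alan's script): repeat the measurement several times and keep the minimum.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 2))      # assumed sizes, for illustration only
B = rng.random((1000, 2))


def dist_loop(A, B):
    """Stand-in distance function: loop over the rows of A."""
    d = np.empty((A.shape[0], B.shape[0]), dtype=float)
    for i in range(A.shape[0]):
        d[i] = np.sqrt(((A[i] - B) ** 2).sum(axis=1))
    return d


def best_time(func, repeat=5, number=200):
    """Return the best per-call time in seconds over `repeat` runs."""
    timings = timeit.repeat(lambda: func(A, B), repeat=repeat, number=number)
    # The minimum is usually the least disturbed by other system activity.
    return min(timings) / number


t = best_time(dist_loop)
print(f"best per-call time: {t * 1e6:.1f} microseconds")
```

Differences of a percent or two between two variants are typically within measurement noise even with this approach.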
I checked the Matlab version's code and it does the same as discussed
here. The only thing to check is to make sure you loop around the
shorter dimension of the output array. Speedwise the Matlab code still
runs about twice as fast for large sets of data (by just taking time
by hand and
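
The advice to loop over the shorter dimension of the output array can be sketched as a self-contained function. This is an illustration only, assuming A and B are 2-D arrays with the same number of columns; it takes the arrays as arguments instead of using globals, and it is not the Matlab code or any poster's exact script.

```python
import numpy as np


def dist(A, B):
    """Euclidean distances between the rows of A and the rows of B.

    The Python-level loop runs over whichever array has fewer rows,
    so the vectorized inner step does as much of the work as possible.
    """
    rows_a, rows_b = A.shape[0], B.shape[0]
    d = np.empty((rows_a, rows_b), dtype=float)
    if rows_a <= rows_b:
        for i in range(rows_a):
            # One row of A against every row of B.
            d[i, :] = np.sqrt(((A[i] - B) ** 2).sum(axis=1))
    else:
        for j in range(rows_b):
            # Every row of A against one row of B.
            d[:, j] = np.sqrt(((A - B[j]) ** 2).sum(axis=1))
    return d


A = np.array([[0.0, 0.0], [3.0, 4.0]])
B = np.array([[0.0, 0.0]])
print(dist(A, B))
```

Both branches fill the same output array, so the choice only affects how many Python-level iterations are executed.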
On Sun, 18 Jun 2006, Sebastian Beca apparently wrote:
def dist():
    d = zeros([N, C], dtype=float)
    if N < C:
        for i in range(N):
            xy = A[i] - B
            d[i,:] = sqrt(sum(xy**2, axis=1))
        return d
    else:
        for j in range(C):
            xy = A - B[j]
            d[:,j] = sqrt(sum(xy**2, axis=1))
    return d
But that is 50%
Alan G Isaac wrote:
On Sun, 18 Jun 2006, Sebastian Beca apparently wrote:
def dist():
    d = zeros([N, C], dtype=float)
    if N < C:
        for i in range(N):
            xy = A[i] - B
            d[i,:] = sqrt(sum(xy**2, axis=1))
        return d
    else:
        for j in range(C):
            xy = A - B[j]
            d[:,j] = sqrt(sum(xy**2, axis=1))
    return d
On Sun, 18 Jun 2006, Tim Hochberg apparently wrote:
Alan G Isaac wrote:
On Sun, 18 Jun 2006, Sebastian Beca apparently wrote:
def dist():
    d = zeros([N, C], dtype=float)
    if N < C:
        for i in range(N):
            xy = A[i] - B
            d[i,:] = sqrt(sum(xy**2, axis=1))
        return d
    else:
        for j in
Hi,
def d4():
    d = zeros([4, 1000], dtype=float)
    for i in range(4):
        xy = A[i] - B
        d[i] = sqrt( sum(xy**2, axis=1) )
    return d
Maybe there's another alternative to d4?
Thanks again,
I think this is the fastest you can get. Maybe it would be nicer to use
the
How about this?
def d5():
    return add.outer(sum(A*A, axis=1), sum(B*B, axis=1)) - \
        2.*dot(A, transpose(B))
___
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
Alex Cannon wrote:
How about this?
def d5():
    return add.outer(sum(A*A, axis=1), sum(B*B, axis=1)) - \
        2.*dot(A, transpose(B))
You might lose some precision with that approach, so the OP should compare
results and timings to look at the tradeoffs.
--
Robert
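
To make the precision tradeoff concrete: d5 as posted relies on the expansion |a - b|^2 = |a|^2 + |b|^2 - 2 a.b and returns squared distances (there is no sqrt). When two points are close together but far from the origin, the expansion subtracts large, nearly equal numbers, so rounding error can dominate and can even push the result slightly below zero. The sketch below compares it with direct differencing. The function names, the test data and the clipping step are assumptions added for illustration, not part of the thread.

```python
import numpy as np


def sqdist_expanded(A, B):
    """Squared distances via |a|^2 + |b|^2 - 2 a.b (the d5 approach)."""
    return (np.add.outer((A * A).sum(axis=1), (B * B).sum(axis=1))
            - 2.0 * A @ B.T)


def sqdist_direct(A, B):
    """Squared distances via explicit differences (broadcasting)."""
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return (diff ** 2).sum(axis=2)


rng = np.random.default_rng(0)
# Assumed worst-case-style data: points far from the origin, close together.
A = 1e4 + rng.random((4, 3))
B = A[:2] + 1e-6 * rng.random((2, 3))

expanded = sqdist_expanded(A, B)
direct = sqdist_direct(A, B)
print("largest absolute difference:", np.abs(expanded - direct).max())
print("smallest expanded value:", expanded.min())

# Clip before taking the square root so rounding error cannot yield NaN.
distances = np.sqrt(np.maximum(expanded, 0.0))
```

For data that is well scaled, for example values in [0, 1), the two approaches usually agree closely, which is why comparing both results and timings on the real dataset is the sensible check.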
Hi,
I'm working with NumPy/SciPy on some algorithms and I've run into some
important speed differences wrt Matlab 7. I've narrowed the main speed
problem down to the operation of finding the Euclidean distance
between two matrices that share one dimension (dist in Matlab):
Python:
def
Hi Sebastian,
I am not sure if there is a function already defined in numpy, but
something like this may be what you are after
def distance(a1, a2):
    return sqrt(sum((a1[:,newaxis,:] - a2[newaxis,:,:])**2, axis=2))
The general idea is to avoid loops if you want the code to execute
fast. I
Hi,
def dtest():
    A = random( [4,2])
    B = random( [1000,2])
    # drawback: memory usage temporarily doubled
    # solution see below
    d = A[:, newaxis, :] - B[newaxis, :, :]
    # written as 3 expressions for more clarity
    d = sqrt((d**2).sum(axis=2))
    return d
def
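
The poster's own remedy for the memory drawback is cut off in this archive. As a separate illustration, one possible way to bound the size of the temporary array is to broadcast over blocks of B. The function name dist_chunked and the chunk parameter are hypothetical, and this is a sketch under the assumption that A has few rows and B has many.

```python
import numpy as np


def dist_chunked(A, B, chunk=1024):
    """Broadcasted Euclidean distances, computed in blocks of rows of B.

    The temporary difference array never holds more than
    A.shape[0] * chunk * A.shape[1] elements at once.
    """
    d = np.empty((A.shape[0], B.shape[0]), dtype=float)
    for start in range(0, B.shape[0], chunk):
        stop = start + chunk  # slicing clamps this to the array length
        diff = A[:, np.newaxis, :] - B[np.newaxis, start:stop, :]
        d[:, start:stop] = np.sqrt((diff ** 2).sum(axis=2))
    return d


rng = np.random.default_rng(0)
A = rng.random((4, 2))
B = rng.random((2500, 2))
d = dist_chunked(A, B, chunk=1000)
print(d.shape)
```

A larger chunk means fewer Python-level iterations but a bigger temporary, so the best value depends on the dataset and on available memory.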
Sebastian Beca wrote:
Hi,
I'm working with NumPy/SciPy on some algorithms and I've run into some
important speed differences wrt Matlab 7. I've narrowed the main speed
problem down to the operation of finding the Euclidean distance
between two matrices that share one dimension (dist in
Christopher Barker wrote:
Bruce Southey wrote:
Please run the exact same code in Matlab that you are running in
NumPy. Many of Matlab's functions are highly optimized and are
provided as binary functions. I think that you are running into this
so you are not doing the correct
Thanks! Avoiding the inner loop is MUCH faster (~20-300 times faster than
the original). Nevertheless, I don't think I can use hypot as it only works
for two dimensions. The general problem I have is:
A = random( [C, K] )
B = random( [N, K] )
C ~ 1-10
N ~ large (thousands to millions, i.e. my dataset)
K
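
The limitation mentioned above can be shown directly: numpy.hypot combines exactly two component arrays, so it covers K = 2 only, while squaring and summing over the last axis works for any K. A minimal sketch, with the sizes (C = 4, N = 1000, K = 5) and the name dist_general chosen purely for illustration:

```python
import numpy as np


def dist_general(A, B):
    """Euclidean distances between rows of A (C x K) and rows of B (N x K)."""
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]   # shape (C, N, K)
    return np.sqrt((diff ** 2).sum(axis=2))            # shape (C, N)


rng = np.random.default_rng(0)
C, N = 4, 1000

# K = 2: hypot applies, one array per coordinate difference.
A2, B2 = rng.random((C, 2)), rng.random((N, 2))
diff2 = A2[:, np.newaxis, :] - B2[np.newaxis, :, :]
d_hypot = np.hypot(diff2[..., 0], diff2[..., 1])
d_general2 = dist_general(A2, B2)

# K = 5: hypot no longer applies, but the general form still does.
A5, B5 = rng.random((C, 5)), rng.random((N, 5))
d_general5 = dist_general(A5, B5)
print(d_hypot.shape, d_general5.shape)
```

For very large N the (C, N, K) temporary can become the limiting factor, in which case looping over the C rows keeps memory use lower.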
Please replace:
C = 4
N = 1000
d = zeros([C, N], dtype=float)
BK.