On Tue, Nov 10, 2009 at 7:48 PM, Anne Archibald <peridot.face...@gmail.com> wrote: > 2009/11/10 Christopher Barker <chris.bar...@noaa.gov>: >> Hi all, >> >> I have a bunch of points in 2-d space, and I need to find out which >> pairs of points are within a certain distance of one-another (regular >> old Euclidean norm). > > This is an eminently reasonable thing to want, and KDTree should > support it. Unfortunately it doesn't. > >> scipy.spatial.KDTree.query_ball_tree() seems like it's built for this. >> >> However, I'm a bit confused. The first argument is a kdtree, but I'm >> calling it as a method of a kdtree -- I want to know which points in the >> tree I already have are closer that some r from each-other. >> >> If I call it as: >> >> tree.query_ball_tree(tree, r) >> >> I get a big list, that has all the points in it (some of them paired up >> with close neighbors.) It appears I'm getting the distances between all >> the points in the tree and itself, as though they were different trees. >> >> This is slow, takes a bunch of memory, and I then have to parse out the >> list to find the ones that are paired up. >> >> Is there a way to get just the close ones from the single tree?
I used sparse_distance_matrix for the distance of a kdtree to itself in the past. Since it's a matrix it should be possible to just get the lower or upper triangle, and it's sparse so memory is not so much of a problem. But I remember it was also slow and only worth using if the matrix is large. Josef > > Unfortunately not at the moment. > > It's slow both because you're using the python implementation rather > than the C, and because you're getting all "pairs" where "pair" > includes pairing a point with itself (and also each distinct pair in > both orders). The tree really should allow self-queries that don't > return the point and itself. > > The one good thing about using the python implementation rather than > the Cython one is that you can subclass it, providing a new method. > There's still a certain amount of boilerplate code to write, but it > shouldn't be too bad. > > If this is still too slow, I have no problem incorporating additional > code into cKDTree; the only reason none of the ball queries are in > there is because ball queries must return variable-size results, so > you lose a certain amount of speed because you're forced to manipulate > python objects. But if there are relatively few result points, this > need not be much of a slowdown. > > Anne > >> thanks, >> >> -Chris >> >> >> >> >> >> >> -- >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> chris.bar...@noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion