Hi,
A kth nearest neighbour cluster detection method is implemented in a cluster
detection tool that uses GeoTools:
https://code.google.com/p/spatial-cluster-detection/
I like the idea of using interleaved x and y coordinates (a form of geohash) to
do the indexing. When I've thought of this before, I considered the need for a
rectangular bounding box to interleave in 2 ways (for 2Dimensional data
defined in reference to X and Y axes), so for coordinates x and y where x =
x0x1x2 and y = y0y1y2 we can have two indexes A and B:
1. ax0y0x1y1x2y2
2. by0x0y1x1y2x2
The indexes can provide a subset for looking which can be processed to look for
a number of nearest neighbours. This approach will also work in 3 dimensions,
but then I think there needs to be 6 different indexes as follows:
zxy zyx xzy yzx yxz xyz
The great thing about the geohashes is that they truncate well. So you can
always look for things in a less detailed range.
Sorry if that is not very clear and that I don't have time to look at geophile.
HTH
Andy
http://www.geog.leeds.ac.uk/people/a.turner/index.html
From: Jack Orenstein [mailto:j...@geophile.com]
Sent: 17 July 2014 15:27
To: Brett Antonides
Cc: geotools-gt2-users@lists.sourceforge.net
Subject: Re: [Geotools-gt2-users] K-Nearest Neighbors Question
I work on spatial indexing and can suggest an idea. (I don't know enough about
GeoTools to know whether it provides a solution.)
I work on z-order based spatial indexes. Suppose you have a space with integer
coordinates, a set of points, (x, y), and you want to find the k points nearest
to a given point (x0, y0). This can be done using a z-order index as follows:
1. For each point, (x, y), compute the z-value, z, by interleaving the bits of
x and y.
2. Load the (z, x, y) data into an ordered index, an in-memory sorted array if
possible, or a b-tree on disk for a large data set.
3. Compute the z-value of your query point, z0, by interleaving the bits of x0
and y0.
4. Do a random access into your index (from step 2), using z0, and find the k
nearest records by z. These will all be adjacent in the index.
5. For each retrieved (z, x, y) record, compute the distance from (x, y) to
(x0, y0). The maximum of these distances is an upper bound on the radius in
which the k nearest neighbors to (x0, y0) can be found.
6. Search the spatial index using a circle of this max radius, centered at (x0,
y0).
7. Sort by distance and keep the top k.
You can use the index from step 2 for each point in your second data set.
(I think this algorithm is by David Abel, but I don't have the reference handy.)
The generalization to finding the k nearest non-point neighbors is pretty easy.
I have a few open source packages implementing z-order spatial indexes:
- Java: https://github.com/geophile/geophile
- C++: https://github.com/geophile/geophile.c
- SQL: https://github.com/geophile/spacesuit
These don't have k nearest neighbors built in, but it would be a simple
addition.
Jack Orenstein
On Jul 17, 2014, at 8:27 AM, Brett Antonides
<brett.antoni...@lmnsolutions.com<mailto:brett.antoni...@lmnsolutions.com>>
wrote:
Hi All,
I've been looking through the GeoTools documentation and search, but haven't
been able to find much on a k-NearestNeighbor algorithm. I'm really hoping
someone can point me in the right direction.
What I need to do is find the k nearest features in a feature collection to a
point. For instance, find the 5 closest features in a shapefile to a single
point. Even better would be a way to find the 5 closest features in a
shapefile to each point in a different shapefile. Is there anything in
GeoTools that can help do this? It would need to work Point-Point, Point-Line,
and Point-Polygon. I'm hoping there is something already and that it leverages
a spatial index to make the query faster since I have to run this operation a
large number of times.
Thanks very much for you help! Hope all is well.
Cheers,
Brett
Sorry if this got posted twice
Brett Antonides
703-673-9526 ext 760
------------------------------------------------------------------------------
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds_______________________________________________
GeoTools-GT2-Users mailing list
GeoTools-GT2-Users@lists.sourceforge.net<mailto:GeoTools-GT2-Users@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/geotools-gt2-users
------------------------------------------------------------------------------
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds
_______________________________________________
GeoTools-GT2-Users mailing list
GeoTools-GT2-Users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-gt2-users