geohash and the tier systems (local lucene) each have their place.

Geohash is attractive since it is simple and could slip into Lucene easily. The tier system is more complex, but supports more accurate calculations and better behavior around the "edges" (even in New Zealand and London).

I hope the spatial contrib will explore many approaches. Obviously not every approach will be generally applicable, but it is good to have in the toolbox.

Also check:
http://wiki.apache.org/lucene-java/SpatialSearch

ryan


On Dec 29, 2008, at 2:35 PM, Robert Muir wrote:

Guys, figured I would pass this along:
http://www.geospatialsemanticweb.com/2008/05/29/geohash-for-spatial-index-and-search

One comment there makes me a little afraid to use geohash for spatial search: That doesn't work too well for London, which straddles 0 longitude: either side of 0 flips the MSB. These two places are pretty close to each other:

http://geohash.org/u10hb7951
http://geohash.org/gcpuzewfz
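
To make the concern concrete, here is a minimal sketch that decodes the two hashes above and compares them. It assumes the standard geohash base-32 alphabet and bit interleaving (longitude first); the helper names are made up for illustration and do not come from any Lucene code.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

def geohash_decode(gh):
    """Return the (lat, lon) centre of the cell named by a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate, starting with longitude
    for ch in gh:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def common_prefix_len(a, b):
    """Number of leading characters two strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

east = geohash_decode("u10hb7951")
west = geohash_decode("gcpuzewfz")
print("shared prefix length:", common_prefix_len("u10hb7951", "gcpuzewfz"))
print("distance in km:", round(haversine_km(east, west), 1))
```

The two cells sit on opposite sides of the meridian and are only a few kilometres apart, yet the hashes differ from the very first character, so a search that relies on a shared prefix would never match them.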



On Mon, Dec 29, 2008 at 12:34 PM, patrick o'leary <polear...@aol.com> wrote:
Hey Marc

LocalLucene has been rewritten since then to use a Cartesian grid for its bounding box lookups
http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html

GeoHash is a method of consistent hashing that produces an id whose length determines the precision of the point; for example, 123ab6789 might be (42.12345, -73.12345)
and 123ab would be (42.12, -73.12)

It's a great way to store individual points or areas in a compressed format, kind of like a tiny url to a particular point on the globe.
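
The link between id length and precision can be sketched as follows. This is a minimal encoder assuming the standard geohash alphabet and longitude-first bit interleaving; the function name is hypothetical. Note that the ids quoted above read as illustrative only, since the letter "a" is not part of the geohash alphabet.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

def geohash_encode(lat, lon, precision):
    """Encode a point as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    is_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if is_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        is_lon = not is_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)

full = geohash_encode(42.12345, -73.12345, 9)
short = geohash_encode(42.12345, -73.12345, 5)
print(full, short, full.startswith(short))
```

Truncating a hash never changes its leading characters; it only widens the cell the hash names, which is why the shorter id is a coarser version of the same location.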

Locallucene works differently by placing points within boxes at different zoom levels.
At minimum zoom level 0 (_localTier0) everything exists within 1 box,
zoom level 1 it's 4 boxes
zoom level 2 it's 16 boxes
.....
zoom level 15 it's 1,073,741,824 boxes
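
The box counts above follow 4^level, since each zoom level splits every box into four. A minimal sketch, assuming a plain equirectangular grid of 2^level by 2^level cells; the box_id numbering here is hypothetical and is not locallucene's actual id scheme or projection.

```python
def boxes_at_level(level):
    """Each level quarters every box, so the count is 4 ** level."""
    return 4 ** level

def box_id(lat, lon, level):
    """Hypothetical box id: row-major index into a 2**level x 2**level grid."""
    side = 2 ** level
    # Clamp so that lat = 90 or lon = 180 falls in the last row or column.
    col = min(int((lon + 180.0) / 360.0 * side), side - 1)
    row = min(int((lat + 90.0) / 180.0 * side), side - 1)
    return row * side + col

for level in (0, 1, 2, 15):
    print(level, boxes_at_level(level))
```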

Obviously the index will only contain box ids for the boxes that have points inside them (thus if you're indexing only the land mass of the planet, you're only going to use at most 30% of those boxes).

Based on the radius of your search, locallucene will select the appropriate zoom level to find your results in.
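
One way such a selection could work, as a rough sketch only: pick the finest level whose box is still at least as wide as the search diameter. The rule, the equatorial width constant, and the function names are all assumptions for illustration, not locallucene's actual logic.

```python
EQUATOR_KM = 40075.0  # approximate circumference of the Earth at the equator

def box_width_km(level):
    """Width of one box at the equator when the world is split into 2**level columns."""
    return EQUATOR_KM / (2 ** level)

def best_tier(radius_km, max_level=15):
    """Finest level whose boxes are at least as wide as the search diameter."""
    level = 0
    while level < max_level and box_width_km(level + 1) >= 2 * radius_km:
        level += 1
    return level

for radius in (20000, 100, 1, 0.1):
    print(radius, best_tier(radius))
```

With boxes no smaller than the search diameter, the search circle can overlap at most a 2 x 2 block of boxes at the chosen level, which keeps the number of box ids to look up small.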

So while locallucene could benefit from changing our notation for box ids to something similar to geohash to reduce index size, the concept for search is different. A couple of us are looking at including geohash in the locallucene code base. It would make our distance calculation less memory intensive, since we would load only one field cache per point rather than the two lat and long fields we use now, but I have to test the decoding speed to see if it slows us down.

GeoHash's main benefit comes in the form of lookup by id, say for an image or tile map at a point or for geocoding. It probably has more benefits than that, and I'm sure someone will correct me on that.

I should also warn you that I'm the guy who wrote locallucene, so I have a natural bias towards it, but to be honest this is how I see most geo searches working.

- P


squaro wrote:

Hello everybody

I would like to have your opinion on spatial search techniques using Lucene.

In your view, is it better to use LocalLucene
(http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene.htm)
or to encode lat and long with Geohash (http://geohash.org/)
and then use a RangeFilter between the two boundary hashes?

I think using geohash should be better because the comparison is
done on one field only.
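
The single-field range idea can be sketched outside Lucene with a sorted list of terms. This is a plain-Python stand-in, not the RangeFilter API; the sample hashes and the function name are only illustrative.

```python
from bisect import bisect_left

def prefix_range(sorted_hashes, prefix):
    """Return every hash in the sorted list that starts with `prefix`.

    The upper bound appends '~', which sorts after every character in the
    geohash alphabet, so the slice covers the lexicographic range of all
    terms sharing the prefix.
    """
    lo = bisect_left(sorted_hashes, prefix)
    hi = bisect_left(sorted_hashes, prefix + "~")
    return sorted_hashes[lo:hi]

# Sample terms, sorted as an index would keep them.
terms = sorted(["gcpuzewfz", "gcpvj0duq", "u10hb7951", "u10j2", "u4pruydqq"])

print(prefix_range(terms, "u10"))
print(prefix_range(terms, "gcp"))
```

A single range does cover one geohash cell with one comparison field, but a search area that crosses a cell boundary needs several ranges, one per neighbouring cell.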

What is your opinion?

Best regards

Marc


--
Patrick O'Leary

AOL Local Search Technologies
Phone: + 1 703 265 8763

You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles.
 Do you understand this?
And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat.
  - Albert Einstein



--
Robert Muir
rcm...@gmail.com
