I've not used geohash much, but thinking aloud... I think the final stage of geohash is base-32 encoded, so would it not partition naturally? If you leave it in the non-encoded form, then I believe highly dense data would show up as dense key regions.
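Rough sketch of what I mean, from memory (this follows the usual geohash bit interleaving and base-32 alphabet, but treat the code as illustrative rather than tested):

// Minimal geohash encoder: interleave longitude/latitude bits, then
// emit one base-32 character per 5 bits.  Nearby points end up sharing
// a common key prefix, which is where the density problem comes from.
public class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    public static String encode(double lat, double lon, int precision) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        StringBuilder hash = new StringBuilder();
        boolean evenBit = true;        // even bit positions come from longitude
        int bit = 0, ch = 0;
        while (hash.length() < precision) {
            double[] range = evenBit ? lonRange : latRange;
            double value = evenBit ? lon : lat;
            double mid = (range[0] + range[1]) / 2;
            if (value >= mid) {
                ch = (ch << 1) | 1;
                range[0] = mid;        // keep the upper half
            } else {
                ch = ch << 1;
                range[1] = mid;        // keep the lower half
            }
            evenBit = !evenBit;
            if (++bit == 5) {          // 5 bits per base-32 character
                hash.append(BASE32.charAt(ch));
                bit = 0;
                ch = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        // Two points a couple of km apart in the Midwest (made-up sample
        // coordinates) share their leading characters, so in sorted row-key
        // order the whole dense area lands in one key range.
        System.out.println(encode(41.5868, -93.6250, 9));
        System.out.println(encode(41.6005, -93.6091, 9));
    }
}

So the encoded keyspace itself is uniform, but the rows are not: a dense area just means a dense run of adjacent keys.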
What are your search requirements? If you are doing a lot of buffered point search (all records within 0.01 degree of x/y), then I would suspect (but don't know for sure) that you would benefit from data locality during server-side scanning if the records were stored together in one region, which argues against the spatial partitioning strategy (rough sketch of that trade-off below the quoted mail).

(Completely off topic: I do a lot with Google density map overlays at the moment, and have a MapReduce-portable version of http://204.236.250.31/map.html if you are interested. Can do density by pixel, 2x2 pixels, 4x4 pixels, etc.)

Cheers,
Tim

On Tue, Apr 13, 2010 at 12:18 AM, Wade Arnold <wade.arn...@t8webware.com> wrote:
> We have been working on using HBase for some geospatial queries for
> agronomic data. Via MapReduce we have created a secondary index to point at
> the raw records. Our issue is that geohash/UTM/Zip/(lat,long) keys are
> naturally dense wherever the data is dense. For our use case the Midwest is
> very dense and New York and San Francisco don't exist. I am sure for 4sqr
> and localized advertising engines this is the opposite. Due to the density
> of the key we keep on having region server density issues. I was wondering
> if anyone on the list has added any additional dimension on top of a geohash
> in order to create better partitioning?
>
> Wade Arnold
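On the partitioning question at the end: one option (purely a sketch, nothing I have actually run; the bucket count, key layout and class names below are made up) is to put an extra dimension in front of the geohash in the form of a small salt bucket, so the dense Midwest keys spread over several regions. The cost is exactly the locality point above: a prefix/bounding-box query turns into one scan per bucket instead of a single contiguous scan.

import java.util.ArrayList;
import java.util.List;

public class SaltedGeohashKey {
    static final int BUCKETS = 16;   // assumed bucket count, tune to the cluster

    // Row key layout "<bucket>|<geohash>" is an assumption, not anything
    // from this thread.  The bucket is derived from the geohash itself so
    // the same point always lands in the same bucket.
    static String rowKey(String geohash) {
        int bucket = (geohash.hashCode() & 0x7fffffff) % BUCKETS;
        return String.format("%02d|%s", bucket, geohash);
    }

    // A query for everything under a geohash prefix now needs one
    // start/stop row pair per bucket instead of a single range scan.
    static List<String[]> scanRanges(String geohashPrefix) {
        List<String[]> ranges = new ArrayList<String[]>();
        for (int b = 0; b < BUCKETS; b++) {
            String start = String.format("%02d|%s", b, geohashPrefix);
            String stop  = start + "~";   // '~' sorts after the base-32 alphabet
            ranges.add(new String[] { start, stop });
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(rowKey("9zmkz8qfc"));
        System.out.println(scanRanges("9zmk").size() + " scans to cover one geohash prefix");
    }
}

Whether that trade is worth it probably comes down to how often you run the buffered point searches versus how badly the dense regions are hotspotting.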