On Fri, Jul 16, 2010 at 2:34 PM, Nolan Darilek <[email protected]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hey, folks. I suppose that I might be wrong about this, and if so then
> I'd love to be, but I thought that I'd share my recent findings here
> either to inspire work in different directions if possible, or to close
> a possibly useless avenue of work if not.
>
> I have a real-time navigation app that stores its data in MongoDB, using
> its geospatial queries. During the course of my work, I routinely found
> that some queries were reliably slow while others were reliably fast.
> We're talking differences of seconds, some taking 5, some 30, while
> others were completed in under a second. Naturally, this is unsuitable
> for an app that needs to provide near real-time feedback. I opened an
> issue here, including my dump of an import of Texas' OSM nodes:
>
> http://jira.mongodb.org/browse/SERVER-1392
>
> It seems that I'm running up against limitations in MongoDB's
> geohash-based mechanism. It's probably perfectly suitable for most
> average geospatial-based searches, but not so much for the case of
> rendering OSM data in reliably short bursts of time. The issue has been
> marked wontfix.
>
> I'm open to the possibility that I'm missing something, but have long
> suspected that the geospatial support wasn't up to something of this
> magnitude. I suppose that it might work as a data storage and
> replication system, but if you need to get back data quickly then
> MongoDB likely isn't a good fit.
>
> Anyhow, I thought that I'd share, especially as some of us were
> discussing use of MongoDB here a few weeks back.
>

I was seeing quite the opposite results with smallish (city-sized) bounding boxes: I was getting very fast responses (much faster than a second or two). I was definitely running into limitations of the Python serializer/deserializer before I was running into limitations of Mongo. This was after inserting most of an entire planet dump.

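For reference, the kind of city-sized bounding-box query I mean can be sketched roughly as below. This is only an illustration, not the code I actually ran: the helper name `bbox_query` and the field name `loc` are made up for the example, and it assumes the collection has a 2d index on that field. `$within` with `$box` was the operator at the time; newer servers call it `$geoWithin`, so the operator is a parameter here.

```python
def bbox_query(min_lon, min_lat, max_lon, max_lat,
               field="loc", operator="$within"):
    """Build a MongoDB filter document for a bounding-box search.

    Hypothetical helper: `field` is an assumed name for the indexed
    [lon, lat] field. Pass operator="$geoWithin" on newer servers.
    """
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("bounding box corners are out of order")
    # $box takes the bottom-left corner first, then the top-right corner.
    return {field: {operator: {"$box": [[min_lon, min_lat],
                                        [max_lon, max_lat]]}}}


if __name__ == "__main__":
    # Roughly central Austin, TX (coordinates chosen for illustration only).
    print(bbox_query(-97.80, 30.20, -97.70, 30.30))
```

The resulting dict would be passed to a driver's find call, e.g. `collection.find(bbox_query(...))` with pymongo.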
However, it does seem like his explanation is valid: geohashing creates buckets and when those buckets are too big they fill up and make for slow queries. The nice thing about geohashing is that you can have arbitrarily-sized buckets, so I always assumed that they were picking the size based on how many points they saw. Maybe not.
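To make the bucket idea concrete, here is a rough Python sketch of geohash-style bucketing. It is not MongoDB's actual implementation, just the general technique: bisect longitude and latitude alternately and interleave the resulting bits, so that a shorter prefix means a bigger bucket. The function names and the sample points are my own, picked for illustration.

```python
from collections import Counter


def geohash_bits(lon, lat, precision):
    """Return `precision` interleaved bisection bits, longitude first."""
    lon_range = [-180.0, 180.0]
    lat_range = [-90.0, 90.0]
    bits = []
    for i in range(precision):
        # Even bit positions refine longitude, odd positions refine latitude.
        rng, value = (lon_range, lon) if i % 2 == 0 else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append("1")
            rng[0] = mid  # keep the upper half
        else:
            bits.append("0")
            rng[1] = mid  # keep the lower half
    return "".join(bits)


def bucket_counts(points, precision):
    """Count how many (lon, lat) points land in each bucket."""
    return Counter(geohash_bits(lon, lat, precision) for lon, lat in points)


if __name__ == "__main__":
    # Two points in Austin and one in Houston (approximate coordinates).
    points = [(-97.74, 30.27), (-97.75, 30.28), (-95.37, 29.76)]
    for precision in (4, 20):
        print(precision, dict(bucket_counts(points, precision)))
```

With only a few bits, all three points share one large bucket; with more bits, Houston splits off into its own. A scheme that picked the precision from the observed point density would keep buckets from overfilling, which is what I had assumed was happening.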
_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev

