Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Ian Dees
On Tue, Apr 12, 2011 at 3:56 PM, Steve Coast st...@asklater.com wrote:

  Interesting.

 How efficient is the (big)int indexing and/or masking?


I haven't had a chance to look at the integer indexing/masking. If I
remember it from discussions on dev a long while ago I think it's very close
to geohashes.



 Was this all on a single machine?


Yes.






 On 4/12/2011 1:52 PM, Ian Dees wrote:

 Yep.

 On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast st...@asklater.com wrote:

  and using the builtin spatial index?



 On 4/12/2011 1:50 PM, Ian Dees wrote:

 Yes, one document per node/way/relation.

 On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast st...@asklater.com wrote:

  how was the data put in the db though? 1 document per node?


 On 4/12/2011 1:39 PM, Nolan Darilek wrote:

 Oops, meant for this to go to the whole list.



  Original Message
 Subject: Re: [OSM-dev] OSM and MongoDB
 Date: Tue, 12 Apr 2011 15:26:41 -0500
 From: Nolan Darilek no...@thewordnerd.info
 To: Ian Dees ian.d...@gmail.com

 I had/am having a somewhat bad experience storing OSM data in MongoDB.

 Initially I stored all map data in MongoDB, but queries took ages. The
 same queries that now run in 100-200 ms often took nearly a second.
 Additionally, some took upwards of 5 seconds, and I even found spots on my
 map, sparsely populated with points, where the queries I need reliably
 took 30+ seconds.

 I filed a thorough bug in their tracker, including a dataset and queries
 that reliably duplicated the issue. It was marked wontfix, I abandoned
 MongoDB, and it was apparently re-opened and fixed several months later. So
 perhaps it's a non-issue now.

 I'm still using MongoDB for part of my current project, user POI storage.
 It does indeed use geohashes, and I'm experiencing strange accuracy issues.
 My platform is pedestrian navigation with many small distance queries.
 Points in the non-MongoDB dataset are reliably detected in a radius roughly
 100 meters around the traveler. Points in MongoDB queried with the same
 bounding boxes don't appear until they're within 30-40 meters. I recently
 updated from an older version to a new build of 1.8. The older version
 widely varied the detection range. Some points were detected 100 or so
 meters out, while others weren't picked up until 30 or so. It was always the
 same points, too. The point for my apartment remains reliably visible for
 ~100 meters or so, while the corner store and restaurant didn't appear until
 I was very close. 1.8 at least appears to be consistent, always detecting at
 30 meters or so. I can only assume that this is a geohash oddity that only
 appears for very small differences, something that works out to rounding
 error for larger values.

 I like MongoDB for many things, but not for geospatial data more
 complicated than a series of points. I'm working on migrating user/POI
 storage to a geospatial store.


 On 04/12/2011 01:20 PM, Ian Dees wrote:

 Yep, and I think Mongo uses geohashes as their index behind the scenes.
 One of the problems with that, though, is they have some arbitrary length
 that they compute the geohash to and when you have lots of points (as OSM
 data does) the buckets they're searching are very full.

 On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast st...@asklater.com wrote:

  bbox queries using the built-in spatial indexing, presumably? OSM has
 its own magical bitmask for that, which may also be as fast in mongo, who
 knows.


 On 4/11/2011 5:58 PM, Ian Dees wrote:

  On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo ser...@microsoft.com wrote:

  Hi,



 I am working on an evaluation of MongoDB for several storage solutions at
 hand. Some of them resemble the current OSM editing database. I have heard
 that OSM dev is/was evaluating MongoDB also. I was wondering whether it is
 possible to share the findings?




  In my experimentation with MongoDB (seen here:
 https://github.com/iandees/mongosm/) I found it to be very slow.
 Inserts were speedy, but bounding-box queries took a long time.

  The most recent dev version of MongoDB includes multi-location
 documents support:

 http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments

  This would allow a single way document to be indexed at multiple
 locations and vastly speed up the map query.
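
As a rough illustration of what "a single way document indexed at multiple locations" and a bounding-box query could look like, here is a minimal sketch. The field names (`loc`, `tags`) and both helper functions are made up for this example and are not taken from mongosm. The `$within`/`$box` form is the operator spelling the 2d index used around this time (newer releases call it `$geoWithin`). The code only builds plain dictionaries and does not talk to a server.

```python
def way_document(way_id, node_coords, tags):
    """Build a hypothetical way document holding one [lon, lat] pair per node."""
    return {
        "_id": way_id,
        "tags": dict(tags),
        # An array of points: a multi-location 2d index would index each entry,
        # so the way can be found from any of its nodes.
        "loc": [[lon, lat] for lon, lat in node_coords],
    }


def bbox_query(min_lon, min_lat, max_lon, max_lat, field="loc"):
    """Build a legacy 2d-index box query: lower-left corner, then upper-right."""
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("bounding box corners are out of order")
    return {field: {"$within": {"$box": [[min_lon, min_lat], [max_lon, max_lat]]}}}


# Example values are arbitrary.
way = way_document(42, [(-93.27, 44.98), (-93.26, 44.99)], {"highway": "residential"})
query = bbox_query(-93.30, 44.95, -93.20, 45.00)
```

With one document per way, a map query would need a single lookup per way rather than a node lookup followed by a second pass to find the ways that use those nodes.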


 ___
 dev mailing list
 dev@openstreetmap.org
 http://lists.openstreetmap.org/listinfo/dev



Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Greg Studer
MongoDB does use a geohash as the indexing method for geo-searches, but
I'm pretty sure that's not the cause of the huge query times.  The
geohashing tends to be very fast, but the way points were buffered for
return in pre-1.9 releases could cause these slowdowns for particular
point distributions - I'm guessing the neighboring boxes had many more
points.

Exact point checks and distances are also being introduced in 1.9, so
when/if the hash isn't precise enough to complete your search, you
shouldn't get these types of inaccurate results (the hash is currently
tunable to 32 bits of precision).  Of course, these are all new
developments (along with polygon searches and multi-location documents);
geo-indexing has gotten a lot of attention as of late.

disclaimer: as per my email address, I work at 10gen on MongoDB
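
To make the "bits of precision" point concrete, here is a minimal sketch of a geohash-style integer key: each coordinate is quantized to a fixed number of bits and the two bit strings are interleaved. This is a generic illustration, not MongoDB's actual implementation; the function names and the bit ordering (longitude first) are assumptions chosen for the example.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = 1 << bits
    index = int((value - lo) / (hi - lo) * cells)
    return min(index, cells - 1)  # value == hi falls into the last cell


def interleave(x, y, bits):
    """Interleave the low `bits` bits of x and y, x first in each bit pair."""
    out = 0
    for i in range(bits - 1, -1, -1):
        out = (out << 1) | ((x >> i) & 1)
        out = (out << 1) | ((y >> i) & 1)
    return out


def geohash_int(lon, lat, bits):
    """Integer geohash using `bits` bits per coordinate (2 * bits in total)."""
    return interleave(
        quantize(lon, -180.0, 180.0, bits),
        quantize(lat, -90.0, 90.0, bits),
        bits,
    )


def cell_width_degrees(bits):
    """Width of one cell in degrees of longitude at the given precision."""
    return 360.0 / (1 << bits)
```

A coarser hash is a prefix of a finer one, so points that share a prefix fall into the same box. With few bits, a dense dataset such as OSM puts many points into each box, and every candidate in a box then has to be checked or returned.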


Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Andreas Scheucher
hi,

some weeks ago, I got interested in NoSQL database products. I had no
experience with them up to now, but as it was a requirement for a job, I
started to read about Apache Cassandra and thought this would be
interesting for OpenStreetMap.

up to now my findings are only theoretical, but I would like to dig
deeper when I find time.

But one thing I wonder about is that you tested it on one machine. Isn't it
the case that you need several nodes and loads of data to really benefit
from NoSQL databases? At least this was my understanding of the whole thing...

greets,
Andreas


Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Ian Dees
On Wed, Apr 13, 2011 at 2:35 PM, Andreas Scheucher 
andreas.scheuc...@gmail.com wrote:

 hi,

 some weeks ago, I got interested in NoSQL database products. I had no
 experience with them up to now, but as it was a requirement for a job, I
 started to read about Apache Cassandra and thought this would be
 interesting for OpenStreetMap.


Yep, Cassandra would be an interesting option to try. In fact many moons ago
I spoke with the folks at SimpleGeo about attempting to host some OSM data
there in their infrastructure. At the time they didn't support anything but
point features (and had no other way of dealing with metadata) so I haven't
pursued it.

Additionally, this talk they gave was quite informative about how they
store their location data in Cassandra:
http://www.youtube.com/watch?v=7J61pPG9j90


 up to now my findings are only theoretical, but I would like to dig
 deeper when I find time.

 But one thing I wonder about is that you tested it on one machine. Isn't it
 the case that you need several nodes and loads of data to really benefit
 from NoSQL databases? At least this was my understanding of the whole thing...


The purpose of multiple machines in this case is to have relatively reliable
storage and multiple copies of the data on different machines, not
necessarily an increase in read speed (Greg, maybe you could correct me?).
Last time I looked at MongoDB seriously for OSM I imported an entire planet,
so it was loads of data :). I have not tried a whole planet with the more
recent versions, though.



 greets,
 Andreas


Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Greg Studer
Agree, think the issue in this case definitely wasn't related to
multiple machines.  In general, though, you often can do much better
performance-wise on large data sets by running queries on data subsets
across multiple systems, whatever software you use.  Most NoSQL dbs try
to make this particularly easy.
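
A minimal sketch of the "run the query on data subsets and combine the results" idea, assuming the points are already split into partitions. The in-memory lists and worker threads here are stand-ins for separate machines, and all function names are invented for the example; a real deployment would send the query to each shard over the network.

```python
from concurrent.futures import ThreadPoolExecutor


def in_bbox(point, bbox):
    """Return True if a (lon, lat) point lies inside the bounding box."""
    lon, lat = point
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def query_shard(shard, bbox):
    """Run the bounding-box filter against one partition of the data."""
    return [p for p in shard if in_bbox(p, bbox)]


def scatter_gather(shards, bbox):
    """Send the same query to every partition and merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partial_results = pool.map(lambda shard: query_shard(shard, bbox), shards)
        return [p for part in partial_results for p in part]


# Two tiny partitions with arbitrary example points.
shards = [[(0, 0), (5, 5)], [(1, 1), (9, 9)]]
matches = scatter_gather(shards, (0, 0, 2, 2))
```

Each partition only scans its own subset, so the per-machine work shrinks as partitions are added. The merge step still has to handle every match, so a query that returns a lot of data benefits less.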


[OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Nolan Darilek

Oops, meant for this to go to the whole list.



 Original Message
Subject: Re: [OSM-dev] OSM and MongoDB
Date: Tue, 12 Apr 2011 15:26:41 -0500
From: Nolan Darilek no...@thewordnerd.info
To: Ian Dees ian.d...@gmail.com



I had/am having a somewhat bad experience storing OSM data in MongoDB.

Initially I stored all map data in MongoDB, but queries took ages. The
same queries that now run in 100-200 ms often took nearly a second.
Additionally, some took upwards of 5 seconds, and I even found spots on my
map, sparsely populated with points, where the queries I need reliably
took 30+ seconds.


I filed a thorough bug in their tracker, including a dataset and queries 
that reliably duplicated the issue. It was marked wontfix, I abandoned 
MongoDB, and it was apparently re-opened and fixed several months later. 
So perhaps it's a non-issue now.


I'm still using MongoDB for part of my current project, user POI 
storage. It does indeed use geohashes, and I'm experiencing strange 
accuracy issues. My platform is pedestrian navigation with many small 
distance queries. Points in the non-MongoDB dataset are reliably 
detected in a radius roughly 100 meters around the traveler. Points in 
MongoDB queried with the same bounding boxes don't appear until they're 
within 30-40 meters. I recently updated from an older version to a new 
build of 1.8. The older version widely varied the detection range. Some 
points were detected 100 or so meters out, while others weren't picked 
up until 30 or so. It was always the same points, too. The point for my 
apartment remains reliably visible for ~100 meters or so, while the 
corner store and restaurant didn't appear until I was very close. 1.8 at 
least appears to be consistent, always detecting at 30 meters or so. I 
can only assume that this is a geohash oddity that only appears for very 
small differences, something that works out to rounding error for larger 
values.


I like MongoDB for many things, but not for geospatial data more 
complicated than a series of points. I'm working on migrating user/POI 
storage to a geospatial store.
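
For the small-distance case described above, one way to avoid depending on index precision is to use the bounding box only as a coarse filter and then check the exact distance for each candidate. The sketch below assumes a spherical Earth with a mean radius of 6,371 km and is not tied to any particular database. The function names are invented for the example, and the box helper is only reasonable away from the poles and the antimeridian.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bbox_around(lon, lat, radius_m):
    """Bounding box (min_lon, min_lat, max_lon, max_lat) enclosing the circle."""
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / math.cos(math.radians(lat))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


def points_within(points, lon, lat, radius_m):
    """Coarse bounding-box filter followed by an exact distance check."""
    min_lon, min_lat, max_lon, max_lat = bbox_around(lon, lat, radius_m)
    candidates = [
        (x, y) for x, y in points
        if min_lon <= x <= max_lon and min_lat <= y <= max_lat
    ]
    return [(x, y) for x, y in candidates if haversine_m(lon, lat, x, y) <= radius_m]


# Arbitrary example: three points near (0, 0), searched with a 100 m radius.
nearby = points_within([(0.0, 0.0005), (0.0, 0.002), (0.0008, 0.0008)], 0.0, 0.0, 100.0)
```

The third example point lies inside the box but outside the circle, which is the kind of candidate the exact check is there to reject.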



On 04/12/2011 01:20 PM, Ian Dees wrote:
Yep, and I think Mongo uses geohashes as their index behind the 
scenes. One of the problems with that, though, is they have some 
arbitrary length that they compute the geohash to and when you have 
lots of points (as OSM data does) the buckets they're searching are 
very full.


On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast st...@asklater.com wrote:


bbox queries using the built-in spatial indexing, presumably? OSM
has its own magical bitmask for that, which may also be as fast in
mongo, who knows.


On 4/11/2011 5:58 PM, Ian Dees wrote:

On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo ser...@microsoft.com wrote:

Hi,

I am working on an evaluation of MongoDB for several storage
solutions at hand. Some of them resemble the current OSM editing
database. I have heard that OSM dev is/was evaluating MongoDB
also. I was wondering whether it is possible to share the findings?


In my experimentation with MongoDB (seen here:
https://github.com/iandees/mongosm/) I found it to be very slow.
Inserts were speedy, but bounding-box queries took a long time.

The most recent dev version of MongoDB includes multi-location
documents support:

http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments

This would allow a single way document to be indexed at multiple
locations and vastly speed up the map query.



Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Steve Coast

how was the data put in the db though? 1 document per node?



Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Ian Dees
Yes, one document per node/way/relation.

On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast st...@asklater.com wrote:

  how was the data put in the db though? 1 document per node?


 On 4/12/2011 1:39 PM, Nolan Darilek wrote:

 Oopse, meant for this to go to the whole list.



  Original Message   Subject: Re: [OSM-dev] OSM and MongoDB  
 Date:
 Tue, 12 Apr 2011 15:26:41 -0500  From: Nolan Darilek
 no...@thewordnerd.info no...@thewordnerd.info  To: Ian Dees
 ian.d...@gmail.com ian.d...@gmail.com

 I had/am having a somewhat bad experience storing OSM data in MongoDB.

 Initially I stored all map data in MongoDB, but queries took ages. The same
 queries that happen in 100-200 MS now often took nearly a second.
 Additionally, some took upwards of 5, and I even found spots on my map
 sparsely populated with points, but which reliably performed the queries I
 need in 30+ seconds.

 I filed a thorough bug in their tracker, including a dataset and queries
 that reliably duplicated the issue. It was marked wontfix, I abandoned
 MongoDB, and it was apparently re-opened and fixed several months later. So
 perhaps it's a non-issue now.

 I'm still using MongoDB for part of my current project, user POI storage.
 It does indeed use geohashes, and I'm experiencing strange accuracy issues.
 My platform is pedestrian navigation with many small distance queries.
 Points in the non-MongoDB dataset are reliably detected in a radius roughly
 100 meters around the traveler. Points in MongoDB queried with the same
 bounding boxes don't appear until they're within 30-40 meters. I recently
 updated from an older version to a new build of 1.8. The older version
 widely varied the detection range. Some points were detected 100 or so
 meters out, while others weren't picked up until 30 or so. It was always the
 same points, too. The point for my apartment remains reliably visible for
 ~100 meters or so, while the corner store and restaurant didn't appear until
 I was very close. 1.8 at least appears to be consistent, always detecting at
 30 meters or so. I can only assume that this is a geohash oddity that only
 appears for very small differences, something that works out to rounding
 error for larger values.

 I like MongoDB for many things, but not for geospatial data more
 complicated than a series of points. I'm working on migrating user/POI
 storage to a geospatial store.


 On 04/12/2011 01:20 PM, Ian Dees wrote:

 Yep, and I think Mongo uses geohashes as its index behind the scenes. One
 of the problems with that, though, is that it computes the geohash to some
 arbitrary fixed length, so when you have lots of points (as OSM data does)
 the buckets it searches are very full.
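
The bucket problem described above can be illustrated with a small sketch. This is not MongoDB's index code; it is a generic bit-interleaved geohash, and the function name, the bit counts, and the sample points are assumptions chosen only to show how a short hash puts many nearby points into the same cell.

```python
from collections import Counter


def geohash_cell(lat, lon, bits_per_axis):
    """Return an integer cell id built by interleaving longitude and latitude bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    cell = 0
    for _ in range(bits_per_axis):
        # One longitude bit: which half of the current range holds the point?
        lon_mid = (lon_lo + lon_hi) / 2
        if lon >= lon_mid:
            cell = (cell << 1) | 1
            lon_lo = lon_mid
        else:
            cell = cell << 1
            lon_hi = lon_mid
        # One latitude bit, same idea.
        lat_mid = (lat_lo + lat_hi) / 2
        if lat >= lat_mid:
            cell = (cell << 1) | 1
            lat_lo = lat_mid
        else:
            cell = cell << 1
            lat_hi = lat_mid
    return cell


# 100 hypothetical points on a grid roughly 100 m across.
points = [(51.5 + i * 0.0001, -0.12 + j * 0.0001)
          for i in range(10) for j in range(10)]

coarse = Counter(geohash_cell(lat, lon, 10) for lat, lon in points)
fine = Counter(geohash_cell(lat, lon, 26) for lat, lon in points)
print(len(coarse), "bucket(s) at 10 bits per axis")
print(len(fine), "bucket(s) at 26 bits per axis")
```

With a short hash every point lands in one bucket, so a query has to scan all of them; with a longer hash the same points spread across many cells.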

 On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast st...@asklater.com wrote:

  bbox queries using the built-in spatial indexing presumably? OSM has its
 own magical bitmask for that, which may also be as fast in Mongo, who knows.
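
For comparison, here is a sketch of the kind of bit-interleaved tile number that the "bitmask" remark appears to refer to. It is written from memory of the quadtile scheme used in the OSM codebase, so the details (16 bits per axis, scaling to 0..65535, x bit before y bit) are assumptions to verify against the actual source, and the function names are hypothetical.

```python
def tile_for_xy(x, y):
    """Interleave two 16-bit integers into one 32-bit tile number (x bit first)."""
    tile = 0
    for i in range(15, -1, -1):
        tile = (tile << 1) | ((x >> i) & 1)
        tile = (tile << 1) | ((y >> i) & 1)
    return tile


def tile_for_point(lat, lon):
    """Scale lat/lon to a 16-bit grid (assumed scaling) and interleave the bits."""
    x = round((lon + 180.0) * 65535.0 / 360.0)
    y = round((lat + 90.0) * 65535.0 / 180.0)
    return tile_for_xy(x, y)


print(hex(tile_for_point(51.5, -0.12)))
```

Because nearby points tend to share their high-order bits, a bounding box can be covered by a handful of tile-number ranges, which an ordinary integer index can serve.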


 On 4/11/2011 5:58 PM, Ian Dees wrote:

  On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo ser...@microsoft.com wrote:

  Hi,



 I am evaluating MongoDB for several storage solutions at hand. Some of them
 resemble the current OSM editing database. I have heard that OSM dev is/was
 evaluating MongoDB as well. Would it be possible to share the findings?




  In my experimentation with MongoDB (seen here:
 https://github.com/iandees/mongosm/) I found it to be very slow. Inserts
 were speedy, but bounding-box queries took a long time.

  The most recent dev version of MongoDB includes support for multi-location
 documents:

 http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments

  This would allow a single way document to be indexed at multiple
 locations and vastly speed up the map query.
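
To make the multi-location idea above concrete, here is a minimal sketch in plain Python. It does not use MongoDB or its query API; the `locs` field name, the document shape, and the coordinates are hypothetical, and the matching function only mimics what "indexed at multiple locations" would mean for a bounding-box lookup.

```python
def in_bbox(loc, bbox):
    """loc is [lon, lat]; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    lon, lat = loc
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def way_matches(way_doc, bbox):
    """A way matches if any one of its stored locations falls inside the box."""
    return any(in_bbox(loc, bbox) for loc in way_doc["locs"])


# Hypothetical way document carrying the coordinates of its own nodes.
way = {
    "_id": 1001,
    "tags": {"highway": "residential"},
    "locs": [[-93.2650, 44.9778], [-93.2640, 44.9780], [-93.2630, 44.9783]],
}

print(way_matches(way, (-93.2645, 44.9775, -93.2635, 44.9785)))
```

The point of the design is that the way is found directly from the box, without first finding nodes and then looking up the ways that reference them.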


 ___
 dev mailing list
 dev@openstreetmap.org
 http://lists.openstreetmap.org/listinfo/dev






Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Steve Coast

and using the built-in spatial index?


On 4/12/2011 1:50 PM, Ian Dees wrote:

Yes, one document per node/way/relation.

On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast st...@asklater.com wrote:


how was the data put in the db though? 1 document per node?


On 4/12/2011 1:39 PM, Nolan Darilek wrote:

Oops, meant for this to go to the whole list.



-------- Original Message --------
Subject: Re: [OSM-dev] OSM and MongoDB
Date: Tue, 12 Apr 2011 15:26:41 -0500
From: Nolan Darilek no...@thewordnerd.info
To: Ian Dees ian.d...@gmail.com



I had/am having a somewhat bad experience storing OSM data in
MongoDB.

Initially I stored all map data in MongoDB, but queries took
ages. The same queries that now run in 100-200 ms often took
nearly a second. Some took upwards of 5 seconds, and I even
found spots on my map that were sparsely populated with points
but where the queries I need reliably took 30+ seconds.

I filed a thorough bug in their tracker, including a dataset and
queries that reliably duplicated the issue. It was marked
wontfix, I abandoned MongoDB, and it was apparently re-opened and
fixed several months later. So perhaps it's a non-issue now.

I'm still using MongoDB for part of my current project, user POI
storage. It does indeed use geohashes, and I'm experiencing
strange accuracy issues. My platform is pedestrian navigation
with many small-distance queries. Points in the non-MongoDB
dataset are reliably detected within a radius of roughly 100
meters around the traveler. Points in MongoDB queried with the
same bounding boxes don't appear until they're within 30-40
meters. I recently updated from an older version to a new build
of 1.8. With the older version the detection range varied
widely: some points were detected 100 or so meters out, while
others weren't picked up until 30 or so. It was always the same
points, too. The point for my apartment remained reliably
visible for ~100 meters or so, while the corner store and
restaurant didn't appear until I was very close. 1.8 at least
appears to be consistent, always detecting at 30 meters or so. I
can only assume that this is a geohash oddity that only appears
for very small differences, something that works out to rounding
error for larger values.

I like MongoDB for many things, but not for geospatial data more
complicated than a series of points. I'm working on migrating
user/POI storage to a geospatial store.


On 04/12/2011 01:20 PM, Ian Dees wrote:

Yep, and I think Mongo uses geohashes as its index behind the
scenes. One of the problems with that, though, is that it
computes the geohash to some arbitrary fixed length, so when you
have lots of points (as OSM data does) the buckets it searches
are very full.

On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast st...@asklater.com wrote:

bbox queries using the built-in spatial indexing presumably?
OSM has its own magical bitmask for that, which may also be
as fast in Mongo, who knows.


On 4/11/2011 5:58 PM, Ian Dees wrote:

On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo
ser...@microsoft.com wrote:

Hi,

I am evaluating MongoDB for several storage solutions at
hand. Some of them resemble the current OSM editing
database. I have heard that OSM dev is/was evaluating
MongoDB as well. Would it be possible to share the
findings?


In my experimentation with MongoDB (seen here:
https://github.com/iandees/mongosm/) I found it to be very
slow. Inserts were speedy, but bounding-box queries took a
long time.

The most recent dev version of MongoDB includes support for
multi-location documents:

http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments

This would allow a single way document to be indexed at
multiple locations and vastly speed up the map query.



Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Nolan Darilek

On 04/12/2011 03:47 PM, Steve Coast wrote:

how was the data put in the db though? 1 document per node?




Yes, with deeper structures for ways and relations.
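
As a rough illustration of "one document per node, with deeper structures for ways and relations", the documents might look something like the following. This is a sketch only: the field names, ids, and coordinates are hypothetical and are not taken from the actual import.

```python
# Hypothetical node documents: one per node, each with a single location.
nodes_by_id = {
    1: {"_id": 1, "loc": [-93.2650, 44.9778], "tags": {}},
    2: {"_id": 2, "loc": [-93.2640, 44.9780], "tags": {}},
    3: {"_id": 3, "loc": [-93.2630, 44.9783], "tags": {"highway": "crossing"}},
}

# A way nests an ordered list of node references.
way = {"_id": 1001, "nodes": [1, 2, 3], "tags": {"highway": "residential"}}

# A relation nests a list of typed members, each with a role.
relation = {
    "_id": 5001,
    "members": [{"type": "way", "ref": 1001, "role": "outer"}],
    "tags": {"type": "multipolygon"},
}


def way_coords(way_doc, nodes):
    """Resolve a way's node references to coordinates, skipping missing nodes."""
    return [nodes[ref]["loc"] for ref in way_doc["nodes"] if ref in nodes]


print(way_coords(way, nodes_by_id))
```

With this layout only nodes carry coordinates, so finding the ways inside a bounding box takes a node query followed by a lookup of the ways that reference those nodes.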