Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Greg Studer
Agreed, I think the issue in this case definitely wasn't related to
multiple machines.  In general, though, you can often get much better
performance on large data sets by running queries on data subsets
across multiple systems, whatever software you use.  Most NoSQL
databases try to make this particularly easy.
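
To illustrate the idea of running a query on data subsets, here is a
minimal Python sketch of a scatter-gather bounding-box query.  Everything
in it is hypothetical: the shards are plain in-memory lists standing in
for separate machines, and query_shard / scatter_gather are names made up
for this example, not any database's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Each "shard" stands in for one machine holding a subset of the points.
# Points are (lon, lat) tuples; the shard layout is an assumption of this sketch.
SHARDS = [
    [(-93.27, 44.98), (-93.10, 44.95)],
    [(-87.63, 41.88), (-93.25, 44.97)],
    [(13.40, 52.52)],
]


def query_shard(points, bbox):
    """Return the points of one shard that fall inside bbox."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        (lon, lat)
        for lon, lat in points
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


def scatter_gather(shards, bbox):
    """Run the same bbox query on every shard in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(query_shard, shards, [bbox] * len(shards)))
    # Merge the per-shard answers into one sorted result list.
    return sorted(p for part in partials for p in part)


if __name__ == "__main__":
    print(scatter_gather(SHARDS, (-93.5, 44.8, -93.0, 45.1)))
```

Each shard only scans its own subset, so the per-machine work shrinks as
shards are added; the merge step is the only part that sees all results.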

On Wed, 2011-04-13 at 14:44 -0500, Ian Dees wrote: 
> On Wed, Apr 13, 2011 at 2:35 PM, Andreas Scheucher
>  wrote:
> hi, 
> 
> 
> some weeks ago, i got interested in NoSQL database products.
> I had no experience with them up to now, but as it was a
> requirement for a job, I started to read about Apache
> Cassandra and thought this would be interesting for
> OpenStreetMap. 
> 
> 
> 
> 
> Yep, Cassandra would be an interesting option to try. In fact many
> moons ago I spoke with the folks at SimpleGeo about attempting to host
> some OSM data there in their infrastructure. At the time they didn't
> support anything but point features (and had no other way of dealing
> with metadata) so I haven't pursued it.
> 
> 
> Additionally, this talk they gave was quite informative about how they
> store their location data in Cassandra:
> http://www.youtube.com/watch?v=7J61pPG9j90
>  
> 
> up to now my findings are only theoretical, but I would like
> to dig deeper when I find time. 
> 
> 
> But one thing I wonder about is that you tested it on one machine.
> Isn't it the case that you need several nodes and loads of data
> to really benefit from NoSQL databases? At least this was my
> understanding of the whole thing... 
> 
> 
> The purpose of multiple machines in this case is to have relatively
> reliable storage and multiple copies of the data on different
> machines, not necessarily an increase in read speed (Greg, maybe you
> could correct me?). Last time I looked at MongoDB seriously for OSM I
> imported an entire planet, so it was "loads of data" :). I have not
> tried a whole planet with the more recent versions, though.
>  
> 
> 
> greets, 
> Andreas 
> 
> 
> 2011/4/13 Ian Dees  
> 
> 
> On Tue, Apr 12, 2011 at 3:56 PM, Steve Coast
>  wrote: 
> Interesting.
> 
> How efficient is the (big)int indexing and/or
> masking?
> 
> 
> 
> I haven't had a chance to look at the integer
> indexing/masking. If I remember it from discussions on
> dev a long while ago I think it's very close to
> geohashes. 
>   
> 
> Was this all on a single machine? 
> 
> 
> Yes. 
> 
>   
> 
> 
> 
> 
> 
> On 4/12/2011 1:52 PM, Ian Dees wrote: 
> > Yep.
> > 
> > On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast
> >  wrote: 
> > and using the builtin spatial
> > index? 
> > 
> > 
> > 
> > On 4/12/2011 1:50 PM, Ian Dees
> > wrote: 
> > > Yes, one document per
> > > node/way/relation.
> > > 
> > > On Tue, Apr 12, 2011 at 3:47 PM,
> > > Steve Coast 
> > > wrote: 
> > > how was the data put in
> > > the db though? 1 document
> > > per node? 
> > > 
> > > 
> > > On 4/12/2011 1:39 PM,
> > > Nolan Darilek wrote: 
> > > > Oopse, meant for this to
> > > > go to the whole list.
> > > > 
> > > > 
> > > > 
> > > >  Original
> > > > Message  
> > > >  

Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Ian Dees
On Wed, Apr 13, 2011 at 2:35 PM, Andreas Scheucher <
[email protected]> wrote:

> hi,
>
> some weeks ago, i got interested in NoSQL database products. I had no
> experience with them up to now, but as it was a requirement for a job, I
> started to read about Apache Cassandra and thought this would be
> interesting for OpenStreetMap.
>
>
Yep, Cassandra would be an interesting option to try. In fact many moons ago
I spoke with the folks at SimpleGeo about attempting to host some OSM data
there in their infrastructure. At the time they didn't support anything but
point features (and had no other way of dealing with metadata) so I haven't
pursued it.

Additionally, this talk they gave was quite informative about how they
store their location data in Cassandra:
http://www.youtube.com/watch?v=7J61pPG9j90


> up to now my findings are only theoretical, but I would like to dig
> deeper when I find time.
>
> But one thing I wonder about is that you tested it on one machine. Isn't it
> the case that you need several nodes and loads of data to really benefit from
> NoSQL databases? At least this was my understanding of the whole thing...
>

The purpose of multiple machines in this case is to have relatively reliable
storage and multiple copies of the data on different machines, not
necessarily an increase in read speed (Greg, maybe you could correct me?).
Last time I looked at MongoDB seriously for OSM I imported an entire planet,
so it was "loads of data" :). I have not tried a whole planet with the more
recent versions, though.


>
> greets,
> Andreas
>
> 2011/4/13 Ian Dees 
>
>>
>> On Tue, Apr 12, 2011 at 3:56 PM, Steve Coast  wrote:
>>
>>>  Interesting.
>>>
>>> How efficient is the (big)int indexing and/or masking?
>>>
>>
>> I haven't had a chance to look at the integer indexing/masking. If I
>> remember it from discussions on dev a long while ago I think it's very close
>> to geohashes.
>>
>>
>>>
>>> Was this all on a single machine?
>>>
>>
>> Yes.
>>
>>
>>>
>>>
>>>
>>>
>>> On 4/12/2011 1:52 PM, Ian Dees wrote:
>>>
>>> Yep.
>>>
>>> On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast  wrote:
>>>
  and using the builtin spatial index?



 On 4/12/2011 1:50 PM, Ian Dees wrote:

 Yes, one document per node/way/relation.

 On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast wrote:

>  how was the data put in the db though? 1 document per node?
>
>
> On 4/12/2011 1:39 PM, Nolan Darilek wrote:
>
> Oopse, meant for this to go to the whole list.
>
>
>
> Original Message
> Subject: Re: [OSM-dev] OSM and MongoDB
> Date: Tue, 12 Apr 2011 15:26:41 -0500
> From: Nolan Darilek
> To: Ian Dees
>
> I had/am having a somewhat bad experience storing OSM data in MongoDB.
>
> Initially I stored all map data in MongoDB, but queries took ages. The
> same queries that happen in 100-200 MS now often took nearly a second.
> Additionally, some took upwards of 5, and I even found spots on my map
> sparsely populated with points, but which reliably performed the queries I
> need in 30+ seconds.
>
> I filed a thorough bug in their tracker, including a dataset and
> queries that reliably duplicated the issue. It was marked wontfix, I
> abandoned MongoDB, and it was apparently re-opened and fixed several
> months later. So perhaps it's a non-issue now.
>
> I'm still using MongoDB for part of my current project, user POI
> storage. It does indeed use geohashes, and I'm experiencing strange
> accuracy issues. My platform is pedestrian navigation with many small
> distance queries. Points in the non-MongoDB dataset are reliably detected
> in a radius roughly 100 meters around the traveler. Points in MongoDB
> queried with the same bounding boxes don't appear until they're within
> 30-40 meters. I recently updated from an older version to a new build of
> 1.8. The older version widely varied the detection range. Some points were
> detected 100 or so meters out, while others weren't picked up until 30 or
> so. It was always the same points, too. The point for my apartment remains
> reliably visible for ~100 meters or so, while the corner store and
> restaurant didn't appear until I was very close. 1.8 at least appears to be
> consistent, always detecting at 30 meters or so. I can only assume that
> this is a geohash oddity that only appears for very small differences,
> something that works out to rounding error for larger values.
>
> I like MongoDB for many things, but not for geospatial data more
> complicated than a series of points. I'm working on migrating user/POI
> storage to a geospatial store.
>
>
> On 04/12/2011 01:20 PM, Ian Dees wrote:
>
> Yep, and I think Mongo uses geohashes as their index behind the scene

Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Andreas Scheucher
hi,

some weeks ago, i got interested in NoSQL database products. I had no
experience with them up to now, but as it was a requirement for a job, I
started to read about Apache Cassandra and thought this would be
interesting for OpenStreetMap.

up to now my findings are only theoretical, but I would like to dig
deeper when I find time.

But one thing I wonder about is that you tested it on one machine. Isn't it
the case that you need several nodes and loads of data to really benefit from
NoSQL databases? At least this was my understanding of the whole thing...

greets,
Andreas

2011/4/13 Ian Dees 

>
> On Tue, Apr 12, 2011 at 3:56 PM, Steve Coast  wrote:
>
>>  Interesting.
>>
>> How efficient is the (big)int indexing and/or masking?
>>
>
> I haven't had a chance to look at the integer indexing/masking. If I
> remember it from discussions on dev a long while ago I think it's very close
> to geohashes.
>
>
>>
>> Was this all on a single machine?
>>
>
> Yes.
>
>
>>
>>
>>
>>
>> On 4/12/2011 1:52 PM, Ian Dees wrote:
>>
>> Yep.
>>
>> On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast  wrote:
>>
>>>  and using the builtin spatial index?
>>>
>>>
>>>
>>> On 4/12/2011 1:50 PM, Ian Dees wrote:
>>>
>>> Yes, one document per node/way/relation.
>>>
>>> On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast  wrote:
>>>
  how was the data put in the db though? 1 document per node?


 On 4/12/2011 1:39 PM, Nolan Darilek wrote:

 Oopse, meant for this to go to the whole list.



 Original Message
 Subject: Re: [OSM-dev] OSM and MongoDB
 Date: Tue, 12 Apr 2011 15:26:41 -0500
 From: Nolan Darilek
 To: Ian Dees

 I had/am having a somewhat bad experience storing OSM data in MongoDB.

 Initially I stored all map data in MongoDB, but queries took ages. The
 same queries that happen in 100-200 MS now often took nearly a second.
 Additionally, some took upwards of 5, and I even found spots on my map
 sparsely populated with points, but which reliably performed the queries I
 need in 30+ seconds.

 I filed a thorough bug in their tracker, including a dataset and queries
 that reliably duplicated the issue. It was marked wontfix, I abandoned
 MongoDB, and it was apparently re-opened and fixed several months later. So
 perhaps it's a non-issue now.

 I'm still using MongoDB for part of my current project, user POI
 storage. It does indeed use geohashes, and I'm experiencing strange
 accuracy issues. My platform is pedestrian navigation with many small
 distance queries. Points in the non-MongoDB dataset are reliably detected
 in a radius roughly 100 meters around the traveler. Points in MongoDB
 queried with the same bounding boxes don't appear until they're within
 30-40 meters. I
 recently updated from an older version to a new build of 1.8. The older
 version widely varied the detection range. Some points were detected 100 or
 so meters out, while others weren't picked up until 30 or so. It was always
 the same points, too. The point for my apartment remains reliably visible
 for ~100 meters or so, while the corner store and restaurant didn't appear
 until I was very close. 1.8 at least appears to be consistent, always
 detecting at 30 meters or so. I can only assume that this is a geohash
 oddity that only appears for very small differences, something that works
 out to rounding error for larger values.

 I like MongoDB for many things, but not for geospatial data more
 complicated than a series of points. I'm working on migrating user/POI
 storage to a geospatial store.


 On 04/12/2011 01:20 PM, Ian Dees wrote:

 Yep, and I think Mongo uses geohashes as their index behind the scenes.
 One of the problems with that, though, is they have some arbitrary length
 that they compute the geohash to and when you have lots of points (as OSM
 data does) the buckets they're searching are very full.

 On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast wrote:

>  bbox queries using the built in spatial indexing presumably? OSM has
> it's own magical bitmask for that, that may also be as fast in mongo, who
> knows.
>
>
> On 4/11/2011 5:58 PM, Ian Dees wrote:
>
>  On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo 
> wrote:
>
>>  Hi,
>>
>>
>>
>> I am working on evaluation of MongoDB for several storage solutions at
>> hand. Some of them resemble the current OSM editing database. I have
>> heard that OSM dev is/was evaluating MongoDB also. I was wondering
>> whether it is possible to share the findings?
>>
>>
>>
>
>  In my experimentation with MongoDB (seen here:
> https://github.com/iandees/mongosm/) I found it to be very slow.
> Inserts were speedy, but bounding-box queries took a long time.
>
>>>

Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Greg Studer
MongoDB does use a geohash as the indexing method for geo-searches, but
I'm pretty sure that's not the cause of the huge query times.  The
geohashing tends to be very fast, but the way points were buffered for
return in pre-1.9 releases could, for particular point distributions,
cause these slowdowns - I'm guessing the neighboring boxes had many more
points.

Exact point checks and distances are also being introduced in 1.9, so
when/if the hash isn't precise enough to complete your search, you
shouldn't get these types of inaccurate results (the hash is currently
tunable to 32 bits of precision).  Of course, these are all new
developments (along with polygon searches and multi-location documents);
geo-indexing has gotten a lot of attention as of late.
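
As a rough illustration of why a hash alone can misjudge short distances,
and why an exact check helps, the Python sketch below interleaves
longitude/latitude bits into a cell key and then post-filters candidates
by great-circle distance.  This is not MongoDB's implementation: the
function names, the bit layout, treating "bits" as bits per coordinate,
and the sample coordinates are all assumptions made for the example.

```python
import math


def geohash_bits(lon, lat, bits):
    """Interleave `bits` bits each of longitude and latitude into one integer key."""
    lon_range, lat_range = [-180.0, 180.0], [-90.0, 90.0]
    key = 0
    for _ in range(bits):
        for value, rng in ((lon, lon_range), (lat, lat_range)):
            mid = (rng[0] + rng[1]) / 2
            bit = 1 if value >= mid else 0
            key = (key << 1) | bit
            rng[0 if bit else 1] = mid  # keep the half that contains the value
    return key


def cell_width_m(bits, lat):
    """Approximate east-west width in meters of one hash cell at latitude `lat`."""
    meters_per_degree = 111_320 * math.cos(math.radians(lat))
    return 360.0 / (2 ** bits) * meters_per_degree


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    r = 6_371_000  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def within_radius(points, center, radius_m):
    """Exact post-filter: keep only points truly within radius_m of center."""
    return [p for p in points if haversine_m(*center, *p) <= radius_m]


if __name__ == "__main__":
    traveler = (-93.2650, 44.9778)
    poi = (-93.2645, 44.9778)  # a short walk east of the traveler
    # At a coarse precision both points share one cell, so the hash alone
    # cannot tell how far apart they are.
    print(geohash_bits(*traveler, 16) == geohash_bits(*poi, 16))
    print(round(cell_width_m(16, traveler[1])), "m cell width at 16 bits")
    # The exact distance check separates a 30 m query from a 100 m query.
    print(within_radius([poi], traveler, 30), within_radius([poi], traveler, 100))
```

The hash narrows the search to candidate cells; the distance filter then
decides membership, which is the role the exact checks play in the
paragraph above.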

disclaimer: as per my email address, I work at 10gen on MongoDB

On Wed, 2011-04-13 at 08:52 -0500, Ian Dees wrote: 
> 
> 
> On Tue, Apr 12, 2011 at 3:56 PM, Steve Coast 
> wrote:
> Interesting.
> 
> How efficient is the (big)int indexing and/or masking?
> 
> 
> 

> 
> I haven't had a chance to look at the integer indexing/masking. If I
> remember it from discussions on dev a long while ago I think it's very
> close to geohashes.
>  
> 
> Was this all on a single machine? 
> 
> 
> Yes.
>  
> 
> 
> 
> 
> 
> On 4/12/2011 1:52 PM, Ian Dees wrote: 
> > Yep.
> > 
> > On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast
> >  wrote: 
> > and using the builtin spatial index? 
> > 
> > 
> > 
> > On 4/12/2011 1:50 PM, Ian Dees wrote: 
> > > Yes, one document per node/way/relation.
> > > 
> > > On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast
> > >  wrote: 
> > > how was the data put in the db though? 1
> > > document per node? 
> > > 
> > > 
> > > On 4/12/2011 1:39 PM, Nolan Darilek
> > > wrote: 
> > > > Oopse, meant for this to go to the whole
> > > > list.
> > > > 
> > > > 
> > > > 
> > > >  Original Message  
> > > >Subject: 
> > > > Re: [OSM-dev] OSM
> > > > and MongoDB
> > > >   Date: 
> > > > Tue, 12 Apr 2011
> > > > 15:26:41 -0500
> > > >   From: 
> > > > Nolan Darilek
> > > > 
> > > > To: 
> > > > Ian Dees
> > > > 
> > > > 
> > > > 
> > > > I had/am having a somewhat bad
> > > > experience storing OSM data in MongoDB.
> > > > 
> > > > Initially I stored all map data in
> > > > MongoDB, but queries took ages. The same
> > > > queries that happen in 100-200 MS now
> > > > often took nearly a second.
> > > > Additionally, some took upwards of 5,
> > > > and I even found spots on my map
> > > > sparsely populated with points, but
> > > > which reliably performed the queries I
> > > > need in 30+ seconds.
> > > > 
> > > > I filed a thorough bug in their tracker,
> > > > including a dataset and queries that
> > > > reliably duplicated the issue. It was
> > > > marked wontfix, I abandoned MongoDB, and
> > > > it was apparently re-opened and fixed
> > > > several months later. So perhaps it's a
> > > > non-issue now.
> > > > 
> > > > I'm still using MongoDB for part of my
> > > > current project, user POI storage. It
> > > > does indeed use geohashes, and I'm
> > > > experiencing strange accuracy issues. My
> > > > platform is pedestrian navigation with
> > > > many small distance queries. Points in
> > > > the non-MongoDB dataset are reliably
> > > > detected in a radius roughly 100 meters
> > > > around the traveler. Points in MongoD

Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-13 Thread Ian Dees
On Tue, Apr 12, 2011 at 3:56 PM, Steve Coast  wrote:

>  Interesting.
>
> How efficient is the (big)int indexing and/or masking?
>

I haven't had a chance to look at the integer indexing/masking. If I
remember it from discussions on dev a long while ago I think it's very close
to geohashes.


>
> Was this all on a single machine?
>

Yes.


>
>
>
>
> On 4/12/2011 1:52 PM, Ian Dees wrote:
>
> Yep.
>
> On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast  wrote:
>
>>  and using the builtin spatial index?
>>
>>
>>
>> On 4/12/2011 1:50 PM, Ian Dees wrote:
>>
>> Yes, one document per node/way/relation.
>>
>> On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast  wrote:
>>
>>>  how was the data put in the db though? 1 document per node?
>>>
>>>
>>> On 4/12/2011 1:39 PM, Nolan Darilek wrote:
>>>
>>> Oopse, meant for this to go to the whole list.
>>>
>>>
>>>
>>> Original Message
>>> Subject: Re: [OSM-dev] OSM and MongoDB
>>> Date: Tue, 12 Apr 2011 15:26:41 -0500
>>> From: Nolan Darilek
>>> To: Ian Dees
>>>
>>> I had/am having a somewhat bad experience storing OSM data in MongoDB.
>>>
>>> Initially I stored all map data in MongoDB, but queries took ages. The
>>> same queries that happen in 100-200 MS now often took nearly a second.
>>> Additionally, some took upwards of 5, and I even found spots on my map
>>> sparsely populated with points, but which reliably performed the queries I
>>> need in 30+ seconds.
>>>
>>> I filed a thorough bug in their tracker, including a dataset and queries
>>> that reliably duplicated the issue. It was marked wontfix, I abandoned
>>> MongoDB, and it was apparently re-opened and fixed several months later. So
>>> perhaps it's a non-issue now.
>>>
>>> I'm still using MongoDB for part of my current project, user POI storage.
>>> It does indeed use geohashes, and I'm experiencing strange accuracy issues.
>>> My platform is pedestrian navigation with many small distance queries.
>>> Points in the non-MongoDB dataset are reliably detected in a radius roughly
>>> 100 meters around the traveler. Points in MongoDB queried with the same
>>> bounding boxes don't appear until they're within 30-40 meters. I recently
>>> updated from an older version to a new build of 1.8. The older version
>>> widely varied the detection range. Some points were detected 100 or so
>>> meters out, while others weren't picked up until 30 or so. It was always the
>>> same points, too. The point for my apartment remains reliably visible for
>>> ~100 meters or so, while the corner store and restaurant didn't appear until
>>> I was very close. 1.8 at least appears to be consistent, always detecting at
>>> 30 meters or so. I can only assume that this is a geohash oddity that only
>>> appears for very small differences, something that works out to rounding
>>> error for larger values.
>>>
>>> I like MongoDB for many things, but not for geospatial data more
>>> complicated than a series of points. I'm working on migrating user/POI
>>> storage to a geospatial store.
>>>
>>>
>>> On 04/12/2011 01:20 PM, Ian Dees wrote:
>>>
>>> Yep, and I think Mongo uses geohashes as their index behind the scenes.
>>> One of the problems with that, though, is they have some arbitrary length
>>> that they compute the geohash to and when you have lots of points (as OSM
>>> data does) the buckets they're searching are very full.
>>>
>>> On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast  wrote:
>>>
  bbox queries using the built in spatial indexing presumably? OSM has
 it's own magical bitmask for that, that may also be as fast in mongo, who
 knows.


 On 4/11/2011 5:58 PM, Ian Dees wrote:

  On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo 
 wrote:

>  Hi,
>
>
>
> I am working on evaluation of MongoDB for several storage solutions at
> hand. Some of them resemble the current OSM editing database. I have
> heard that OSM dev is/was evaluating MongoDB also. I was wondering
> whether it is possible to share the findings?
>
>
>

  In my experimentation with MongoDB (seen here:
 https://github.com/iandees/mongosm/) I found it to be very slow.
 Inserts were speedy, but bounding-box queries took a long time.

  The most recent dev version of MongoDB includes "multi-location
 documents" support:

 http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments

  This would allow a single way document to be indexed at multiple
 locations and vastly speed up the map query.



Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Steve Coast

Interesting.

How efficient is the (big)int indexing and/or masking?

Was this all on a single machine?



On 4/12/2011 1:52 PM, Ian Dees wrote:

Yep.

On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast wrote:


and using the builtin spatial index?



On 4/12/2011 1:50 PM, Ian Dees wrote:

Yes, one document per node/way/relation.

On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast wrote:

how was the data put in the db though? 1 document per node?


On 4/12/2011 1:39 PM, Nolan Darilek wrote:

Oopse, meant for this to go to the whole list.



 Original Message 
Subject:Re: [OSM-dev] OSM and MongoDB
Date:   Tue, 12 Apr 2011 15:26:41 -0500
From:   Nolan Darilek 

To: Ian Dees  



I had/am having a somewhat bad experience storing OSM data
in MongoDB.

Initially I stored all map data in MongoDB, but queries took
ages. The same queries that happen in 100-200 MS now often
took nearly a second. Additionally, some took upwards of 5,
and I even found spots on my map sparsely populated with
points, but which reliably performed the queries I need in
30+ seconds.

I filed a thorough bug in their tracker, including a dataset
and queries that reliably duplicated the issue. It was
marked wontfix, I abandoned MongoDB, and it was apparently
re-opened and fixed several months later. So perhaps it's a
non-issue now.

I'm still using MongoDB for part of my current project, user
POI storage. It does indeed use geohashes, and I'm
experiencing strange accuracy issues. My platform is
pedestrian navigation with many small distance queries.
Points in the non-MongoDB dataset are reliably detected in a
radius roughly 100 meters around the traveler. Points in
MongoDB queried with the same bounding boxes don't appear
until they're within 30-40 meters. I recently updated from
an older version to a new build of 1.8. The older version
widely varied the detection range. Some points were detected
100 or so meters out, while others weren't picked up until
30 or so. It was always the same points, too. The point for
my apartment remains reliably visible for ~100 meters or so,
while the corner store and restaurant didn't appear until I
was very close. 1.8 at least appears to be consistent,
always detecting at 30 meters or so. I can only assume that
this is a geohash oddity that only appears for very small
differences, something that works out to rounding error for
larger values.

I like MongoDB for many things, but not for geospatial data
more complicated than a series of points. I'm working on
migrating user/POI storage to a geospatial store.


On 04/12/2011 01:20 PM, Ian Dees wrote:

Yep, and I think Mongo uses geohashes as their index behind
the scenes. One of the problems with that, though, is they
have some arbitrary length that they compute the geohash to
and when you have lots of points (as OSM data does) the
buckets they're searching are very full.

On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast wrote:

bbox queries using the built in spatial indexing
presumably? OSM has it's own magical bitmask for that,
that may also be as fast in mongo, who knows.


On 4/11/2011 5:58 PM, Ian Dees wrote:

On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo wrote:

Hi,

I am working on evaluation of MongoDB for several
storage solutions at hand. Some of them resemble
current OSM editing database. I have heard that
OSM dev is/was evaluating MongoDB also. I was
wondering whether it possible to share the findings?


In my experimentation with MongoDB (seen here:
https://github.com/iandees/mongosm/) I found it to be
very slow. Inserts were speedy, but bounding-box
queries took a long time.

The most recent dev version of MongoDB includes
"multi-location documents" support:

http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments

This would allow a single way document to be indexed
at multiple locations and vastly speed up the map query.



Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Nolan Darilek

On 04/12/2011 03:47 PM, Steve Coast wrote:

how was the data put in the db though? 1 document per node?




Yes, with deeper structures for ways and relations.
___
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Ian Dees
Yep.

On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast  wrote:

>  and using the builtin spatial index?
>
>
>
> On 4/12/2011 1:50 PM, Ian Dees wrote:
>
> Yes, one document per node/way/relation.
>
> On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast  wrote:
>
>>  how was the data put in the db though? 1 document per node?
>>
>>
>> On 4/12/2011 1:39 PM, Nolan Darilek wrote:
>>
>> Oopse, meant for this to go to the whole list.
>>
>>
>>
>> Original Message
>> Subject: Re: [OSM-dev] OSM and MongoDB
>> Date: Tue, 12 Apr 2011 15:26:41 -0500
>> From: Nolan Darilek
>> To: Ian Dees
>>
>> I had/am having a somewhat bad experience storing OSM data in MongoDB.
>>
>> Initially I stored all map data in MongoDB, but queries took ages. The
>> same queries that happen in 100-200 MS now often took nearly a second.
>> Additionally, some took upwards of 5, and I even found spots on my map
>> sparsely populated with points, but which reliably performed the queries I
>> need in 30+ seconds.
>>
>> I filed a thorough bug in their tracker, including a dataset and queries
>> that reliably duplicated the issue. It was marked wontfix, I abandoned
>> MongoDB, and it was apparently re-opened and fixed several months later. So
>> perhaps it's a non-issue now.
>>
>> I'm still using MongoDB for part of my current project, user POI storage.
>> It does indeed use geohashes, and I'm experiencing strange accuracy issues.
>> My platform is pedestrian navigation with many small distance queries.
>> Points in the non-MongoDB dataset are reliably detected in a radius roughly
>> 100 meters around the traveler. Points in MongoDB queried with the same
>> bounding boxes don't appear until they're within 30-40 meters. I recently
>> updated from an older version to a new build of 1.8. The older version
>> widely varied the detection range. Some points were detected 100 or so
>> meters out, while others weren't picked up until 30 or so. It was always the
>> same points, too. The point for my apartment remains reliably visible for
>> ~100 meters or so, while the corner store and restaurant didn't appear until
>> I was very close. 1.8 at least appears to be consistent, always detecting at
>> 30 meters or so. I can only assume that this is a geohash oddity that only
>> appears for very small differences, something that works out to rounding
>> error for larger values.
>>
>> I like MongoDB for many things, but not for geospatial data more
>> complicated than a series of points. I'm working on migrating user/POI
>> storage to a geospatial store.
>>
>>
>> On 04/12/2011 01:20 PM, Ian Dees wrote:
>>
>> Yep, and I think Mongo uses geohashes as their index behind the scenes.
>> One of the problems with that, though, is they have some arbitrary length
>> that they compute the geohash to and when you have lots of points (as OSM
>> data does) the buckets they're searching are very full.
>>
>> On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast  wrote:
>>
>>>  bbox queries using the built in spatial indexing presumably? OSM has
>>> it's own magical bitmask for that, that may also be as fast in mongo, who
>>> knows.
>>>
>>>
>>> On 4/11/2011 5:58 PM, Ian Dees wrote:
>>>
>>>  On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo wrote:
>>>
  Hi,



 I am working on evaluation of MongoDB for several storage solutions at
 hand. Some of them resemble current OSM editing database. I have heard that
 OSM dev is/was evaluating MongoDB also. I was wondering whether it possible
 to share the findings?



>>>
>>>  In my experimentation with MongoDB (seen here:
>>> https://github.com/iandees/mongosm/) I found it to be very slow. Inserts
>>> were speedy, but bounding-box queries took a long time.
>>>
>>>  The most recent dev version of MongoDB includes "multi-location
>>> documents" support:
>>>
>>> http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments
>>>
>>>  This would allow a single way document to be indexed at multiple
>>> locations and vastly speed up the map query.
>>>
>>>
>


Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Steve Coast

and using the builtin spatial index?


On 4/12/2011 1:50 PM, Ian Dees wrote:

Yes, one document per node/way/relation.

On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast wrote:


how was the data put in the db though? 1 document per node?


___
dev mailing list
[email protected]  
http://lists.openstreetmap.org/listinfo/dev


   

Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Ian Dees
Yes, one document per node/way/relation.

On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast wrote:

>  how was the data put in the db though? 1 document per node?
___
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Steve Coast

how was the data put in the db though? 1 document per node?

___
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev


[OSM-dev] Fwd: Re: OSM and MongoDB

2011-04-12 Thread Nolan Darilek

Oops, meant for this to go to the whole list.



-------- Original Message --------
Subject: Re: [OSM-dev] OSM and MongoDB
Date: Tue, 12 Apr 2011 15:26:41 -0500
From: Nolan Darilek
To: Ian Dees



I had/am having a somewhat bad experience storing OSM data in MongoDB.

Initially I stored all map data in MongoDB, but queries took ages. The
same queries that now run in 100-200 ms often took nearly a second.
Some took upwards of 5 seconds, and I even found spots on my map,
sparsely populated with points, where the queries I need reliably took
30+ seconds.


I filed a thorough bug in their tracker, including a dataset and queries
that reliably reproduced the issue. It was marked wontfix, I abandoned
MongoDB, and it was apparently re-opened and fixed several months later.
So perhaps it's a non-issue now.


I'm still using MongoDB for part of my current project, user POI
storage. It does indeed use geohashes, and I'm experiencing strange
accuracy issues. My platform is pedestrian navigation with many
small-distance queries. Points in the non-MongoDB dataset are reliably
detected within a radius of roughly 100 meters around the traveler.
Points in MongoDB queried with the same bounding boxes don't appear
until they're within 30-40 meters. I recently updated from an older
version to a new build of 1.8. With the older version the detection
range varied widely: some points were detected 100 or so meters out,
while others weren't picked up until 30 or so. It was always the same
points, too. The point for my apartment remained reliably visible from
~100 meters or so, while the corner store and restaurant didn't appear
until I was very close. 1.8 at least appears to be consistent, always
detecting at 30 meters or so. I can only assume that this is a geohash
oddity that only shows up for very small differences, something that
works out to rounding error for larger values.
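
One way to check which store is reporting the right detection range is
to compute the great-circle distance independently and compare it with
what each database returns. The sketch below is a plain haversine
function plus a radius filter. The function names, the mean Earth radius
of 6,371,000 m and the sample coordinates are illustrative assumptions,
not taken from the project described above.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumed constant


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(traveler, points, radius_m):
    """Return the (lat, lon) points whose true distance from the traveler
    is at most radius_m meters."""
    lat, lon = traveler
    return [p for p in points
            if haversine_m(lat, lon, p[0], p[1]) <= radius_m]


# Hypothetical points: one about 56 m away, one about 1.1 km away.
nearby = within_radius((0.0, 0.0), [(0.0, 0.0005), (0.0, 0.01)], 100.0)
```

Running the same candidate points through a filter like this and through
the database query should show whether points are being dropped by the
index or by the query geometry.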


I like MongoDB for many things, but not for geospatial data more 
complicated than a series of points. I'm working on migrating user/POI 
storage to a geospatial store.



On 04/12/2011 01:20 PM, Ian Dees wrote:
Yep, and I think Mongo uses geohashes as their index behind the 
scenes. One of the problems with that, though, is they have some 
arbitrary length that they compute the geohash to and when you have 
lots of points (as OSM data does) the buckets they're searching are 
very full.
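
To make the bucket problem concrete, here is a minimal sketch of the
standard public geohash encoding (alternating longitude and latitude
bits, base32 output). It is not MongoDB's internal index code; the
function names and sample coordinates are assumptions for illustration.
With a fixed hash length, every point inside the same cell shares one
key, so dense data such as OSM nodes piles up in a few buckets.

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, length=6):
    """Encode a lat/lon pair (degrees) as a geohash string of `length` chars."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # bits collected for the current base32 character
    nbits = 0
    use_lon = True   # geohash starts with a longitude bit
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:   # five bits make one base32 character
            chars.append(BASE32[value])
            value = 0
            nbits = 0
    return "".join(chars)


def bucket_counts(points, length):
    """Count how many (lat, lon) points fall into each fixed-length cell."""
    return Counter(geohash_encode(lat, lon, length) for lat, lon in points)
```

For example, `bucket_counts(points, 5)` groups points into cells a few
kilometers across; a city's worth of nodes would land in a handful of
keys, and each query then has to scan those full buckets.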


On Tue, Apr 12, 2011 at 1:00 PM, Steve Coast wrote:


bbox queries using the built in spatial indexing presumably? OSM
has its own magical bitmask for that, that may also be as fast in
mongo, who knows.
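
The "magical bitmask" refers to OSM's quadtile scheme: longitude and
latitude are each scaled to a 16-bit integer and their bits are
interleaved into a single 32-bit tile number that an ordinary integer
index can store. The sketch below illustrates the idea only; the scaling
constants and the function name are assumptions and it is not a copy of
the OSM API code.

```python
def quadtile(lat, lon):
    """Interleave 16-bit scaled lon (x) and lat (y) into a 32-bit tile id."""
    x = round((lon + 180.0) * 65535.0 / 360.0)   # 0..65535 across longitude
    y = round((lat + 90.0) * 65535.0 / 180.0)    # 0..65535 across latitude
    tile = 0
    for bit in range(15, -1, -1):                # most significant bit first
        tile = (tile << 1) | ((x >> bit) & 1)
        tile = (tile << 1) | ((y >> bit) & 1)
    return tile
```

Because nearby points tend to share high-order bits, the same integer
column could in principle be stored and indexed in MongoDB as a regular
field, which is what the question above is getting at.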


On 4/11/2011 5:58 PM, Ian Dees wrote:

On Mon, Apr 11, 2011 at 6:36 PM, Sergey Galuzo wrote:

Hi,

I am evaluating MongoDB for several storage solutions at hand. Some
of them resemble the current OSM editing database. I have heard that
OSM dev is/was evaluating MongoDB also. I was wondering whether it is
possible to share the findings?


In my experimentation with MongoDB (seen here:
https://github.com/iandees/mongosm/) I found it to be very slow.
Inserts were speedy, but bounding-box queries took a long time.

The most recent dev version of MongoDB includes "multi-location
documents" support:

http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments

This would allow a single way document to be indexed at multiple
locations and vastly speed up the map query.
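
As an illustration of what that could look like, the sketch below builds
a hypothetical way document carrying one [lon, lat] pair per node, plus a
bounding-box query document in the `$within`/`$box` form used by MongoDB
releases of that era. The field names (`loc`, `tags`) are assumptions,
not the schema mongosm uses, and `way_matches_box` only mimics in plain
Python what the server would do: a document matches if any one of its
locations falls inside the box.

```python
def make_way_document(way_id, node_coords, tags):
    """Build a way document with every node position under one 'loc' field.

    node_coords is a list of (lon, lat) tuples; field names are hypothetical.
    """
    return {
        "_id": way_id,
        "loc": [[lon, lat] for lon, lat in node_coords],
        "tags": tags,
    }


def bbox_query(min_lon, min_lat, max_lon, max_lat):
    """Query document asking for anything located inside the box."""
    return {"loc": {"$within": {"$box": [[min_lon, min_lat],
                                         [max_lon, max_lat]]}}}


def way_matches_box(doc, query):
    """Local stand-in for the server: True if any location is in the box."""
    (min_lon, min_lat), (max_lon, max_lat) = query["loc"]["$within"]["$box"]
    return any(min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
               for lon, lat in doc["loc"])


# Hypothetical way with three nodes; only the middle one is in the box.
way = make_way_document(1, [(-93.30, 44.95), (-93.27, 44.98), (-93.20, 45.02)],
                        {"highway": "residential"})
hit = way_matches_box(way, bbox_query(-93.28, 44.97, -93.26, 44.99))
```

The point of the design is that a map query no longer has to fetch
matching nodes first and then look up the ways that reference them; the
way itself is found directly by the spatial index.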


___
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev