Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-14 Thread Florian Lohoff
On Fri, Sep 12, 2008 at 04:50:00PM +0200, Andreas Kalsch wrote:
 Subject: Re: [OSM-dev] Spatial vs. multi-column indexes for points

 Thanks!
 
 And this really outperforms a simple query like ... latitude between
 y1 and y2 and longitude between x1 and x2 ?

This would either need a combined multi-column index or be expensive
because 2 seperated indexes and a merge of the results which the
database will probably do as a full_seq_scan.

 I will use 2^24 tiles as length. It looks like tiles_for_area goes
 through all tiles which are in the bounding box. So if I have an area
 with e.g. 5x5 kilometers, it will loop through 2000x2000 = 4 Million
 tiles. And so the SQL for this are will be quite big, won't it?

IIRC it uses the top most 16 bit of lat and lon and shifts them together
in an int32. As lat lon bits get shifted in round-robin you have most
likely a very good distribution and a very efficient index on lat and
lon.

For the query bbox=8.3426546875,51.88222734375,8.4305453125,51.92617265625

it results in this:

tile BETWEEN 3496212715 AND 3496212719 OR tile BETWEEN 3496212728 AND
3496212735 OR tile BETWEEN 3496212904 AND 3496212911 OR tile BETWEEN
3496212920 AND 3496212927 OR tile BETWEEN 3496212968 AND 3496212971 OR
tile BETWEEN 3496213059 AND 3496213063 OR tile BETWEEN 3496213067 AND
3496213087 OR tile BETWEEN 3496213091 AND 3496213095 OR tile BETWEEN
3496213099 AND 3496213119 OR tile BETWEEN 3496213187 AND 3496213191 OR
tile BETWEEN 3496213195 AND 3496213215 OR tile BETWEEN 3496213219 AND
3496213223 OR tile BETWEEN 3496213232 AND 3496213241 OR tile BETWEEN
3496213248 AND 3496213315 OR tile BETWEEN 3496213320 AND 3496213323 OR
tile BETWEEN 3496213344 AND 3496213347 OR tile BETWEEN 3496213352 AND
3496213355 OR tile BETWEEN 3496213376 AND 3496213417 OR tile BETWEEN
3496213424 AND 3496213433 OR tile BETWEEN 3496213440 AND 3496213443 OR
tile BETWEEN 3496213448 AND 3496213451 OR tile BETWEEN 3496213472 AND
3496213475 OR tile IN (3496212713,3496213057,3496213065,3496213089,
3496213097,3496213185,3496213193,3496213217,3496213225,3496213228,
3496213229,3496213244,3496213245,3496213420,3496213421,3496213436,
3496213437,3496213480,3496213481)

Anyway - postgres/postgis is faster than mysql and i prefer postgres
anyway so i'd not have to deal with this.

Flo
-- 
Florian Lohoff  [EMAIL PROTECTED] +49-171-2280134
Those who would give up a little freedom to get a little 
  security shall soon have neither - Benjamin Franklin


signature.asc
Description: Digital signature
___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-12 Thread Andreas Kalsch
 On Thu, Sep 11, 2008 at 08:31:23PM +0200, Andreas Kalsch wrote:
  All, thanks for your quick responses!
  
  Quad tiles look like a smart way to create an index. So to lookup a
  single point or a quad tile, this is fine. But for my application I
  need another lookup - by bounding box with any ratio and size. Is
  there a way to look up a special bounding box with this index? I
  think this will be a little more complicated without conventional
  multi-column indices. Hmm, I think you could take a maximum number
  of quad lookups which contain the requested box, what do you think?
 
 The code in the api calculates all possible quadtile areas and does a 
 quadtile in ( x, y, z, a, b, c, d, e, f ) or quadtile between x and y
 etc

Can you give me the position of this code in SVN, I am sure will understand it 
more deeply then.
-- 
Der GMX SmartSurfer hilft bis zu 70% Ihrer Onlinekosten zu sparen! 
Ideal für Modem und ISDN: http://www.gmx.net/de/go/smartsurfer

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-12 Thread Andreas Kalsch
  My current optimization includes:
 
  
  - using mediumint for lat/lon - enough for ~2 meters resolution
  - using a bounding box first for point+radius calculation and then
 selecting the circle with a Pythagoras approximation, which is exact enough
  
  For the first, I want to use MySQL.
  
  If you are interested in some benchmark data, I can post ...
 
 As an aside, there is currently a lack of good benchmarks on the different
 optimizations  hacks for spatial querying, in different circumstances,
 so anything contributing towards guidelines for developers would be most
 helpful

OK, some things I have quickly writte down while testing, made on:
- MySQL 5.0.67
- MacBook Intel Core 2 Duo 
- 2 GHz
- Mac OS 10.4

Time in secs

There are several ways to get the points within a circle ...
- I precompute some values which will be constant during the query - this is 
quicker
- I use mediumints for lat/lon

1) correct, bust most expensive:
SET @latitude=48;
SET @longitude=13;
SET @latitudeM = getMed(@latitude); /* MySQL function */
SET @longitudeM = getMed(@longitude);
SET @latitudeSin=sin(radians(@latitude));
SET @latitudeCos=cos(radians(@latitude));
SELECT BENCHMARK(
100,
degrees(acos(
@latitudeSin*sin(radians(230)) 
+ @latitudeCos*cos(radians(230))*cos(radians([EMAIL PROTECTED]))
)));
.62 secs

2) approximation - really good results for radius up to 10-20°:
SET @longitudeFactor=cos(radians(@latitude));
SELECT BENCHMARK(
100, 
sqrt(pow(@latitudeM - 230, 2) + pow(@longitudeFactor * (@longitudeM 
- 25), 2))
);
.40 secs

3) OK, sqrt can be omitted in the resulting query
SELECT BENCHMARK(
100, 
pow(@latitudeM - 230, 2) + pow(@longitudeFactor * (@longitudeM - 
25), 2)
);
.36 secs


If there are more people interested in concrete computation based on 500.000 
rows I can post them, too.

Andi
-- 
Der GMX SmartSurfer hilft bis zu 70% Ihrer Onlinekosten zu sparen! 
Ideal für Modem und ISDN: http://www.gmx.net/de/go/smartsurfer

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-12 Thread Florian Lohoff
On Fri, Sep 12, 2008 at 08:20:18AM +0200, Andreas Kalsch wrote:
 Subject: Re: [OSM-dev] Spatial vs. multi-column indexes for points
 
  On Thu, Sep 11, 2008 at 08:31:23PM +0200, Andreas Kalsch wrote:
   All, thanks for your quick responses!
   
   Quad tiles look like a smart way to create an index. So to lookup a
   single point or a quad tile, this is fine. But for my application I
   need another lookup - by bounding box with any ratio and size. Is
   there a way to look up a special bounding box with this index? I
   think this will be a little more complicated without conventional
   multi-column indices. Hmm, I think you could take a maximum number
   of quad lookups which contain the requested box, what do you think?
  
  The code in the api calculates all possible quadtile areas and does a 
  quadtile in ( x, y, z, a, b, c, d, e, f ) or quadtile between x and y
  etc
 
 Can you give me the position of this code in SVN, I am sure will understand 
 it more deeply then.

applications/utils/export/osm2ai/osm2ai.pl

Is a perl variant - look for sql_for_area

In the rails_ports its done with a C helper - look in

sites/rails_port/lib/quad_tile/

Flo
-- 
Florian Lohoff  [EMAIL PROTECTED] +49-171-2280134
Those who would give up a little freedom to get a little 
  security shall soon have neither - Benjamin Franklin


signature.asc
Description: Digital signature
___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-11 Thread Sascha Silbe

On Thu, Sep 11, 2008 at 01:49:55PM +0200, Andreas Kalsch wrote:

Has anybody used GIS successfully in MySQL or PGSQL and can tell me 
how the performance compares between the two techniques?
At least in PostgreSQL/PostGIS, the geometric indices are WAY faster. 
Just remember to either use the ST_* family of predicates (implicit 
bounding box scan) or add an  term to do an explicit bounding box 
scan so the index is actually used.


CU Sascha

--
http://sascha.silbe.org/
http://www.infra-silbe.de/


signature.asc
Description: Digital signature
___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-11 Thread Andreas Kalsch
All, thanks for your quick responses!

Quad tiles look like a smart way to create an index. So to lookup a single 
point or a quad tile, this is fine. But for my application I need another 
lookup - by bounding box with any ratio and size. Is there a way to look up a 
special bounding box with this index? I think this will be a little more 
complicated without conventional multi-column indices. Hmm, I think you could 
take a maximum number of quad lookups which contain the requested box, what do 
you think?

My current optimization includes:

- using mediumint for lat/lon - enough for ~2 meters resolution
- using a bounding box first for point+radius calculation and then selecting 
the circle with a Pythagoras approximation, which is exact enough

For the first, I want to use MySQL.

If you are interested in some benchmark data, I can post ...

 Instead of indexing by precise coordinates you could index by virtual
 tiles as it is done in OSM's main DB since a year ago with nice
 performance boost:
 
 http://wiki.openstreetmap.org/index.php/QuadTiles
 
 good luck,
 Stefan
 
 
 On Thu, Sep 11, 2008 at 1:49 PM, Andreas Kalsch [EMAIL PROTECTED]
 wrote:
  Hey,
 
  last week I made some experiments with huge datasets of lat/lon points.
 I use MySQL 5.0, which partially support GIS extensions, including R-trees.
 But it is still not able to make queries based on the GIS features, so I
 have to use the normal way - multi-column indexes on lat/lon columns. It
 works well but probably there is a way to make it even quicker ;)
 
  Has anybody used GIS successfully in MySQL or PGSQL and can tell me how
 the performance compares between the two techniques?
 
  Thanks,
 
  Andi
  --
  Pt! Schon das coole Video vom GMX MultiMessenger gesehen?
  Der Eine für Alle: http://www.gmx.net/de/go/messenger03
 
  ___
  dev mailing list
  dev@openstreetmap.org
  http://lists.openstreetmap.org/listinfo/dev
 

-- 
Pt! Schon vom neuen GMX MultiMessenger gehört? Der kann`s mit allen: 
http://www.gmx.net/de/go/multimessenger

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-11 Thread Ian Dees
On Thu, Sep 11, 2008 at 1:31 PM, Andreas Kalsch [EMAIL PROTECTED]wrote:

 All, thanks for your quick responses!

 Quad tiles look like a smart way to create an index. So to lookup a single
 point or a quad tile, this is fine. But for my application I need another
 lookup - by bounding box with any ratio and size. Is there a way to look up
 a special bounding box with this index?


Take a look at the MySQL 5 documentation here:

http://dev.mysql.com/doc/refman/5.0/en/relations-on-geometry-mbr.html

When I was working on a read-only mirror of the OSM API, I created a
LINESTRING with the top left and bottom right as the end points. This
created a bounding box in MySQL that I could then use in one of the MBR
functions on that page.

Check out http://svn.yellowbkpk.com/geo/trunk/bbox.php to see the SQL
queries I was using. They aren't very good, but they'll show you how to use
the MBR stuff in MySQL.
___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Spatial vs. multi-column indexes for points

2008-09-11 Thread Mikel Maron
 My current optimization includes:

 
 - using mediumint for lat/lon - enough for ~2 meters resolution
 - using a bounding box first for point+radius calculation and then selecting 
 the circle with a Pythagoras approximation, which is exact enough
 
 For the first, I want to use MySQL.
 
 If you are interested in some benchmark data, I can post ...

As an aside, there is currently a lack of good benchmarks on the different 
optimizations  hacks for spatial querying, in different circumstances,
so anything contributing towards guidelines for developers would be most helpful
___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev