hi mark, thanks for your quick reply. i have tried your advice and rebuilded all my gist indexes on the table (via drop index, create index) and run a vacuum full afterwards. but that did not change the fact, that the query on the_geom is much slower than the same one on the_geom_1.
after that my colleague has tried another thing out: "I have a small table (about 200 entries) containing large (multipolygon) geometries (as discussed in http://postgis.refractions.net/docs/ch05.html), stored in the_geom. Another column the_geom_envelope holds its envelope geometries. Each column is indexed using GIST. I experienced that doing a simple bounding box query results in different times, dependent on the queried column: '...WHERE the_geom && expr;' takes about 250ms, '...WHERE the_geom_envelope && expr;' takes about 20ms. ANALYZE says that GIST is used. As far as I know, the GIST makes use only of the bounding boxes of Geometry objects, which are equal for each object. How does it come that the query time using the_geom is more than *ten times slower* than the query time using the_geom_envelope?" which to me basically seems to be the same problem. any ideas? thanks in advance, stefan > -----Ursprüngliche Nachricht----- > Von: PostGIS Users Discussion <postgis-users@postgis.refractions.net> > Gesendet: 06.09.07 09:40:18 > An: PostGIS Users Discussion <postgis-users@postgis.refractions.net> > Betreff: Re: [postgis-users] question on gist performance > > On Wed, 2007-09-05 at 18:12 +0200, Stefan Zweig wrote: > > hi list, > > > > i have noticed a strange thing while selecting geometries from a table and > > using the && operator (gist): > > > > i have a table which holds all countrys of the world (somewhat more than > > 200) an their boundaries in column the_geom > > the boundaries have a quite good resolution, so the total amount of points > > is huge. > > for better performance (if low resolution is needed) i created a second > > column > > > > the_geom_1 as: > > UPDATE table SET the_geom_1=simplify(the_geom,0.01) > > > > which results in a less total amount of points compared to the original > > data. > > > > i have set up a gist index on both columns > > > > and now the strange thing: > > SELECT name FROM _g2965 WHERE the_geom && SetSrid('BOX(13.760675 > > 51.171104,15.231802 51.83737)'::box2d,4326); > > takes 297ms > > > > SELECT name FROM _g2965 WHERE the_geom_1 && SetSrid('BOX(13.760675 > > 51.171104,15.231802 51.83737)'::box2d,4326); > > takes 15ms > > > > actually i would have thought that the query using the gist index should > > take the same time on both columns, because gist index strategy is > > comparing bounding boxes of the geometries with the bounding box of my > > query, isn't it? > > can anybody give me a hint why there is such a massive time difference when > > selecting from different columns? > > > > > > thanks in advance, stefan > > > Hi Stefan, > > I suspect that this because of your UPDATE statement. For each row that > is touched during an UPDATE, a new entry is created in the GiST index - > for various transactional reasons, the old version cannot be deleted. > > For example: imagine I have your table with containing 200 countries of > the world. When first created, your heap/indices look like this: > > table: 200 rows > the_geom_gist: 200 rows > > Now you do your UPDATE table SET the_geom_1=simplify(the_geom,0.01): > this creates new versions of each row in both the heap and the index. > > table: 400 rows (200 live) > the_geom_gist: 400 rows (200 live) > > If you now build your second index: > > table: 400 rows (200 live) > the_geom_gist: 400 rows (200 live) > the_geom_1_gist: 200 rows > > The second index is smaller because the table only contained 200 live > rows, and so these were the only ones used to build the index. Now I see > from your schema that you've done this another two times which gives > this: > > table: 800 rows (200 live) > the_geom_gist: 800 rows (200 live) > the_geom_1_gist: 600 rows (200 live) > the_geom_2_gist: 400 rows (200 live) > the_geom_3_gist: 200 rows > > So as you can see the original index is 4 times the size of the original > which is why it takes longer to traverse. The quick solution would be to > rebuild all of the GiST indices on your table (if these tables are > fairly static you may wish to do a VACUUM FULL too). However, I would > probably create one table with just a primary key and geometry column > for each different resolution, and then perform a join onto your > original table based upon the primary key, dropping the geometry columns > from your original table. > > > HTH, > > Mark. > > -- > ILande - Open Source Consultancy > http://www.ilande.co.uk > > > _______________________________________________ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > _______________________________________________________________________ Jetzt neu! Schützen Sie Ihren PC mit McAfee und WEB.DE. 3 Monate kostenlos testen. http://www.pc-sicherheit.web.de/startseite/?mc=022220 _______________________________________________ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users