Re: [OSM-dev] osm2pgsql diff imports benchmarks
On 3/18/2012 4:07 PM, Kai Krueger wrote: sylvain letuffe wrote It turned out that the index on the pending field wasn't used at all and led to a full scan of the planet_osm_ways table, increasing noticeably the diff import. I think I have seen that one before as well. If I remember correctly, it was sufficient to run a simple analyze on the table. No need to do a re-index. It probably thinks a too large proportion of the ways are pending and therefore decides it is best to do a seq scan. Possibly because before the going over pending ways stage about 50% of ways are pending during the import. Osm2pgsql should be doing a analyze at the end of the import though, so I am not sure why this is happening. I have a planet database that has been imported and has been attempting to catch up for a while now (3 days to go) and it's been showing extreme data throughput spikes when entering the pending ways phase, so I decided to check this out. An analyze command show that a sequential scan is, in fact, being done: gis= explain select id from planet_osm_ways where pending; Seq Scan on planet_osm_ways (cost=0.00..5253263.74 rows=65592587 width=4) Filter: pending I'm not sure why, but I got an error attempting to analyze verbose planet_osm_ways as www-data, but it seems to execute as the postgres user: gis= analyze verbose planet_osm_ways; WARNING: skipping planet_osm_ways --- only table or database owner can analyze it The analyze output really didn't tell me much: gis=# analyze verbose planet_osm_ways; INFO: analyzing public.planet_osm_ways INFO: planet_osm_ways: scanned 3 of 3941416 pages, containing 983808 live rows and 589826 dead rows; 3 rows in sample, 131170601 estimated total rows But a subsequent explain certainly looks promising: gis= explain select id from planet_osm_ways where pending; Index Scan using planet_osm_ways_idx on planet_osm_ways (cost=0.00..916513.78 rows=43724 width=4) Now to wait for my next 12 hour catchup chunk to get to the pending ways phase,probably in about 6 hours or so. Lynn (D) - KJ4ERJ PS. If anyone gets this far, are those dead rows a function of my running with autovaccuum off? And should I be doing a periodic vaccuum to clean out accumulated cruft? I was thinking that not too much in the osm planet would be deleting, but maybe my assumption is incorrect? ___ dev mailing list dev@openstreetmap.org http://lists.openstreetmap.org/listinfo/dev
Re: [OSM-dev] osm2pgsql diff imports benchmarks
Am 19.03.2012 12:37, schrieb Lynn W. Deffenbaugh (Mr): And should I be doing a periodic vaccuum to clean out accumulated cruft? I was thinking that not too much in the osm planet would be deleting, but maybe my assumption is incorrect? Yes and yes. osm2pgsql does not update rows, it deletes them and inserts the new version. You really need to run vacuum at least once per week, probably more right now. Simon ___ dev mailing list dev@openstreetmap.org http://lists.openstreetmap.org/listinfo/dev
Re: [OSM-dev] osm2pgsql diff imports benchmarks
gis= explain select id from planet_osm_ways where pending; Index Scan using planet_osm_ways_idx on planet_osm_ways (cost=0.00..916513.78 rows=43724 width=4) This is what I've got after a re-index : gis=# explain select id from planet_osm_ways where pending; QUERY PLAN -- Index Scan using planet_osm_ways_idx on planet_osm_ways (cost=0.00..1157.47 rows=1 width=4) (1 ligne) and that saves me around ~20 seconds per minute diff -- sly qui suis-je : http://sly.letuffe.org email perso : sylvain chez letuffe un point org ___ dev mailing list dev@openstreetmap.org http://lists.openstreetmap.org/listinfo/dev
[OSM-dev] osm2pgsql diff imports benchmarks
Hi, In the process of trying to speed up diff imports with osm2pgsql, I'm searching for a typical output of osm2pgsql importing one minute. I know of : http://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks But beside the one I just wrote, there are no output of what is expected for a one minute (or few minutes) diff import. Could someone operating and maintaining a planet osm2pgsql database be so kind as to provide me one (or more in order to have average metrics) ? On a side note, while loging long queries I ran into a weird long one about retrieving pending ways during diff import. It turned out that the index on the pending field wasn't used at all and led to a full scan of the planet_osm_ways table, increasing noticeably the diff import. I don't feel I have imported data in a weird way and don't know if others are affected by that, but you can check if you are affected by runing : select id from planet_osm_way where pending which should run in less than a second if index is well used. To solve the problem post import, we did re-index and ran analyse reindex index planet_osm_ways_idx; analyse; -- sly (sylvain letuffe) ___ dev mailing list dev@openstreetmap.org http://lists.openstreetmap.org/listinfo/dev