On 3/18/2012 4:07 PM, Kai Krueger wrote:
sylvain letuffe wrote
It turned out that the index on the "pending" field wasn't used at all and
led
to a full scan of the planet_osm_ways table, increasing noticeably the
diff
import.

I think I have seen that one before as well. If I remember correctly, it was
sufficient to run a simple analyze on the table. No need to do a re-index.
It probably thinks a too large proportion of the ways are "pending" and
therefore decides it is best to do a seq scan. Possibly because before the
"going over pending ways" stage about 50% of ways are pending during the
import. Osm2pgsql should be doing a analyze at the end of the import though,
so I am not sure why this is happening.

I have a planet database that has been imported and has been attempting to catch up for a while now (3 days to go) and it's been showing extreme data throughput spikes when entering the "pending ways" phase, so I decided to check this out. An analyze command show that a sequential scan is, in fact, being done:


gis=> explain select id from planet_osm_ways where pending;
Seq Scan on planet_osm_ways (cost=0.00..5253263.74 rows=65592587 width=4)
   Filter: pending


I'm not sure why, but I got an error attempting to "analyze verbose planet_osm_ways" as www-data, but it seems to execute as the postgres user:

gis=> analyze verbose planet_osm_ways;
WARNING: skipping "planet_osm_ways" --- only table or database owner can analyze it

The analyze output really didn't tell me much:

gis=# analyze verbose planet_osm_ways;
INFO:  analyzing "public.planet_osm_ways"
INFO: "planet_osm_ways": scanned 30000 of 3941416 pages, containing 983808 live rows and 589826 dead rows; 30000 rows in sample, 131170601 estimated total rows

But a subsequent explain certainly looks promising:

gis=> explain select id from planet_osm_ways where pending;
Index Scan using planet_osm_ways_idx on planet_osm_ways (cost=0.00..916513.78 rows=43724 width=4)

Now to wait for my next 12 hour catchup chunk to get to the pending ways phase,probably in about 6 hours or so.

Lynn (D) - KJ4ERJ

PS. If anyone gets this far, are those "dead rows" a function of my running with autovaccuum off? And should I be doing a periodic vaccuum to clean out accumulated cruft? I was thinking that not too much in the osm planet would be deleting, but maybe my assumption is incorrect?


_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev

Reply via email to