Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-21 Thread Lars Francke
 On a related note: There was a change between PostgreSQL 8.3 and 8.4
 which increased the default_statistics_target from 10 to 100 and its
 maximum from 1.000 to 10.000 which makes ANALYZE runs take longer but
 at least for me it helps a lot.

 Have you, too, noticed the 10% performance degradation (of 8.4 vs. 8.3) on
 full planet imports that prompted me to start this thread?

I'm sorry I never measured the time and I had to downgrade to 8.3 for
the time being for other reasons.

Lars

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-20 Thread Martijn van Oosterhout
On Sat, Sep 19, 2009 at 7:42 PM, Jochen Topf joc...@remote.org wrote:
 (I beleive it used to vacuum, but I don't immediately see it in the
 current source).

 I always disabled autovacuum for this kind of database. It tends to always
 come at the wrong time. Its better to do the vacuum once after an import
 run and never in between. Autovacuum is nice for a database where many
 different programs change data all the time. For a OSM Mapnik database where
 there is exactly one program changing data, it should do the vacuum itself.

I remember now. I removed the vacuum when I added support for the
applying of diffs. Quite simply:

1. Doing a vacuum immediately after the import isn't useful, there is
nothing to vacuum by definition.

2. Doing a vacuum after a diff is also useless, since the vacuum will
probably takes much much longer than applying the diff. (This I noted
when appliyng an empty diff took minutes on my machine). If you leave
it in you won't be able to apply diffs fast enough.

What you want is the vacuum to apply say once every few days or so,
which is about what autovacuum will do. What I do do is push it more
in the background so it can't take more than 10MB/s of disk bandwidth.
Given the size of the tables, you can imagine how long it takes, but
that doesn't really matter.

Have a nice day,
-- 
Martijn van Oosterhout klep...@gmail.com http://svana.org/kleptog/

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-20 Thread Lars Francke
 1. Doing a vacuum immediately after the import isn't useful, there is
 nothing to vacuum by definition.

While this is true an ANALYZE should still be done to initialize the statistics.

 2. Doing a vacuum after a diff is also useless, since the vacuum will
 probably takes much much longer than applying the diff. (This I noted
 when appliyng an empty diff took minutes on my machine). If you leave
 it in you won't be able to apply diffs fast enough.

I will have to disagree here (and agree with Jochen):
From the documentation[1]:
 1. To recover or reuse disk space occupied by updated or deleted rows.
Which in my opinion _only_ makes sense after a diff as it is the only
time anything is updated or deleted. Depending on the type of diffs
(minute, hour, daily) of course. I import the daily diffs and schedule
a VACUUM ANALYZE after each import.

 2. To update data statistics used by the PostgreSQL query planner.
Very important, too, depending on the type of queries you run. For me
it is a lot of work on the tags tables.
On a related note: There was a change between PostgreSQL 8.3 and 8.4
which increased the default_statistics_target from 10 to 100 and its
maximum from 1.000 to 10.000 which makes ANALYZE runs take longer but
at least for me it helps a lot.

Cheers,
Lars

[1]: http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-20 Thread Martijn van Oosterhout
On Sun, Sep 20, 2009 at 6:16 PM, Lars Francke lars.fran...@gmail.com wrote:
 1. Doing a vacuum immediately after the import isn't useful, there is
 nothing to vacuum by definition.

 While this is true an ANALYZE should still be done to initialize the 
 statistics.

Which it does, but ANALYZE is almost independant of the size of the
table. Certainly much faster than vacuum.

 2. Doing a vacuum after a diff is also useless, since the vacuum will
 probably takes much much longer than applying the diff. (This I noted
 when appliyng an empty diff took minutes on my machine). If you leave
 it in you won't be able to apply diffs fast enough.

 I will have to disagree here (and agree with Jochen):
 From the documentation[1]:

Sorry, I should have been clearer:

2. *Having osm2pgsql do* a vacuum after a diff is also useless, etc

 Which in my opinion _only_ makes sense after a diff as it is the only
 time anything is updated or deleted. Depending on the type of diffs
 (minute, hour, daily) of course. I import the daily diffs and schedule
 a VACUUM ANALYZE after each import.

Ofcourse, you need to do one eventually, but it's not a choice
osm2pgsql can make because it cannot know whether one is needed. It is
not approriate to do one all the time because of the time it takes.
The builtin autovacuum will trigger one appropriately (at least when
about 5% of rows have changed). If you decide to disable that then you
are responsible for your own vacuuming (don't forget the catalogs!).
But in all cases it's not osm2pgsql's responsibility.

 2. To update data statistics used by the PostgreSQL query planner.
 Very important, too, depending on the type of queries you run. For me
 it is a lot of work on the tags tables.

Again, autovacuum does this, otherwise it's your responsibility.
osm2pgsql is not responsible here.

Have a nice day,
-- 
Martijn van Oosterhout klep...@gmail.com http://svana.org/kleptog/

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-20 Thread Frederik Ramm
Hi,

Lars Francke wrote:
 On a related note: There was a change between PostgreSQL 8.3 and 8.4
 which increased the default_statistics_target from 10 to 100 and its
 maximum from 1.000 to 10.000 which makes ANALYZE runs take longer but
 at least for me it helps a lot.

Have you, too, noticed the 10% performance degradation (of 8.4 vs. 8.3) 
on full planet imports that prompted me to start this thread?

Bye
Frederik


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-20 Thread Martijn van Oosterhout
On Sun, Sep 20, 2009 at 7:43 PM, Frederik Ramm frede...@remote.org wrote:
 Hi,

 Lars Francke wrote:
 On a related note: There was a change between PostgreSQL 8.3 and 8.4
 which increased the default_statistics_target from 10 to 100 and its
 maximum from 1.000 to 10.000 which makes ANALYZE runs take longer but
 at least for me it helps a lot.

 Have you, too, noticed the 10% performance degradation (of 8.4 vs. 8.3)
 on full planet imports that prompted me to start this thread?

Just a few ideas:

1. If you list the ctid for the first few rows, perhaps you can spot
if there are maybe more tuples fitting in a page?

2. Is the data for the arrays being TOASTed (pushed out to a seperate
table). It's possible the compression threshold has changed meaning
they get compressed in 8.4, which saves space but costs compression
time.  To test this you need to find an array with lots of entries (a
few hundred at least). You can set the storage options manually if you
want to test.

3. In 8.4 the VACUUM should be a lot faster (maybe not first time, but
subsequent runs) due to the visibility map. Could you test this?

4. As pointed out, the statistics target changed, might be worth
measuring the effect of that explicitly (either manually, of using the
log_statement_duration option).

5. The figures you paste for table size, are they including the TOAST
tables? (as in pg_relation_size vs pg_total_relation_size).

Hope this helps,
-- 
Martijn van Oosterhout klep...@gmail.com http://svana.org/kleptog/

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-19 Thread Martijn van Oosterhout
On Wed, Sep 16, 2009 at 9:39 PM, Martin Lesser ml-osm-...@bettercom.de wrote:
 Can anybody confirm this measurement, or even suggest what the reason
 might be?

 Another pointer: How many VACUUMs does osm2pgsql trigger during import
 and filling the own tables?

None, AFAICS from the source. That's what autovacuum is for. It does
analyse each table.

(I beleive it used to vacuum, but I don't immediately see it in the
current source).

If you want timestamps, turn on log_min_duration_statement on your
server and check the logs for long running statements.

Have a nice day,
-- 
Martijn van Oosterhout klep...@gmail.com http://svana.org/kleptog/

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-19 Thread Jochen Topf
On Sat, Sep 19, 2009 at 07:12:14PM +0200, Martijn van Oosterhout wrote:
 On Wed, Sep 16, 2009 at 9:39 PM, Martin Lesser ml-osm-...@bettercom.de 
 wrote:
  Can anybody confirm this measurement, or even suggest what the reason
  might be?
 
  Another pointer: How many VACUUMs does osm2pgsql trigger during import
  and filling the own tables?
 
 None, AFAICS from the source. That's what autovacuum is for. It does
 analyse each table.
 
 (I beleive it used to vacuum, but I don't immediately see it in the
 current source).

I always disabled autovacuum for this kind of database. It tends to always
come at the wrong time. Its better to do the vacuum once after an import
run and never in between. Autovacuum is nice for a database where many
different programs change data all the time. For a OSM Mapnik database where
there is exactly one program changing data, it should do the vacuum itself.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-16 Thread Martin Lesser
Frederik Ramm frede...@remote.org writes:

 I did a full (osm2pgsql) planet import on a standard Ubuntu Jaunty 
 system and it took 1794 minutes. Then I upgraded to Postgres 8.4 
 (backported from Karmic) plus PostGIS 1.4 (home-built package), and 
 re-tried the import: 2028 minutes (that's 13% performance loss).

 Can anybody confirm this measurement, or even suggest what the reason 
 might be?

Another pointer: How many VACUUMs does osm2pgsql trigger during import
and filling the own tables?

I also registered some slowing down with my own application when it
comes to VACUUM. And I also read some mailings about this issue on the
PG-MLs. AFAIR osm2pgsql does not timestamp its steps so it is probably a
little difficult to hunt the real timeconsumer down.

HTH, Martin

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-15 Thread Frederik Ramm
Hi,

(taking this back onto dev)

Jon Burgess wrote:
 It sounds odd that they would be quite so wrong. The osm2pgsql code
 performs an ANALYZE as part of the processing. 

Something must be wrong there then. The osm2pgsql import with 8.3 has 
now completed, again needing 1800 minutes compared to the 2000 needed by 
the 8.4 version. The pg_class table, again, listed the nodes as having 
relpages=0. Then I did a simple analyze which took a minute or two, 
and afterwards I got sensible results:

For 8.3 with osm2pgsql 0.67, slim mode and -l:

  relname  |size| reltuples | relpages
--++---+--
  planet_osm_ways_nodes| 20 GB  |  32636734 |  2577003
  planet_osm_nodes | 18 GB  | 420728864 |  2333748
  planet_osm_nodes_pkey| 9007 MB| 420728864 |  1152902
  planet_osm_line  | 7105 MB|  27390248 |   909401
  planet_osm_ways  | 6373 MB|  32636734 |   815762
  planet_osm_line_index| 1613 MB|  27390248 |   206460
  planet_osm_polygon   | 1248 MB|   4276819 |   159700
  planet_osm_ways_pkey | 1002 MB|  32636734 |   128192
  planet_osm_roads | 800 MB |   2496160 |   102457
  planet_osm_point | 615 MB |   7954021 |78728
  planet_osm_line_pkey | 587 MB |  27390248 |75093
  planet_osm_point_index   | 464 MB |   7954021 |59423
  planet_osm_polygon_index | 260 MB |   4276819 |7
  planet_osm_point_pkey| 171 MB |   7954021 |21825
  planet_osm_roads_index   | 151 MB |   2496160 |19373
  planet_osm_ways_idx  | 113 MB | 0 |14499
  planet_osm_polygon_pkey  | 92 MB  |   4276819 |11735
  planet_osm_rels_parts| 85 MB  |171435 |10904
  planet_osm_rels  | 84 MB  |171435 |10727
  planet_osm_roads_pkey| 54 MB  |   2496160 | 6929
  planet_osm_rels_pkey | 3784 kB|171435 |  473
  planet_osm_rels_idx  | 8192 bytes | 0 |1
(22 rows)

For 8.4 with the same parameters (this time after analyze):

  relname  |size| reltuples | relpages
--++---+--
  planet_osm_ways_nodes| 18 GB  |  32352920 |  2406878
  planet_osm_nodes | 18 GB  | 420052384 |  2333748
  planet_osm_nodes_pkey| 9007 MB| 420052384 |  1152902
  planet_osm_line  | 7104 MB|  27426062 |   909316
  planet_osm_ways  | 6364 MB|  32352920 |   814552
  planet_osm_line_index| 1607 MB|  27426062 |   205720
  planet_osm_polygon   | 1247 MB|   4286884 |   159647
  planet_osm_ways_pkey | 1001 MB|  32352920 |   128190
  planet_osm_roads | 800 MB |   2525946 |   102446
  planet_osm_point | 615 MB |   7954690 |78728
  planet_osm_line_pkey | 587 MB |  27426062 |75093
  planet_osm_point_index   | 464 MB |   7954690 |59423
  planet_osm_polygon_index | 260 MB |   4286884 |7
  planet_osm_point_pkey| 171 MB |   7954690 |21825
  planet_osm_roads_index   | 151 MB |   2525946 |19373
  planet_osm_ways_idx  | 113 MB | 0 |14499
  planet_osm_polygon_pkey  | 92 MB  |   4286884 |11735
  planet_osm_rels  | 84 MB  |171433 |10727
  planet_osm_rels_parts| 67 MB  |171433 | 8545
  planet_osm_roads_pkey| 54 MB  |   2525946 | 6929
  planet_osm_rels_pkey | 3784 kB|171433 |  473
  planet_osm_rels_idx  | 8192 bytes | 0 |1
(22 rows)

Now the two tables create more questions than they answer. Why does the 
node table under 8.4 seem to have less entries than under 8.3 when both 
have been populated with the same planet file?

osm=# select max(id) from planet_osm_nodes;
 max
---
  480684210

same for both databases.

osm=# select count(*) from planet_osm_nodes;
count
---
  420480680

also the same for both. Which means that 8.3 over- and 8.4 
under-estimates the actual number in the pg_class listing.

Maybe the 18 GB versus 20 GB is just an estimate as well because I 
cannot seem to find a difference in node storage size:

osm=# select id,pg_column_size(planet_osm_nodes.*) from planet_osm_nodes 
where  id in (1,2,3,4,5,49786136,49786237,50052568,51308246,52779935);
 id| pg_column_size
--+
 2 |109
 3 | 36
 4 | 36
 5 | 36
  49786136 |236
  49786237 |240
  50052568 |276
  51308246 |208
  52779935 |204

Same for 8.3 and 8.4.

osm=# select sum(pg_column_size(planet_osm_nodes.*)) from 
planet_osm_nodes where id  4;
 sum

  2622633489

Same too.

But then the disk usage for 8.3:

# du  8.3/main
328 8.3/main/global
4   

Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-15 Thread Tom Hughes
On 15/09/09 10:42, Frederik Ramm wrote:

 Now the two tables create more questions than they answer. Why does the
 node table under 8.4 seem to have less entries than under 8.3 when both
 have been populated with the same planet file?

An analyse without vacuum only generates an estimate for reltuples. See 
the doco at http://www.postgresql.org/docs/8.4/static/planner-stats.html 
which says:

For efficiency reasons, reltuples and relpages are not updated 
on-the-fly, and so they usually contain somewhat out-of-date values. 
They are updated by VACUUM, ANALYZE, and a few DDL commands such as 
CREATE INDEX. A stand-alone ANALYZE, that is one not part of VACUUM, 
generates an approximate reltuples value since it does not read every 
row of the table. The planner will scale the values it finds in pg_class 
to match the current physical table size, thus obtaining a closer 
approximation.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-13 Thread Jon Burgess
On Sun, 2009-09-13 at 19:56 +0200, Frederik Ramm wrote:
 Hi,
 
 I did a full (osm2pgsql) planet import on a standard Ubuntu Jaunty 
 system and it took 1794 minutes. Then I upgraded to Postgres 8.4 
 (backported from Karmic) plus PostGIS 1.4 (home-built package), and 
 re-tried the import: 2028 minutes (that's 13% performance loss).
 
 Can anybody confirm this measurement, or even suggest what the reason 
 might be? I used the same postgres.conf for both databases. I noted that 
 Postgres 8.4 produced 25% more data in the pg_xlog directory than 8.3 
 did, could that hint at the reason for the difference?

Try running the query below and compare the sizes returned for the
tables  indexes on the two databases.

gis= SELECT relname, pg_size_pretty(relpages::bigint * 8 *1024) AS
size, CASE WHEN relkind = 't' THEN (SELECT pgd.relname FROM pg_class pgd
WHERE pgd.relfilenode::text = SUBSTRING(pg.relname FROM 10)) ELSE
(SELECT pgc.relname FROM pg_class pgc WHERE pg.reltoastrelid =
pgc.relfilenode) END AS refrelname, relfilenode, relkind,
reltuples::bigint, relpages FROM pg_class pg ORDER BY relpages DESC;


The largest table is normally planet_osm_nodes. With the latest
osm2pgsql code  postgresql-8.3.7 the size of each row is typically 36
bytes (more if the node has tags):

gis= select pg_column_size(planet_osm_nodes.*) from planet_osm_nodes
limit 10;
 pg_column_size

109
 36
 36
 36
 36
 36
 36
 36
101
 36
(10 rows)

This is with the latest osm2pgsql code which stores the node lat/long as
32 bit ints. This is a recent change, until about a month ago it was
storing them as doubles and needed 48 bytes per row.

Jon



___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-13 Thread Frederik Ramm
Hi,

Jon Burgess wrote:
 Try running the query below and compare the sizes returned for the
 tables  indexes on the two databases.

Hm, it seems I have overlooked the fact that I also updated osm2pgsql 
when I did the update. The old database was created with 0.66. But it 
already had integers in the node lat/lon columns. Funny thing is, your 
query does not list the nodes table although it definitely is there and 
has data:

Postgres 8.4/PostGIS 1.4/osm2pgsql 0.67:

   relname   |size| reltuples | relpages
++---+--
  planet_osm_ways_nodes  | 18 GB  | 455575744 |  2406878
  planet_osm_line| 7104 MB|  27410148 |   909316
  planet_osm_ways| 6364 MB|  32382898 |   814552
  planet_osm_line_index  | 1607 MB|  27410148 |   205720
  planet_osm_polygon | 1247 MB|   4274538 |   159647
  planet_osm_roads   | 800 MB |   2527240 |   102446
  planet_osm_point   | 615 MB |   7963185 |78728
  planet_osm_line_pkey   | 587 MB |  27410148 |75093
  planet_osm_point_index | 464 MB |   7963185 |59423
  planet_osm_polygon_index   | 260 MB |   4274538 |7
  planet_osm_point_pkey  | 171 MB |   7963185 |21825
  planet_osm_roads_index | 151 MB |   2527240 |19373
  planet_osm_polygon_pkey| 92 MB  |   4274538 |11735
  planet_osm_rels| 84 MB  |171433 |10727
  planet_osm_rels_parts  | 67 MB  |   2058555 | 8545
  planet_osm_roads_pkey  | 54 MB  |   2527240 | 6929

Postgres 8.3/PostGIS 1.3.3/osm2pgsql 0.66:

   relname  |size| reltuples | relpages
---++---+--
  planet_osm_ways_nodes | 20 GB  | 455575744 |  2577003
  planet_osm_line   | 7105 MB|  27520596 |   909401
  planet_osm_ways   | 6373 MB|  32382898 |   815762
  planet_osm_line_index | 1613 MB|  27520596 |   206460
  planet_osm_polygon| 1248 MB|   4323558 |   159700
  planet_osm_roads  | 800 MB |   2525497 |   102457
  planet_osm_point  | 615 MB |   7956908 |78728
  planet_osm_line_pkey  | 587 MB |  27520596 |75093
  planet_osm_point_index| 464 MB |   7956908 |59423
  planet_osm_polygon_index  | 260 MB |   4323558 |7
  planet_osm_point_pkey | 171 MB |   7956908 |21825
  planet_osm_roads_index| 151 MB |   2525497 |19373
  planet_osm_polygon_pkey   | 92 MB  |   4323558 |11735
  planet_osm_rels_parts | 85 MB  |   2058555 |10904
  planet_osm_rels   | 84 MB  |171433 |10727
  planet_osm_roads_pkey | 54 MB  |   2525497 | 6929


I double checked that lat/lon is stored as integers on both (36 bytes 
per record is the norm on both).

I'll re-run the full import with the exact same osm2pgsql version to 
have perfect conditions for comparison. - Maybe osm2pgsql could create a 
table where it records its version number and possibly also information 
about the style file used, so one can track these things more easily.

I'll report back in about 2000 minutes then ;-)

Bye
Frederik


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-13 Thread Jon Burgess
On Sun, 2009-09-13 at 22:45 +0200, Frederik Ramm wrote:
 Hi,
 
 Jon Burgess wrote:
  Try running the query below and compare the sizes returned for the
  tables  indexes on the two databases.
 
 Hm, it seems I have overlooked the fact that I also updated osm2pgsql 
 when I did the update. The old database was created with 0.66. But it 
 already had integers in the node lat/lon columns. Funny thing is, your 
 query does not list the nodes table although it definitely is there and 
 has data:
 
 Postgres 8.4/PostGIS 1.4/osm2pgsql 0.67:
 
relname   |size| reltuples | relpages
 ++---+--
   planet_osm_ways_nodes  | 18 GB  | 455575744 |  2406878
   planet_osm_line| 7104 MB|  27410148 |   909316
   planet_osm_ways| 6364 MB|  32382898 |   814552
   planet_osm_line_index  | 1607 MB|  27410148 |   205720
   planet_osm_polygon | 1247 MB|   4274538 |   159647
   planet_osm_roads   | 800 MB |   2527240 |   102446
   planet_osm_point   | 615 MB |   7963185 |78728
   planet_osm_line_pkey   | 587 MB |  27410148 |75093
   planet_osm_point_index | 464 MB |   7963185 |59423
   planet_osm_polygon_index   | 260 MB |   4274538 |7
   planet_osm_point_pkey  | 171 MB |   7963185 |21825
   planet_osm_roads_index | 151 MB |   2527240 |19373
   planet_osm_polygon_pkey| 92 MB  |   4274538 |11735
   planet_osm_rels| 84 MB  |171433 |10727
   planet_osm_rels_parts  | 67 MB  |   2058555 | 8545
   planet_osm_roads_pkey  | 54 MB  |   2527240 | 6929
 
 Postgres 8.3/PostGIS 1.3.3/osm2pgsql 0.66:
 
relname  |size| reltuples | relpages
 ---++---+--
   planet_osm_ways_nodes | 20 GB  | 455575744 |  2577003
   planet_osm_line   | 7105 MB|  27520596 |   909401
   planet_osm_ways   | 6373 MB|  32382898 |   815762
   planet_osm_line_index | 1613 MB|  27520596 |   206460
   planet_osm_polygon| 1248 MB|   4323558 |   159700
   planet_osm_roads  | 800 MB |   2525497 |   102457
   planet_osm_point  | 615 MB |   7956908 |78728
   planet_osm_line_pkey  | 587 MB |  27520596 |75093
   planet_osm_point_index| 464 MB |   7956908 |59423
   planet_osm_polygon_index  | 260 MB |   4323558 |7
   planet_osm_point_pkey | 171 MB |   7956908 |21825
   planet_osm_roads_index| 151 MB |   2525497 |19373
   planet_osm_polygon_pkey   | 92 MB  |   4323558 |11735
   planet_osm_rels_parts | 85 MB  |   2058555 |10904
   planet_osm_rels   | 84 MB  |171433 |10727
   planet_osm_roads_pkey | 54 MB  |   2525497 | 6929
 
 
 I double checked that lat/lon is stored as integers on both (36 bytes 
 per record is the norm on both).
 
 I'll re-run the full import with the exact same osm2pgsql version to 
 have perfect conditions for comparison. - Maybe osm2pgsql could create a 
 table where it records its version number and possibly also information 
 about the style file used, so one can track these things more easily.
 
 I'll report back in about 2000 minutes then ;-)

Do you have the _int.sql loaded?

Did you move the nodes to a different table space as part of the
performance tests?

This is what I get...

gis= SELECT relname, pg_size_pretty(relpages::bigint * 8 *1024) AS
size, reltuples::bigint, relpages FROM pg_class pg where relname like
'planet_osm_%' ORDER BY relpages DESC; 

 relname  |  size   | reltuples | relpages
--+-+---+--
 planet_osm_ways_nodes| 21 GB   |  32934624 |  2738983
 planet_osm_nodes | 18 GB   | 421706720 |  2333748
 planet_osm_nodes_pkey| 9007 MB | 421706720 |  1152902
 planet_osm_line  | 8146 MB |  27697808 |  1042707
 planet_osm_ways  | 6814 MB |  32934624 |   872156
 planet_osm_line_index| 1780 MB |  27703332 |   227880
 planet_osm_polygon   | 1363 MB |   4392675 |   174451
 planet_osm_ways_pkey | 1178 MB |  32934624 |   150754
 planet_osm_roads | 931 MB  |   2553120 |   119107
 planet_osm_line_pkey | 706 MB  |  27697808 |90312
 planet_osm_point | 615 MB  |   7949091 |78727
 planet_osm_point_index   | 464 MB  |   7949091 |59434
 planet_osm_polygon_index | 271 MB  |   4392675 |34681
 planet_osm_point_pkey| 171 MB  |   7949091 |21825
 planet_osm_ways_idx  | 170 MB  | 

Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-13 Thread Frederik Ramm
Hi,

I think I made one more blunder when doing the comparison: I did one 
of the imports - the slower one actually! - with the -l flag, the other 
one without. (Both were with --slim.) But by now there are too many 
variables in the game and it is impossible to tell what is to blame for 
the longer running time. I'm now running a new import on postgres 8.3 
with osm2pgsql 0.67 and the -l switch on so that I have two test cases 
which are identical except in the Postgres/PostGIS version.

Jon Burgess wrote:
 Do you have the _int.sql loaded?

Yes, in both versions; however the _int that comes with PostGIS 1.4 
seems to be slightly modified, with new paramters added to 
ginint4_queryextract and ginint4_consistent. So it is not impossible 
that it performs worse.

 Did you move the nodes to a different table space as part of the
 performance tests?

No, *those* performance tests were done on a different machine (with 
identical hardware). This is a pristine installation.

 gis= SELECT relname, pg_size_pretty(relpages::bigint * 8 *1024) AS
 size, reltuples::bigint, relpages FROM pg_class pg where relname like
 'planet_osm_%' ORDER BY relpages DESC; 
 
  relname  |  size   | reltuples | relpages
 --+-+---+--
  planet_osm_ways_nodes| 21 GB   |  32934624 |  2738983
  planet_osm_nodes | 18 GB   | 421706720 |  2333748
  planet_osm_nodes_pkey| 9007 MB | 421706720 |  1152902
[...]
 (22 rows)

Something must be definitely broken here - my ways_nodes table lists 455 
million entries, yours only 32 million. As if the size counters were 
mixed up in some way, and all my nodes counted towards my ways_nodes 
number or something.

I too get 22 rows but the node tables seem to be empty:

[...]
  planet_osm_nodes_pkey| 8192 bytes | 0 |1
  planet_osm_ways_idx  | 8192 bytes | 0 |1
  planet_osm_ways_pkey | 8192 bytes | 0 |1
  planet_osm_rels_idx  | 8192 bytes | 0 |1
  planet_osm_rels_pkey | 8192 bytes | 0 |1
  planet_osm_nodes | 0 bytes| 0 |0

Still, the data is there:

osm=# select max(id) from planet_osm_nodes;
 max
---
  480684210
(1 row)

osm=# select * from planet_osm_nodes limit 10;
  id |lat|lon| tags
+---+---+---
   2 | 501359444 |  83013034 | {place,village,name,Wiesbaden-Naurod}
   3 | 515284576 |  -1486064 |
   4 | 515220583 |  -1457854 |
   5 | 515288506 |  -1464925 |
   6 | 515288619 |  -1465242 |
   7 | 515287437 |  -148 |
   8 | 515296860 |  -1464353 |
   9 | 515284843 |  -1486444 |
  10 | 515289383 |  -1413791 |
  11 |  40840331 | 735129514 | {place,village,name,Embudu}

I would say something is broken in my Postgres installation but these 
results are exactly the same whether I start up the 8.3 or 8.4 version.

Selecting reltablespace from the pg_class table yields 0 for all 
planet_osm tables.

Bye
Frederik


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Postgres 8.4/PostGIS 1.4 performance

2009-09-13 Thread Martin Lesser
Frederik Ramm frede...@remote.org writes:

 Jon Burgess wrote:
 Do you have the _int.sql loaded?

 Yes, in both versions; however the _int that comes with PostGIS 1.4 
 seems to be slightly modified,

Has postgis its own _int.sql? AFAIK _int.sql is part of
postgres-contrib, in the postgis src I did not find it.

And it looks like that one may run different versions of postgres (here:
8.4.0) and pg-contrib (8.3.7) because contrib mostly are only
additional functions.

So to make things a little more complicated you could also re-run your
tests with a changed version of postgresql-contrib :-)

Martin

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev