Re: [OSM-dev] osm2pgsql hstore support

2010-03-20 Thread Sven Geggus
Sven Geggus  wrote:

> Currently the patch has not been tested in conjunction with
> incemental database updates.

OK, in the meantime I can confirm, that incemental database updates
work fine. I just fixed a problem with polygons (SVN r20565). When
using hstores we need a way to mark polygon tags as such but without
adding an additional column for this tag.

This is now possible using the phstore flag in the stylefile. In
non-hstore-mode this flag is invalid.

Regards

Sven

-- 
The main thing to note is that when you choose open source you don't
get a Windows operating system.
  (from http://www.dell.com/ubuntu)
/me is gig...@ircnet, http://sven.gegg.us/ on the Web

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] osm2pgsql hstore support

2010-03-16 Thread Peter Körner
Very cool this. What's the speed to access the hstore column, compared 
to the generic columns we now have?


I did some benchmarks and the answer is - sometimes :)

Without any indexes using a seqscan hstore is slower (see 
hstore-vs-columns-noindex.txt)


hstore does not support (at least the docs does not explicitly name) 
equality-indexes but they can handle has-key indexes using GIN

or GiST [1]. You can see my results in hstore-vs-columns-index.txt:

A "not .. is null" or "(tags ? '...')" query is faster on the hstore 
table because it's using the index.


The ".. = '..'" or "(tags->'..'= '..')" query is faster on the btree 
index of the regular column.


All tests have been made on a berlin extract and thus on a relatively 
small dataset. The size of the index is also interesting - the 
hstore-index (on all possible tags!) is only 4.5 times larger then the 
index on the amenity column (only one tag):


hstore-rendering=> SELECT relname, reltuples, pg_size_pretty(relpages * 
8 * 1024) as size FROM pg_class ORDER BY relname;


 berlin_osm_dual_point_amenity |   50284 | 1192 kB
 berlin_osm_dual_point_tags|   50284 | 5440 kB




In a conclusion I'd say that the performance of the hstore is good 
enough for rendering, when a lot of the queries are "(tags ? '...')" 
queries and the filtering is later done by mapnik. This is ok for 
single-machine setups but may fail if the db-server and the rendering 
server(s) are located on different machines, as more data needs to be 
transferred via LAN (Florian may be able to tell sth. about this).


Where it perfectly fit's in, is the rendering of special maps like the 
translates maps of wikimedia or a overlay-map of all surveillance 
cameras or such things.


Thanks to Sven Geggus, who developed and checked in this patch for 
osm2pgsql.


Peter



[1] 
hstore-rendering=> explain analyze select count(*) from berlin_osm_dual_point 
where not amenity is null;
   QUERY PLAN
-
 Aggregate  (cost=1544.99..1545.00 rows=1 width=0) (actual time=24.222..24.223 
rows=1 loops=1)
   ->  Seq Scan on berlin_osm_dual_point  (cost=0.00..1516.84 rows=11257 
width=0) (actual time=0.022..20.606 rows=11528 loops=1)
 Filter: (NOT (amenity IS NULL))
 Total runtime: 24.330 ms
(4 Zeilen)

hstore-rendering=> explain analyze select count(*) from berlin_osm_dual_point 
where (tags ? 'amenity');
  QUERY PLAN
--
 Aggregate  (cost=1642.68..1642.69 rows=1 width=0) (actual time=55.125..55.125 
rows=1 loops=1)
   ->  Seq Scan on berlin_osm_dual_point  (cost=0.00..1642.55 rows=50 width=0) 
(actual time=0.036..52.040 rows=11528 loops=1)
 Filter: (tags ? 'amenity'::text)
 Total runtime: 55.229 ms
(4 Zeilen)


hstore-rendering=> explain analyze select count(*) from berlin_osm_dual_point 
where amenity = 'restaurant';
  QUERY PLAN
---
 Aggregate  (cost=1645.41..1645.42 rows=1 width=0) (actual time=24.642..24.643 
rows=1 loops=1)
   ->  Seq Scan on berlin_osm_dual_point  (cost=0.00..1642.55 rows=1143 
width=0) (actual time=0.062..24.153 rows=1221 loops=1)
 Filter: (amenity = 'restaurant'::text)
 Total runtime: 24.753 ms
(4 Zeilen)


hstore-rendering=> explain analyze select count(*) from berlin_osm_dual_point 
where (tags->'amenity' = 'restaurant');
  QUERY PLAN
--
 Aggregate  (cost=1768.89..1768.90 rows=1 width=0) (actual time=61.786..61.786 
rows=1 loops=1)
   ->  Seq Scan on berlin_osm_dual_point  (cost=0.00..1768.26 rows=251 width=0) 
(actual time=0.161..61.329 rows=1221 loops=1)
 Filter: ((tags -> 'amenity'::text) = 'restaurant'::text)
 Total runtime: 61.892 ms
(4 Zeilen)
hstore-rendering=> create index berlin_osm_dual_point_tags on 
berlin_osm_dual_point using gin (tags);
hstore-rendering=> alter table berlin_osm_dual_point alter column tags set 
statistics 250;
hstore-rendering=> analyze berlin_osm_dual_point;
hstore-rendering=>
hstore-rendering=> create index berlin_osm_dual_point_amenity on 
berlin_osm_dual_point using btree (amenity);
hstore-rendering=> alter table berlin_osm_dual_point alter column amenity set 
statistics 250;
hstore-rendering=> analyze berlin_osm_dual_point;
hstore-rendering=>

hstore-rendering=> explain analyze select count(*) from berlin_osm_dual_point 
w

Re: [OSM-dev] osm2pgsql hstore support

2010-03-16 Thread Jukka Rahkonen
Sven Geggus  fuchsschwanzdomain.de> writes:

> 
> Hello,
> 
> I just commited a patch for osm2pgsl for optional generation of a
> hstore column (hstore new).

Hi,

Sounds great. I had been thinkin about splitting geometries and tags to a pairs
of tables (one with columns geometry and osm_id, another columns osm_id, key and
value) but the hstore option seems to offer also an indexed access to all the
tags. Obviously solution is not portable to Spatialite database, though.

Any hope that somebody would compile a new osm2pgsql Windows version including
the hstore option?

-Jukka Rahkonen-


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] osm2pgsql hstore support

2010-03-14 Thread Peter Körner
[1> On the one hand, this has the potential to greatly increase the size of
> the db, as you will now get all kinds of tags you're never interested
> in. On the other hand, one can now quickly use keys one never used
> before, without the need to reimport the whole db.

When you just want to render a specific type of map you won't need it 
and there's no reason in not using it. But if you're setting up a 
dev-server like the wikimedia toolserver or the osm dev server, you got 
a whole bunch of people with a lot of ideas.

Atm you'll need to tell them "wait a month or two, i'll add your column 
at the next re-import" Using hstore they just can start working on 
whatever they want.

Another use case lies in "under construction" tags like those used by 
Lulu-Ann for the "OSM for the blind"-Project [1].

Peter

[1] http://wiki.openstreetmap.org/wiki/OSM_for_the_blind

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] osm2pgsql hstore support

2010-03-14 Thread Sven Geggus
Lennard  wrote:

> Very cool this. What's the speed to access the hstore column, compared 
> to the generic columns we now have?

I did not yet try to measure this.

> On the one hand, this has the potential to greatly increase the size of 
> the db, as you will now get all kinds of tags you're never interested 
> in.

Shure, but unwanted tags can already get removed using the delete flag in
the stylefile. While testing I already removed note and source tags
while importing data.

> On the other hand, one can now quickly use keys one never used 
> before, without the need to reimport the whole db.

Thats the thing I was up to: Enabling the generation of special
interest maps with tags somebody invented just recently.

Sven

-- 
Um Kontrolle Ihres Kontos wiederzugewinnen, klicken Sie bitte auf das
Verbindungsgebrüll. (aus einer Ebay fishing Mail)

/me is gig...@ircnet, http://sven.gegg.us/ on the Web

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] osm2pgsql hstore support

2010-03-14 Thread Lennard
Sven Geggus wrote:

> Now we would just need something like this:
> 
> select way,(tags->'man_made') as man_made FROM planet_osm_point where 
> ((tags->'man_made') is not NULL);
> 
> What am I getting wrong here?

Nothing. This will work. Mapnik just expects the name used in a filter 
to correspond with a column name in the resultset, nothing fancy about 
that. The 'as man_made' takes care of this.

Very cool this. What's the speed to access the hstore column, compared 
to the generic columns we now have?

On the one hand, this has the potential to greatly increase the size of 
the db, as you will now get all kinds of tags you're never interested 
in. On the other hand, one can now quickly use keys one never used 
before, without the need to reimport the whole db.

-- 
Lennard

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] osm2pgsql hstore support

2010-03-14 Thread Sven Geggus
Martijn van Oosterhout schrieb am Sonntag, den 14. März um 17:06 Uhr:

> Very cool. Now we just need to find a way to use it. I thought about
> it earlier but I didn't see a way to make mapnik use it so left it.

This won't work?!

I didn't bother with this yet, but I just thougt that all I would
need to do is replacing SQL statements.

Currently we have something like this:

select way,man_made FROM planet_osm_point;

Now we would just need something like this:

select way,(tags->'man_made') as man_made FROM planet_osm_point where 
((tags->'man_made') is not NULL);

What am I getting wrong here?

Regards

Sven

-- 
Kernel panic: I have no root and I want to scream
(Linux Kernel Error Message)

/me is gig...@ircnet, http://sven.gegg.us/ on the Web

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] osm2pgsql hstore support

2010-03-14 Thread Martijn van Oosterhout
Very cool. Now we just need to find a way to use it. I thought about
it earlier but I didn't see a way to make mapnik use it so left it.

Have a nice day,

On Sun, Mar 14, 2010 at 4:47 PM, Sven Geggus
 wrote:
> Hello,
>
> I just commited a patch for osm2pgsl for optional generation of a
> hstore column (hstore new).
>
> For those of you who don't know about hstore colums yet:
>
> Hstore is for sets of key/value pairs. As associative array datatype,
> just like a hash in perl or dictionary in python.
>
> This should come in handy especialy for rarely used tags.
>
> Using hstore one can use any tag in sql queries like this:
>
> gis=> select count(*) FROM planet_osm_point where ((tags->'man_made') = 
> 'tower');
>  count
> ---
>   447
> (1 Zeile)
>
> Indexes can be added to hstores as well.
>
> Currently the patch has not been tested in conjunction with
> incemental database updates.
>
> Regards
>
> Sven
>
>
> --
> "Those who do not understand Unix are condemned to reinvent it, poorly"
> (Henry Spencer)
>
> /me is gig...@ircnet, http://sven.gegg.us/ on the Web
>
> ___
> dev mailing list
> dev@openstreetmap.org
> http://lists.openstreetmap.org/listinfo/dev
>



-- 
Martijn van Oosterhout  http://svana.org/kleptog/

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


[OSM-dev] osm2pgsql hstore support

2010-03-14 Thread Sven Geggus
Hello,

I just commited a patch for osm2pgsl for optional generation of a
hstore column (hstore new).

For those of you who don't know about hstore colums yet:

Hstore is for sets of key/value pairs. As associative array datatype,
just like a hash in perl or dictionary in python.

This should come in handy especialy for rarely used tags.

Using hstore one can use any tag in sql queries like this:

gis=> select count(*) FROM planet_osm_point where ((tags->'man_made') = 
'tower');
 count 
---
   447
(1 Zeile)

Indexes can be added to hstores as well.

Currently the patch has not been tested in conjunction with
incemental database updates.

Regards

Sven


-- 
"Those who do not understand Unix are condemned to reinvent it, poorly"
(Henry Spencer)

/me is gig...@ircnet, http://sven.gegg.us/ on the Web

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev