Re: [OSM-talk] OSMDoc is awesome!

2010-09-16 Thread Peter Körner

On 16.09.2010 01:56, Lars Francke wrote:

My number 1 priority though is fresh data but unfortunately stuff
keeps getting in the way and I have very little time at the moment. I
hoped to get something done before October but I'm not sure yet.


In which language is OSMDoc written, and is the source code publicly
available? Maybe I can get an instance up on the Wikimedia toolserver.
The server has enough free resources for that.


Peter

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] OSMDoc is awesome!

2010-09-16 Thread Lars Francke
 My number 1 priority though is fresh data but unfortunately stuff
 keeps getting in the way and I have very little time at the moment. I
 hoped to get something done before October but I'm not sure yet.

 In which language is OSMDoc written and is the sourcecode public available?
 Maybe I can get an instance up on the wikimedia toolserver. The server has
 enough free resources for that.

In Python using the Django framework. The sourcecode is not public but
I can easily change that :) Let me know if you're interested in it and
I'll gladly put it online somewhere.

Cheers,
Lars



Re: [OSM-talk] OSMDoc is awesome!

2010-09-16 Thread Peter Körner

On 16.09.2010 16:19, Lars Francke wrote:

My number 1 priority though is fresh data but unfortunately stuff
keeps getting in the way and I have very little time at the moment. I
hoped to get something done before October but I'm not sure yet.


In which language is OSMDoc written and is the sourcecode public available?
Maybe I can get an instance up on the wikimedia toolserver. The server has
enough free resources for that.


In Python using the Django framework. The sourcecode is not public but
I can easily change that :) Let me know if you're interested in it and
I'll gladly put it online somewhere.


I'm not sure how to get it running on the toolserver, but I'm sure there
will be a way. In any case, it would be good to have it online somewhere.


The DD-Import-Script is:
http://svn.toolserver.org/svnroot/mazder/experimental_osmdoc_import/

I'm just importing a Germany extract into the toolserver PostGIS DB.

Peter



Re: [OSM-talk] OSMDoc is awesome!

2010-09-15 Thread Lars Francke
Hi Dave,

sorry I missed your mail or forgot to answer. Thanks for the reminder.

 The search facility appears to only search the key tag and not the values tag.
 Is there a way to overcome this?

 For instance - parking. It doesn't list amenity=parking or any other
 *=parking.

the current OSMdoc doesn't have that possibility, no. Unfortunately.
I'd love to see that as well. Time and resources permitting I'd love
to bring back a Solr[1] powered search engine but I'm not sure when
that'll happen.

My number 1 priority though is fresh data but unfortunately stuff
keeps getting in the way and I have very little time at the moment. I
hoped to get something done before October but I'm not sure yet.

Cheers,
Lars



Re: [OSM-talk] OSMDoc is awesome!

2010-09-09 Thread Peter Körner

On 03.09.2010 11:09, Lars Francke wrote:

Wow! That's awesome. Thank you for your work. I'll be back home next
week and will give it a go then.


Did you get anywhere? Do you need something? I could run it against a 
planet file on the wikimedia toolserver, I think, and supply you with a 
compressed database dump, if that helps.


Peter



Re: [OSM-talk] OSMDoc is awesome!

2010-09-03 Thread Peter Körner

On 01.09.2010 15:15, Lars Francke wrote:

The database schema is pretty easy though so if anyone has data laying
around this is what I would need:

tag_keys: id integer, total_count integer, changeset_count integer,
node_count integer, relation_count integer, way_count integer, name
character varying(255), value_count integer

tag_values: id integer, total_count integer, changeset_count integer,
node_count integer, relation_count integer, way_count integer, name
character varying(255), key_id integer


There's one thing I've been missing: the changeset_count. How do you
calculate it? Is it the number of distinct changesets that have used
this key or key/value combination?


I'd then implement it using another two tables

changeset_keys: changeset integer, key_id integer
changeset_values: changeset integer, key_id integer, value_id integer

so I can check whether a specific key / value has already been used in a
changeset and, if so, not increment changeset_count again.
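
A minimal sketch of that bookkeeping, assuming changeset_count is meant to be the number of distinct changesets per key and per key/value pair. The two sets stand in for the proposed changeset_keys and changeset_values tables; the function name and the sample data are made up for illustration.

```python
from collections import defaultdict


def count_changesets(tag_uses):
    """Count distinct changesets per key and per (key, value).

    tag_uses is an iterable of (changeset_id, key, value) tuples.
    """
    seen_keys = set()    # stands in for the changeset_keys table
    seen_values = set()  # stands in for the changeset_values table
    key_changesets = defaultdict(int)
    value_changesets = defaultdict(int)

    for changeset, key, value in tag_uses:
        # Only increment the first time this changeset uses the key.
        if (changeset, key) not in seen_keys:
            seen_keys.add((changeset, key))
            key_changesets[key] += 1
        # Same check for the key/value combination.
        if (changeset, key, value) not in seen_values:
            seen_values.add((changeset, key, value))
            value_changesets[(key, value)] += 1

    return dict(key_changesets), dict(value_changesets)


# Hypothetical sample: changeset 1 uses highway=residential twice.
uses = [
    (1, "highway", "residential"),
    (1, "highway", "residential"),
    (1, "highway", "primary"),
    (2, "highway", "residential"),
]
keys, values = count_changesets(uses)
print(keys)
print(values)
```

In a real import the two sets would grow with the number of changesets, which is why keeping them in database tables, as proposed above, may be the more practical choice.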


Peter



Re: [OSM-talk] OSMDoc is awesome!

2010-09-03 Thread M∡rtin Koppenhoefer
 On 01.09.2010 15:15, Lars Francke wrote:

 The database schema is pretty easy though so if anyone has data laying
 around this is what I would need:

 tag_keys: id integer, total_count integer, changeset_count integer,
 node_count integer, relation_count integer, way_count integer, name
 character varying(255), value_count integer

Will there be a way to determine which, or how many, different users
have used a certain key/tag?

cheers,
Martin



Re: [OSM-talk] OSMDoc is awesome!

2010-09-03 Thread Sam Vekemans
Hi Peter,
It will be great to see your results; then I can compare them with
the current tagging system used by
CanVec/Tiger/linz/garmin/mapnik/cyclemap/josm/potlatch/mapzen/merkaartor
and others.


I found that many tags are 'primary tag dependent', meaning that using
the tag alone on a node/way/area does not directly produce a map
feature.
E.g. 'access=*', 'surface=*' and 'name=*user_defined' all require
something else to make a map feature. In this example you need
'highway=*' to make any of the 3 do something.
So these tags can be grouped into a category and numbered with an easily
understandable TagID#.


I'm still working on this idea, the temporary name is 'Schematroll 2.01' :-)


Cheers,
Sam

On 9/3/10, Peter Körner osm-li...@mazdermind.de wrote:
 On 01.09.2010 15:15, Lars Francke wrote:
 The database schema is pretty easy though so if anyone has data laying
 around this is what I would need:

 tag_keys: id integer, total_count integer, changeset_count integer,
 node_count integer, relation_count integer, way_count integer, name
 character varying(255), value_count integer

 tag_values: id integer, total_count integer, changeset_count integer,
 node_count integer, relation_count integer, way_count integer, name
 character varying(255), key_id integer

 There's one thing I've been missing: the changeset_count. How do you
 calculate it? Is it the number of distinct changesets that have used
 this key or key/value combination?

 I'd then implement it using another two tables

 changeset_keys: changeset integer, key_id integer
 changeset_values: changeset integer, key_id integer, value_id integer

 to check if a specific key / value is already used in a changeset and
 not incrementing changeset_count then.

 Peter




-- 
Twitter: @Acrosscanada
Blogs: http://acrosscanadatrails.posterous.com/
http://Acrosscanadatrails.blogspot.com
Facebook: http://www.facebook.com/sam.vekemans
Skype: samvekemans
IRC: irc://irc.oftc.net #osm-ca Canadian OSM channel (an open chat room)
@Acrosscanadatrails



Re: [OSM-talk] OSMDoc is awesome!

2010-09-03 Thread Lars Francke
 The database schema is pretty easy though so if anyone has data laying
 around this is what I would need

 I've put together a script that creates this schema [1]. I used PHP and expat
 for the XML parsing and pl/pgsql for the counting/update/insert part.

Wow! That's awesome. Thank you for your work. I'll be back home next
week and will give it a go then.

About the changeset count: I did it the same way I did all others. The
planet extract has a dump of all changesets in it (at the very
beginning) and those have <tag> elements as well, so they shouldn't
need any special treatment.

Cheers,
Lars



Re: [OSM-talk] OSMDoc is awesome!

2010-09-03 Thread Peter Körner

On 03.09.2010 11:09, Lars Francke wrote:

The database schema is pretty easy though so if anyone has data laying
around this is what I would need


I've put together a script that creates this schema [1]. I used PHP and expat
for the XML parsing and pl/pgsql for the counting/update/insert part.


Wow! That's awesome. Thank you for your work. I'll be back home next
week and will give it a go then.

About the changeset count: I did it the same way I did all others. The
planet extract has a dump of all changesets in it (at the very
beginning) and those have <tag> elements as well, so they shouldn't
need any special treatment.


Ah, I see. The planet extracts I worked with had the changesets stripped
out, so I hadn't thought about that. I'll change the script to import
the changeset count, too.


Do you have the ability to run it on your server? If not I could maybe 
run it on the wikimedia toolservers and post a bz2 compressed sql dump 
with the results.


Peter



Re: [OSM-talk] OSMDoc is awesome!

2010-09-03 Thread Peter Körner

On 03.09.2010 13:23, Peter Körner wrote:

Ah, I see. The Planet-Extracts I worked with had the changesets stripped
out, so I haven't thought about that. I'll change the script to import
the changeset count, too.


I have implemented the changeset count now. There's no filter capability
(e.g. dropping fixme values or not importing changeset comments), but it
would be easy to implement that efficiently.


If you need it, just send a mail.

Peter



[OSM-talk] OSMDoc is awesome!

2010-09-01 Thread John Harvey
OSMDoc is great - it's a shame it's a year out of date.  I needed a
more recent breakdown of tag statistics so I decided to write a report
myself - very quick and dirty (nowhere near as cool as OSMDoc), but
functional enough to get a breakdown of tag usage.  I figured someone else
might like to read it too:


http://www.gamesforlittletimmy.com/TagStats_100811.txt.gz
(6.1MB decompressed, 881K download.  The text file is Unix formatted, UTF-8)

Basically the file is a breakdown of the most common tags.  If you want
to know what the most common shop=, amenity=, highway= etc. is, this
file probably has it.  (I skip most of the private spaces like tiger:,
ksj2:, naptan: etc.)


Now for the fun and useless facts part:

   * name=Hauptstraße is the most common name in the world, used
     14,353 times.  ("Main Street" in German, as I understand it.)
   * There are 16,691,461 ways, nodes or relations marked with
     building=something.  A year ago that number was 2,367,194 -
     roughly 7x growth.
   * addr:city=San Diego is the most common in the world - 367,229
     times.
   * The top 4 countries by addr:country= are DK, US, DE, CZ.
   * amenity=swimming_pool is used 15,436 times.  It doesn't even
     have a wiki page.  leisure=swimming_pool (which does have a
     wiki page) is used 6,363 times.
   * amenity=place_of_worship is used 327,501 times - 21% growth
     over a year ago.  amenity=parking is used 407,445 times - 81%
     growth over a year ago.
   * The most common street_address= is 9 EDITH BLVD NE - 243
     of them in the world.  (I suspect import issues.)
   * There are 315 microbreweries tagged.  This tag didn't exist last year.
   * The 5 most common operator= values are Metro Transit (15,052 times),
     Deutsche Telekom AG (11,226), Deutsche Post AG (9,807),
     Deutsche Post (7,061), Royal Mail (5,534).
   * The most common brand is agip - 1,907 instances.  Ford is the
     most common car brand with 78 instances.
   * There are more misspellings of denomination=church_of_england
     (118) than there are denomination=shia (81).
     denomination=catholic is the most common - 42,143 (109% growth in
     a year), but denomination=baptist may be first next year: 33,965
     (1,700% growth in a year).
   * shop=hairdresser jumped from 3,439 a year ago to 10,729 this year
     (there is an icon in Mapnik and osmarender).  shop=furniture grew
     from 1,675 a year ago to 4,570 this year (14th most common shop=;
     neither Mapnik nor osmarender has an icon for furniture shops).
   * sport=soccer is the most common - 26% of all sport= tags.
     sport=scuba_diving has 2,272 tags now - up 11,200% from last year
     (20).
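
For anyone who wants to check the growth figures in the list above, here is a small sketch of the arithmetic. It only uses the before/after counts quoted in the list; the helper name growth_percent is made up for illustration.

```python
def growth_percent(old, new):
    """Percentage growth from old to new (100 % means the count doubled)."""
    return 100.0 * (new - old) / old


# (count a year ago, count now), taken from the list above.
figures = {
    "building=*": (2_367_194, 16_691_461),
    "shop=hairdresser": (3_439, 10_729),
    "shop=furniture": (1_675, 4_570),
    "sport=scuba_diving": (20, 2_272),
}

for tag, (old, new) in figures.items():
    factor = new / old
    print(f"{tag:20s} x{factor:7.2f}  ({growth_percent(old, new):+.0f} %)")
```

Note that a growth percentage and a growth factor differ by one: going from 20 to 2,272 is a factor of about 114 but a growth of about 11,260 %.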


I'm sure you can find some interesting/useless/funny stats.

John




Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Pierre-Alain Dorange
John Harvey j...@johnharveyphoto.com wrote:

   OSMDoc is great - it's a shame it's a year out of date.  I needed a
 more modern breakdown of tag statistics so I decided to write a report
 myself - very quick and dirty (no where near as cool as OSMDoc), but 
 functional to get a breakdown of tag usage.  I figured someone else 
 might like to read it too:

In case you don't know it, there is tagstat, which is a little slow but more
accurate and has up-to-date data:

http://tagstat.hypercube.telascience.org/
http://wiki.openstreetmap.org/wiki/Tagstat

I usually prefer it over OSMdoc because it uses up-to-date data.

-- 
Pierre-Alain Dorange




Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Jörg Ehrichs
 John Harvey j...@johnharveyphoto.com wrote:
  OSMDoc is great - it's a shame it's a year out of date. ...

On Wednesday, 1 September 2010, 13:40:23, Pierre-Alain Dorange wrote:
 In case you don't know it, there is tagstat, a little bit slow but more
 accurate and with updated data :
 
 http://tagstat.hypercube.telascience.org/
 http://wiki.openstreetmap.org/wiki/Tagstat

And just to complete the list:
there is also the older, but still working and updated, Tagwatch

http://wiki.openstreetmap.org/wiki/Tagwatch
http://tagwatch.stoecker.eu/

Jörg



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Dave F.

 On 01/09/2010 12:56, Jörg Ehrichs wrote:

And just to get a full list.
There is also the older, but still working and updated Tagwatch

http://wiki.openstreetmap.org/wiki/Tagwatch
http://tagwatch.stoecker.eu/


I may be missing something, but is there a search facility?

Cheers
Dave F.



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Lars Francke
 OSMDoc is great - it's a shame it's a year out of date.  I needed a more
 modern breakdown of tag statistics so I decided to write a report myself -
 very quick and dirty (no where near as cool as OSMDoc), but functional to
 get a breakdown of tag usage.  I figured someone else might like to read it
 too:

Well thank you very much. I'm painfully aware of the missing updates
and I'm working on it on and off but I don't have the time to put a
lot of effort into it most of the time. I still hope for fresh data
this month.

The database schema is pretty simple though, so if anyone has data lying
around this is what I would need:

tag_keys: id integer, total_count integer, changeset_count integer,
node_count integer, relation_count integer, way_count integer, name
character varying(255), value_count integer

tag_values: id integer, total_count integer, changeset_count integer,
node_count integer, relation_count integer, way_count integer, name
character varying(255), key_id integer
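
To try the two tables out locally, something like the following could work. It is only a sketch: it uses Python's built-in sqlite3 module instead of PostgreSQL, and the PRIMARY KEY and REFERENCES constraints are assumptions, not part of the schema as posted.

```python
import sqlite3

# Column names and order follow the schema above. The PRIMARY KEY and
# REFERENCES constraints are assumptions added for illustration.
SCHEMA = """
CREATE TABLE tag_keys (
    id              INTEGER PRIMARY KEY,
    total_count     INTEGER,
    changeset_count INTEGER,
    node_count      INTEGER,
    relation_count  INTEGER,
    way_count       INTEGER,
    name            VARCHAR(255),
    value_count     INTEGER
);
CREATE TABLE tag_values (
    id              INTEGER PRIMARY KEY,
    total_count     INTEGER,
    changeset_count INTEGER,
    node_count      INTEGER,
    relation_count  INTEGER,
    way_count       INTEGER,
    name            VARCHAR(255),
    key_id          INTEGER REFERENCES tag_keys(id)
);
"""


def create_schema(conn):
    """Create both tables on the given sqlite3 connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Hypothetical row, just to show how a value links back to its key.
conn.execute(
    "INSERT INTO tag_keys (id, total_count, name, value_count) VALUES (1, 3, 'amenity', 1)"
)
conn.execute(
    "INSERT INTO tag_values (id, total_count, name, key_id) VALUES (1, 3, 'parking', 1)"
)
row = conn.execute(
    "SELECT k.name, v.name, v.total_count FROM tag_values v "
    "JOIN tag_keys k ON k.id = v.key_id"
).fetchone()
print(row)
```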

Cheers,
Lars (author of OSMdoc)



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Peter Körner

On 01.09.2010 15:15, Lars Francke wrote:

OSMDoc is great - it's a shame it's a year out of date.  I needed a more
modern breakdown of tag statistics so I decided to write a report myself -
very quick and dirty (no where near as cool as OSMDoc), but functional to
get a breakdown of tag usage.  I figured someone else might like to read it
too:


Well thank you very much. I'm painfully aware of the missing updates
and I'm working on it on and off but I don't have the time to put a
lot of effort into it most of the time. I still hope for fresh data
this month.

What is the problem with re-running the import you did once and
completely replacing the outdated data?


I know how painful it can be to read .osc files into a database, but a 
simple re-import would help a lot.


Peter



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread M∡rtin Koppenhoefer
2010/9/1 Peter Körner osm-li...@mazdermind.de:
 What is the problem in running the import, that you did once, again,
 completely replacing the outdated data.


He wrote on the German ML that he lost the program which did the import.

cheers,
Martin



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Dave F.

 On 01/09/2010 14:15, Lars Francke wrote:

Well thank you very much.



Cheers,
Lars (author of OSMdoc)


Hi Lars

The search facility appears to only search the key tag and not the values
tag. Is there a way to overcome this?


For instance - parking. It doesn't list amenity=parking or any other 
*=parking.


Cheers
Dave F.



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Lars Francke
 What is the problem in running the import, that you did once, again,
 completely replacing the outdated data.

 he wrote on the German ML that he lost the program which did the import.

Thank you Martin.

He is correct. I don't have the script anymore that did the import and
it didn't work very well either for various reasons. I could probably
just use the thing that tagstats is using but as far as I know that
keeps everything in RAM and I don't have access to a machine that
could run this...but if my current attempt doesn't work I'll
investigate this route.

Cheers,
Lars



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Peter Körner



On 01.09.2010 15:15, Lars Francke wrote:

OSMDoc is great - it's a shame it's a year out of date.  I needed a more
modern breakdown of tag statistics so I decided to write a report myself -
very quick and dirty (no where near as cool as OSMDoc), but functional to
get a breakdown of tag usage.  I figured someone else might like to read it
too:


Well thank you very much. I'm painfully aware of the missing updates
and I'm working on it on and off but I don't have the time to put a
lot of effort into it most of the time. I still hope for fresh data
this month.

The database schema is pretty easy though so if anyone has data laying
around this is what I would need


I've put together a script that creates this schema [1]. I used PHP and
expat for the XML parsing and pl/pgsql for the counting/update/insert part.
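
The script at [1] is PHP plus pl/pgsql; the following is not that script, just a rough Python sketch of the same idea (stream the XML with expat and count tags per element type). The function name, the in-memory counters and the sample document are assumptions for illustration.

```python
import xml.parsers.expat
from collections import Counter

# Element types whose <tag> children are counted.
TOP_LEVEL = ("node", "way", "relation", "changeset")


def count_tags(xml_bytes):
    """Return (key_counts, value_counts) for an OSM XML document."""
    key_counts = Counter()    # (element type, key) -> count
    value_counts = Counter()  # (element type, key, value) -> count
    current = [None]          # type of the element we are currently inside

    def start(name, attrs):
        if name in TOP_LEVEL:
            current[0] = name
        elif name == "tag" and current[0] is not None:
            key_counts[(current[0], attrs["k"])] += 1
            value_counts[(current[0], attrs["k"], attrs["v"])] += 1

    def end(name):
        if name == current[0]:
            current[0] = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return key_counts, value_counts


# Tiny made-up sample in the OSM XML layout.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6">
  <node id="1" lat="53.07" lon="8.80">
    <tag k="amenity" v="parking"/>
  </node>
  <way id="2">
    <nd ref="1"/>
    <tag k="highway" v="residential"/>
    <tag k="surface" v="asphalt"/>
  </way>
</osm>"""

key_counts, value_counts = count_tags(SAMPLE)
print(key_counts)
```

For a real extract one would probably not load the file into memory but hand a decompressing file object to the parser instead, e.g. parser.ParseFile(bz2.open(path)), and write the counters to the database in batches.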


I ran it on the heavily loaded German dev server, gauss, and it is not
very fast, but it runs.


It took 6m50.329s for the Germany/Bremen extract [2], which is about 4MB
of bz2-compressed XML (imported 65046 nodes, 125824 ways).


I measured the time spent in different parts of the process:

bzip2 (decompress to /dev/null): 0m2.506s
bzip2+php (use an empty pl/pgsql function): 0m18.190s
bzip2+php+tags (only the tags part of the pl/pgsql function): 3m50.697s

So, as expected, the pl/pgsql function takes most of the time, and that
time is roughly balanced between the tag and the key counting.
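
The per-stage cost can be read off those cumulative timings by subtraction. A small sketch of that arithmetic follows; it assumes that whatever remains of the full 6m50.329s run after the "tags" measurement is the key counting, which is an interpretation of the numbers above rather than something measured separately.

```python
def seconds(minutes, secs):
    """Convert a 'XmY.YYYs' style timing to seconds."""
    return minutes * 60 + secs


total = seconds(6, 50.329)       # full import of the Bremen extract
decompress = seconds(0, 2.506)   # bzip2 only
with_php = seconds(0, 18.190)    # bzip2 + php, empty pl/pgsql function
with_tags = seconds(3, 50.697)   # bzip2 + php + tags part

# Each stage is the difference between two cumulative measurements.
stages = {
    "bzip2": decompress,
    "php/expat parsing": with_php - decompress,
    "tag counting": with_tags - with_php,
    "key counting (remainder)": total - with_tags,  # assumption, see above
}

for name, t in stages.items():
    print(f"{name:26s} {t:8.3f} s  {100 * t / total:5.1f} %")
```

Under that assumption the two counting parts come out at roughly 52 % and 44 % of the run, with decompression and parsing together under 5 %.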


It may be worth running it on a less heavily used server to see how long it
takes for a larger extract.


Peter

[1] http://svn.toolserver.org/svnroot/mazder/experimental_osmdoc_import/

[2] http://download.geofabrik.de/osm/europe/germany/



Re: [OSM-talk] OSMDoc is awesome!

2010-09-01 Thread Peter Körner



On 01.09.2010 22:32, Peter Körner wrote:

it took 6m50.329s for the germany/bremen extract [2] which is about 4MB
of bz2 compressed xml (imported 65046 nodes, 125824 ways).


I just ran it on a much faster server and it took 0m33.193s for the same
extract. This sounds reasonable. I won't be able to run a planet on it,
as it's a testing environment that will be in use again from 9 o'clock
tomorrow, but extrapolating from this run, the import would take about 22
hours for a planet.
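
The planet size behind that estimate is not stated here, but the scale factor it implies can be worked backwards. This sketch assumes the import time grows linearly with the compressed input size, which may well not hold for a full planet.

```python
bremen_seconds = 33.193  # measured run on the faster server
bremen_mb = 4            # "about 4MB" of bz2-compressed XML
planet_hours = 22        # estimate given above

# How many Bremen-sized runs fit into the estimated planet run.
scale = planet_hours * 3600 / bremen_seconds

# Planet size implied by linear scaling (assumption, not a measured value).
implied_planet_gb = scale * bremen_mb / 1000

print(f"scale factor:        {scale:.0f}x")
print(f"implied planet size: {implied_planet_gb:.1f} GB (bz2)")
```

So the 22 hours correspond to a planet roughly 2,400 times the size of the Bremen extract, i.e. somewhere below 10 GB of bz2-compressed XML if the 4MB figure is taken at face value.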


Peter
