Just posted the following to the imports mailing list:

The municipality of Örebro, Sweden, releases its GIS data as CC0.

This data can be harvested and post-processed to produce a couple of hundred 
thousand nodes, grouped into classes with tag sets such as:

name, place=hamlet,
addr:city, addr:place, addr:street, addr:housenumber, 
name, addr:city, addr:place, highway=road
name, amenity=school, isced:level,
name, amenity=social_facility, social_facility=assisted_living, 
social_facility:for,
name, leisure=park
etc.

Output is one osm.xml-file per class. 

https://github.com/OpenStreetMap-Sverige/import-orebro-osm-xml/tree/master/osm.xml
https://github.com/OpenStreetMap-Sverige/import-orebro-osm-xml/archive/master.zip

The script also attempts to find existing OSM duplicates within a radius of 
500 to 5000 meters. This seems to work really well, but it could probably be 
improved by allowing a small Levenshtein distance and by normalizing 
whitespace and punctuation (\p{Punct}). I am not sure how much this would 
help, though; everything looks pretty good when inspected manually.
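
To make the idea concrete, here is a minimal Java sketch of what such a fuzzy 
check could look like. It is not taken from Orebro.java: the class 
FuzzyDuplicateCheck, its method names and the parameters radiusMeters and 
maxEdits are all hypothetical, and the haversine distance is just one 
reasonable way to test "within a radius".

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the actual harvester.
final class FuzzyDuplicateCheck {

    /** Lower-cases, turns \p{Punct} characters into spaces, collapses whitespace. */
    static String normalize(String name) {
        String s = name.toLowerCase(Locale.ROOT);
        s = s.replaceAll("\\p{Punct}", " ");
        s = s.replaceAll("\\s+", " ");
        return s.trim();
    }

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Great-circle distance in meters (haversine, mean Earth radius). */
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double earthRadius = 6_371_000.0;
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double h = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * earthRadius * Math.asin(Math.sqrt(h));
    }

    /** True if both points are within radiusMeters and the names nearly match. */
    static boolean isLikelyDuplicate(String nameA, double latA, double lonA,
                                     String nameB, double latB, double lonB,
                                     double radiusMeters, int maxEdits) {
        return distanceMeters(latA, lonA, latB, lonB) <= radiusMeters
            && levenshtein(normalize(nameA), normalize(nameB)) <= maxEdits;
    }
}
```

With maxEdits set to 0 this reduces to an exact match on the normalized name, 
so "Karl-Johans Park" and "karl johans park" would already be treated as the 
same name; a value of 1 or 2 would additionally tolerate small typos.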

Duplicates from the source data are written to a common osm.xml file (rather 
than to their individual per-class osm.xml files), and the duplicates from OSM 
are written (with recursed children) to yet another osm.xml file.



Script:

https://github.com/OpenStreetMap-Sverige/import-orebro-harvester/blob/master/src/main/java/se/kodapan/osm/orebro/Orebro.java

See line 335 and onward to see exactly which classes there are and how the 
duplicate detection mechanism works. (And sorry for all the Swedish-language 
comments and names.)



We are now considering the workflow. 

The consensus on #osm...@irc.oftc.net is along the lines of 
"the data looks great, let's just commit it and then get started working on it 
as usual". The reaction on #osm has been quite the opposite: "make sure any work 
is in the one single commit of the import account".

I've been considering asking everyone who can help with the manual burden of 
checking all points to do so on GitHub, adding any new things for OSM to a 
per-user changes.osm.xml to avoid inverted identity conflicts, and then doing a 
merge before we commit it. If I understand everything correctly, that would 
satisfy the people I spoke with on #osm.

That might be too much to ask of the users, though. The data is really clean, 
and we would really like to just push it in as it is and start working with it 
in the database as normal using a task project.

_______________________________________________
Talk-se mailing list
Talk-se@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-se
