Hi. This is an update to an e-mail I sent at the beginning of October to the talk@osm list regarding updating postal codes in Iceland semi-automatically.
I wanted to let you know I have written the script, which is for Python 3.2. I have not yet submitted data made by the script but I haven't detected any problems thus far. I have performed some random manual checks on the output and see nothing wrong with the XML. JOSM didn't complain when I opened the .osc file.

The input is any valid .osm file and the output is an .osc file (https://wiki.openstreetmap.org/wiki/Osc) which lists any changes made. The output can be loaded into an editor and submitted to the OSM server from there.

You're free to adapt the script to suit your purpose but I recommend that you always check the proposed changes before uploading. The code is commented well enough that anybody who knows Python should be able to follow it.

Minimum requirements:
- Enough computer memory. The larger the .osm file, the more memory the script needs.
- Python 3.
- A working installation of the Osmosis program (https://wiki.openstreetmap.org/wiki/Osmosis).

- Svavar Kjarrval

On 04/10/12 23:48, Martin Guttesen wrote:
> I have imported all the addresses for the Faroe Islands
> and update them from time to time when new data is available;
> see http://wiki.openstreetmap.org/wiki/Import/Catalogue/usfo
> I keep an ID tag (us.fo:Adressutal) so I can create, update or delete
> address nodes.
>
>
> -----Original Message-----
> From: Jochen Topf
> Sent: Thursday, October 04, 2012 7:39 AM
> To: Svavar Kjarrval
> Cc: [email protected]
> Subject: Re: [OSM-talk] Semi-automated edits - postal code database
>
> Hi!
>
> On Wed, Oct 03, 2012 at 11:10:05AM +0000, Svavar Kjarrval wrote:
>> I'm trying to find a good method to maintain data from outside sources.
>> The data in question is the Icelandic postal code database (which they
>> say we may use freely). My searches on the OSM wiki have been fruitless
>> so far.
>>
>> The idea is to maintain the data in associatedStreet relations. Each
>> relation has a tag called 'götuskrá:id' whose value is a direct
>> reference to the row ID in the files we retrieve from the postal
>> company's website. The file formats available are CSV and XML 1.0. The
>> script would presumably go over each associatedStreet relation and make
>> any changes (if appropriate) when a götuskrá:id tag is found. The output
>> could be an OSM change file loaded into an editor like JOSM to be
>> uploaded manually. Maybe an automated process later when we're confident
>> that everything is done correctly, and of course after submitting the
>> script(s) for review by the local community.
>
> It is not a good idea to add some random ID of your favourite database to
> OSM, because nobody except you can understand this ID and do useful things
> with it. It just confuses mappers and makes it more difficult to edit the
> data. For every change somebody does to the data they have to know what
> this tag means so that they can properly do their edit. And if they don't,
> people will just mess up your data and you will not be able to use this ID
> for syncing the data anyways.
>
> And in this case I don't even see why you need it. You have street names
> and postal codes in both OSM and the Icelandic postal code database. If
> something changes you can find out which combinations changed and apply
> those changes to OSM easily just based on the postal code and street name.
> There is no need for those IDs.
>
> And, btw, you should not use the associatedStreet relation. It solves the
> same problem as the addr:street tags on nodes and buildings but in a much
> more complicated way. The overwhelming majority of all addresses are tagged
> with addr:street (there are nearly 15 million addr:street tags vs. only
> 18.000 associatedStreet relations).
>
> Jochen
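Jochen's suggestion of syncing on street name and postal code alone, without a database ID, can be pictured as a plain set comparison. The following is only a minimal sketch of that idea, assuming both sides have already been reduced to (street name, postcode) pairs; the function name `diff_pairs` and the sample values are made up for illustration, and real data would need extra care for streets that span several postal codes.

```python
def diff_pairs(osm_pairs, postal_pairs):
    """Compare (street name, postcode) pairs from OSM and the postal file.

    Returns two sorted lists: pairs present only in the postal file
    (candidates to add to OSM) and pairs present only in OSM
    (candidates for manual review).
    """
    osm_pairs = set(osm_pairs)
    postal_pairs = set(postal_pairs)
    missing_in_osm = postal_pairs - osm_pairs
    stale_in_osm = osm_pairs - postal_pairs
    return sorted(missing_in_osm), sorted(stale_in_osm)


# Made-up sample data, not taken from the real postcode file.
osm = [('Laugavegur', '101'), ('Hverfisgata', '100')]
postal = [('Laugavegur', '101'), ('Hverfisgata', '101')]

to_add, to_review = diff_pairs(osm, postal)
print(to_add)     # pairs the postal file has but OSM lacks
print(to_review)  # pairs OSM has but the postal file lacks
```

A street whose postcode changed shows up in both lists, once with the new code and once with the old one, which is what makes the changed combinations easy to spot.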
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Copyright 2012, Svavar Kjarrval Lúthersson
# Released under the CC0 license.
# I can be contacted at [email protected].

# This program performs changes according to predetermined formulas to .osm files
# and outputs a single .osc file which in turn can either be submitted automatically
# by another program (which is not implemented here) or manually with an editor.

# To use it, you must have:
# 1 - An .osm file of the area in question.
# 2 - An Osmosis binary set up and ready to use.

# The reason the script filters instead of working directly on the original file
# is to reduce memory consumption of programs which need to load the complete .osm file into memory.
# If, despite having done proper filtering, the .osm file is still too big to fit into memory,
# please consider splitting the area further.

import os
# xml.etree.cElementTree was removed in Python 3.9; since Python 3.3 the plain
# ElementTree module uses the C accelerator automatically when it is available.
import xml.etree.ElementTree as etree

# Change the value of DEBUG to 0 when you don't want extra debug messages to appear on screen.
DEBUG = 0

# Get the current working directory
pwd = os.getcwd() + '/'

# Location of the osmosis binary.
osmosis_bin = '/home/kjarrval/bin/osmosis'

# A recently-updated .osm file with an extract of the area of interest from OpenStreetMap.
# This script does not change this file.
original_osm_file = 'iceland.osm'

# Name of the filtered .osm file. It will be completely overwritten each time the script runs.
filtered_osm_file = 'osmosis_filtered.osm'

# Name of the finished .osm file. It will be completely overwritten each time the script runs.
finished_osm_file = 'osmosis_finished.osm'

# The finished .osc file. It will be completely overwritten each time the script runs.
finished_osc_file = 'osmosis_finished.osc'

# The filter to use on the original file.
# See https://wiki.openstreetmap.org/wiki/Osmosis/Detailed_Usage#--tag-filter_.28--tf.29 for usage.
# The Osmosis filter is processed in order.
osmosis_filter_to_use = ' --tf accept-relations götuskrá:id=*'
osmosis_filter_to_use += ' --tf reject-ways'
osmosis_filter_to_use += ' --tf reject-nodes'

# Build the Osmosis command. File names are quoted so that paths with spaces work.
osmosis_command = osmosis_bin
if DEBUG == 0:
    osmosis_command += ' -q'
osmosis_command += ' --read-xml file="' + pwd + original_osm_file + '"' + osmosis_filter_to_use
osmosis_command += ' --write-xml file="' + pwd + filtered_osm_file + '"'

# Debug
if DEBUG == 1:
    print(osmosis_command)

# Let's run the Osmosis command
os.system(osmosis_command)

# Now we should have a filtered .osm file.
# Now let's run whatever processing we want on the data.
# The if condition only marks where the data processing starts and ends.
# 'götuskrá:id' is a reference to the entry ID in the postcode file from the Icelandic Postal Service.
if 1:
    osm_xml = etree.parse(pwd + filtered_osm_file)
    postcodes_xml = etree.parse(pwd + 'gotuskra.xml')

    # Process the postcodes file into a dictionary keyed by entry ID.
    # Each value is [postcode, street name, inflected street name].
    street = {}
    for element in postcodes_xml.iter("Gata"):
        # print(element[0].text)
        street[element[0].text] = [element[1].text, element[2].text, element[3].text]
    # print(repr(street))

    # Go through every relation
    for element in osm_xml.iter("relation"):
        tags_arr = {}
        tags = element.iterfind('tag')
        # All tags put into a dictionary which can be referenced by key name.
        for tag in tags:
            tags_arr[tag.get('k')] = tag.get('v')

        # Check if the relation has the key götuskrá:id.
        # If it doesn't have it, skip to the next relation.
        if 'götuskrá:id' not in tags_arr:
            continue

        # Look up the entry in the postcode file. If the ID is not in the file,
        # leave the relation alone instead of stopping with a KeyError.
        entry = street.get(tags_arr['götuskrá:id'])
        if entry is None:
            continue

        # Verify that the street names match.
        # If they don't, either the götuskrá:id or the street name has a typo.
        # In that case the relation should be left alone instead of populating
        # it with potentially wrong data. A relation without a name tag is
        # skipped as well.
        if 'name' in tags_arr and tags_arr['name'] == entry[1]:
            # The götuskrá:id and street name match.
            # Now we only need to check what needs to be changed and change it.

            # Check if there is a tag with addr:postcode. If not, add it.
            if 'addr:postcode' not in tags_arr:
                attrib = {'k': 'addr:postcode', 'v': entry[0]}
                etree.SubElement(element, 'tag', attrib)
            else:
                # Find the occurrence of the tag where addr:postcode is.
                ele = element.find("tag[@k='addr:postcode']")
                # Change the value to the one according to the postcode file.
                ele.set('v', entry[0])

            # We want the inflected form of the street name as an alternative name.
            # But we don't want to overwrite any other alternative names
            # so we leave it alone if one exists.
            if 'alt_name' not in tags_arr:
                attrib = {'k': 'alt_name', 'v': entry[2]}
                etree.SubElement(element, 'tag', attrib)

# Write the modified data to the new .osm file.
osm_xml.write(pwd + finished_osm_file)

# Now we have written a finished .osm file and all that's left is to make Osmosis
# compare it to the filtered file and generate an .osc file which only has the changes.
osmosis_command = osmosis_bin
if DEBUG == 0:
    osmosis_command += ' -q'
osmosis_command += ' --read-xml file="' + pwd + finished_osm_file + '"'
osmosis_command += ' --read-xml file="' + pwd + filtered_osm_file + '"'
osmosis_command += ' --dc --write-xml-change file="' + pwd + finished_osc_file + '"'

# Debug
if DEBUG == 1:
    print(osmosis_command)

os.system(osmosis_command)
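For anyone adapting the script, the per-relation rules can be tried out without Osmosis or any files on disk. The block below is a minimal sketch and not part of the script itself: the helper name `update_relation`, the sample XML and the sample postcode entry are all made up for illustration. It assumes the same layout as the script's `street` dictionary, i.e. each value is [postcode, street name, inflected street name].

```python
import xml.etree.ElementTree as etree


def update_relation(relation, street):
    """Hypothetical helper: apply the script's rules to one <relation>.

    Returns True if the element was modified, False otherwise.
    """
    tags = {t.get('k'): t for t in relation.iterfind('tag')}
    id_tag = tags.get('götuskrá:id')
    if id_tag is None:
        return False
    entry = street.get(id_tag.get('v'))
    name_tag = tags.get('name')
    # Leave the relation alone if the ID is unknown or the names differ.
    if entry is None or name_tag is None or name_tag.get('v') != entry[1]:
        return False

    postcode, _, inflected = entry
    changed = False
    if 'addr:postcode' not in tags:
        etree.SubElement(relation, 'tag', {'k': 'addr:postcode', 'v': postcode})
        changed = True
    elif tags['addr:postcode'].get('v') != postcode:
        tags['addr:postcode'].set('v', postcode)
        changed = True
    # Never overwrite an existing alt_name.
    if 'alt_name' not in tags:
        etree.SubElement(relation, 'tag', {'k': 'alt_name', 'v': inflected})
        changed = True
    return changed


# Made-up sample data for illustration only.
SAMPLE = """<osm>
  <relation id="1">
    <tag k="type" v="associatedStreet"/>
    <tag k="name" v="Laugavegur"/>
    <tag k="götuskrá:id" v="42"/>
  </relation>
</osm>"""
street = {'42': ['101', 'Laugavegur', 'Laugavegi']}

root = etree.fromstring(SAMPLE)
relation = root.find('relation')
first = update_relation(relation, street)   # adds the missing tags
second = update_relation(relation, street)  # nothing left to change
result = {t.get('k'): t.get('v') for t in relation.iterfind('tag')}
print(first, second, result)
```

Running the helper twice on the same element is a quick way to check that the rules are idempotent, which matters if the script is run again on data that has already been uploaded.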
_______________________________________________ talk mailing list [email protected] http://lists.openstreetmap.org/listinfo/talk

