(CC-ing to OSM-dev in case anyone else finds this interesting)

Background: I've been looking for some sort of SAX-like XML engine for
making large-scale changes to OSM data, so that I can do mass edits
without writing all the tedious parsing/upload/diff logic myself.

change_tags.py looks perfect for this, with a few caveats. I have
patches or suggestions for some of them:

First, it doesn't handle relations. The attached patch adds support
for them, and it seems to work, except:

* The Nodes/Ways/Relations count doesn't seem to work.

This appears to be due to some race condition in the XML parser that I
can't find. If I change the counting code around line 300 to this:

        if name in ['node', 'way', 'relation', 'tag']:
            if name == 'relation':
                print "Read relation"
            self.read[name] += 1

and run this command, I get the following:

$ PYTHONPATH=. python \
    ~/src/osm/applications/utils/change_tags/change_tags.py --dry-run \
    --verbose --check-api --file Iceland.osm --module iceland --function test
Nodes: 259621/0, Ways: 5288/0, Relations: 0/0, 0 complete
Read relation
Read relation
Read relation
[...]

You can see the relations are being read /after/ the count has been
made, but I don't know why.
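
One possible reading of the output above, offered as an unverified
assumption rather than a diagnosis: the "Nodes: .../Relations: 0/0" line
may be a periodic progress snapshot printed while parsing is still under
way. It shows 5288 ways while the final total is 13344, and if the file
lists nodes, then ways, then relations, a snapshot taken at that point
would naturally show zero relations. A minimal Python 3 sketch of that
effect, using a made-up sample document and a hypothetical snapshot hook
standing in for progress():

```python
import io
import xml.sax

# Made-up stand-in for an .osm file; relations are assumed to come last.
SAMPLE = b"""<osm version="0.5">
  <node id="1" lat="64.1" lon="-21.9"/>
  <node id="2" lat="64.2" lon="-21.8"/>
  <way id="10"><nd ref="1"/><nd ref="2"/></way>
  <relation id="100"><member type="way" ref="10" role=""/></relation>
</osm>"""


class CountingHandler(xml.sax.ContentHandler):
    """Counts elements in startElement, like the patched counting code."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.read = {'node': 0, 'way': 0, 'relation': 0, 'tag': 0}
        self.snapshots = []

    def startElement(self, name, attrs):
        if name == 'relation' and not self.snapshots:
            # Hypothetical stand-in for a progress report that happens to
            # fire just before the first relation is counted.
            self.snapshots.append(dict(self.read))
        if name in self.read:
            self.read[name] += 1


def count_elements(data):
    """Parse the bytes in `data` and return the handler with its counts."""
    handler = CountingHandler()
    xml.sax.parse(io.BytesIO(data), handler)
    return handler


handler = count_elements(SAMPLE)
print("mid-parse snapshot:", handler.snapshots[0])
print("final counts:      ", handler.read)
```

If this is what is happening, the counts themselves are fine and only
the timing of the report is misleading.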

* I haven't tested uploading

I just hacked it up. I tried writing a test script that changes a
relation (attached as iceland.py), and it seems to work. The output
makes sense; here I'm changing name="Vatnsnesvegur" to name="Fleh":

$ PYTHONPATH=. python \
    ~/src/osm/applications/utils/change_tags/change_tags.py --dry-run \
    --verbose --check-api --file Iceland.osm --module iceland --function test
Nodes: 259621/0, Ways: 5276/0, Relations: 0/0, 0 complete
URL:  http://api.openstreetmap.org/api/0.5/relation/106216
XML:
<osm version="0.5">
  <relation id="106216">
    <member ref="23699605" role="" type="way" />
    <member ref="24215960" role="" type="way" />
    <tag k="created_by" v="Potlatch 0.10f" />
    <tag k="name" v="Fleh" />
    <tag k="network" v="T" />
    <tag k="ref" v="711" />
    <tag k="route" v="road" />
    <tag k="type" v="route" />
  </relation>
</osm>
Total Read: 259621 nodes, 13344 ways, 259 relations, 41054 tags
Total Changed: 0 nodes, 0 ways, 1 relations
Previously Changed: 0 nodes, 0 ways, 0 relations

But I haven't uploaded any data with the altered tool.
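
For anyone who wants to check the relation serialization without running
the whole tool, here is a standalone Python 3 sketch of the same idea as
the makeXML change in the patch. The function name relation_to_xml and
the example dictionary are made up for illustration; only the dictionary
layout (id, members, tags) follows the patch.

```python
from xml.etree.ElementTree import Element, SubElement, fromstring, tostring


def relation_to_xml(current):
    """Serialize a relation dict to an API 0.5 style <osm> document."""
    osm = Element('osm', {'version': '0.5'})
    parent = SubElement(osm, 'relation', {'id': current['id']})
    for m in current['members']:
        SubElement(parent, 'member',
                   {'type': m['type'], 'ref': m['ref'], 'role': m['role']})
    # Emit tags in sorted key order so the output is stable between runs.
    for key in sorted(current['tags']):
        SubElement(parent, 'tag', {'k': key, 'v': current['tags'][key]})
    return tostring(osm, encoding='unicode')


# Illustrative data only, loosely based on the dry-run output above.
example = {
    'type': 'relation',
    'id': '106216',
    'members': [{'type': 'way', 'ref': '23699605', 'role': ''}],
    'tags': {'type': 'route', 'name': 'Fleh'},
}

xml_text = relation_to_xml(example)
print(xml_text)

# Round-trip the result to confirm it is well-formed.
root = fromstring(xml_text)
tag_keys = [t.get('k') for t in root.iter('tag')]
```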

* The interface is too limited

The script does all the work of figuring out the nodes in a given way
(and, with my patch, the members of a relation), and stores these and
various other things as member variables in the ChangeTags class, but
the user doesn't get access to any of it. In my case I wanted to get at
the members of relations. I can do this by changing this:

    run = self.converter(self.current['tags'], self.current['type'])

to this:

    run = self.converter(self.current['tags'], self.current['type'],
                         self.current)

or even this:

    run = self.converter(self.current)

I can then do this in my script:

def test(current):
    if current['type'] == 'relation':
        current['tags']['name'] = 'Fleh'
        current['members'] = [{'type': 'way', 'ref': '1234', 'role': 'wohoo'}]
        return True
    return False

And get this as a result:

URL:  http://api.openstreetmap.org/api/0.5/relation/106216
XML:
<osm version="0.5">
  <relation id="106216">
    <member ref="1234" role="wohoo" type="way" />
    <tag k="created_by" v="Potlatch 0.10f" />
    <tag k="name" v="Fleh" />
    <tag k="network" v="T" />
    <tag k="ref" v="711" />
    <tag k="route" v="road" />
    <tag k="type" v="route" />
  </relation>
</osm>

I don't care what the interface looks like or what it's called; I just
want to be able to access that data. What do *you* think it should look
like? :)
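
One way the interface change could stay compatible with existing
rulesets is to dispatch on how many arguments the converter declares.
This is only a sketch under assumptions: call_converter is a hypothetical
helper, not something in change_tags.py, and it is written for Python 3
using inspect.signature.

```python
import inspect


def call_converter(converter, current):
    """Call converter with (tags, type) or with the whole `current` dict.

    Dispatches on the number of parameters the converter declares, so
    existing two-argument rulesets keep working unchanged.
    """
    params = inspect.signature(converter).parameters
    if len(params) == 1:
        return converter(current)
    return converter(current['tags'], current['type'])


def old_style(tags, type):
    # Existing interface: only tags and the object type are visible.
    if type == 'relation' and tags.get('name') == 'Vatnsnesvegur':
        tags['name'] = 'Fleh'
        return True
    return False


def new_style(current):
    # Proposed interface: the full object, including relation members.
    if current['type'] == 'relation':
        current['members'] = [{'type': 'way', 'ref': '1234', 'role': 'wohoo'}]
        return True
    return False


def make_current():
    """Build a fresh illustrative relation dict for each call."""
    return {'type': 'relation', 'id': '106216',
            'members': [{'type': 'way', 'ref': '23699605', 'role': ''}],
            'tags': {'name': 'Vatnsnesvegur', 'type': 'route'}}


a = make_current()
b = make_current()
print(call_converter(old_style, a), a['tags']['name'])
print(call_converter(new_style, b), b['members'])
```

Since the script itself is Python 2, the same check there would have to
use inspect.getargspec instead of inspect.signature.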
Index: change_tags.py
===================================================================
--- change_tags.py	(revision 14375)
+++ change_tags.py	(working copy)
@@ -145,10 +145,10 @@
 
         self.last_report_time = time.time() 
         self.last_report_object = 0
-        self.changes = {'node': 0, 'way': 0} 
-        self.already_changed = {'node': 0, 'way': 0} 
-        self.read = {'node': 0, 'way': 0, 'tag': 0} 
-        self.total_read = {'node': obj_counts.get('node', 0), 'way': obj_counts.get('way', 0)}
+        self.changes = {'node': 0, 'way': 0, 'relation': 0} 
+        self.already_changed = {'node': 0, 'way': 0, 'relation': 0} 
+        self.read = {'node': 0, 'way': 0, 'tag': 0, 'relation': 0} 
+        self.total_read = {'node': obj_counts.get('node', 0), 'way': obj_counts.get('way', 0), 'relation': obj_counts.get('relation', 0)}
         self.errors = []
         self.skipped = []
         
@@ -182,6 +182,15 @@
                 for i in item.getElementsByTagName("nd"):
                     nodes.append(i.getAttribute("ref"))
                 self.current['nodes'] = nodes
+            elif self.current['type'] == 'relation':
+                members = []
+                for i in item.getElementsByTagName("member"):
+                    type = i.getAttribute("type")
+                    ref  = i.getAttribute("ref")
+                    role = i.getAttribute("role")
+                    members.append({'type': type, 'ref': ref, 'role': role})
+                self.current['members'] = members
+
             tags = {}
             for i in item.getElementsByTagName("tag"):
                 tags[i.getAttribute("k")] = i.getAttribute("v")
@@ -191,14 +200,14 @@
             raise Exception("Couldn't update from API server.")
 
     def progress(self):
-        upload = self.changes['node'] + self.changes['way'] + \
-            self.already_changed['node'] + self.already_changed['way']
+        upload = self.changes['node'] + self.changes['way'] + self.changes['relation'] + \
+            self.already_changed['node'] + self.already_changed['way'] + self.already_changed['relation']
         
         t = time.time() - self.last_report_time
         
         if t > 10:
            
-            obj_count = "Nodes: %s/%s, Ways: %s/%s" % (self.read['node'], self.total_read['node'], self.read['way'], self.total_read['way'])
+            obj_count = "Nodes: %s/%s, Ways: %s/%s, Relations: %s/%s" % (self.read['node'], self.total_read['node'], self.read['way'], self.total_read['way'], self.read['relation'], self.total_read['relation'])
 
             c = upload - self.last_report_object
             rate = float(c/t)
@@ -277,15 +286,31 @@
             self.current = {'type': 'way', 'id': attr['id'], 'nodes':[], 'tags': {}}
         elif name =='nd' and self.current:
             self.current['nodes'].append(attr["ref"])
+        elif name == 'relation':
+            self.current = {'type': 'relation', 'id': attr['id'], 'members':[], 'tags': {}}
         elif name == 'tag' and self.current:
             self.current['tags'][attr['k']] = attr['v']
+        elif name == 'member' and self.current:
+            self.current['members'].append({'type': attr['type'], 'ref': attr['ref'], 'role': attr['role']})
         if 'user' in attr and self.current:
             self.current['user'] = attr['user']
         
-        if name in ['node', 'way', 'tag']:
+        if name in ['node', 'way', 'relation', 'tag']:
             self.read[name] += 1
 
     def makeXML(self):
+        if self.current['type'] == 'relation':
+            osm = Element('osm', {'version': '0.5'})
+            
+            parent = SubElement(osm, 'relation', {'id': self.current['id']})
+            for m in self.current['members']:
+                SubElement(parent, 'member', {'type': m['type'], 'ref': m['ref'], 'role': m['role']})
+
+            keys = self.current['tags'].keys()
+            keys.sort()
+            for key in keys:
+                SubElement(parent, "tag", {'k': key, 'v': self.current['tags'][key]})
+        
         if self.current['type'] == "way":
             osm = Element('osm', {'version': '0.5'})
 
@@ -313,8 +338,8 @@
         indent(osm)
         return tostring(osm)
     def endElement (self, name):
-        """Switch on node type, and serialize to XML for upload or print."""
-        if name in ['way', 'node']:
+        """Switch on node, way, or relation type, and serialize to XML for upload or print."""
+        if name in ['way', 'node', 'relation']:
             new_tags = self.converter(self.current['tags'], self.current['type'])
             if new_tags:
                 self.upload()
@@ -435,10 +460,10 @@
         if hash != converter.func_code.co_code:
             print "The converter function changed, and you haven't run another dry run. Do that first."
             sys.exit(3)
-        node_changes, way_changes = map(int, changes.split("|"))
-        node_read, way_read = map(int, read.split("|"))
-        changes = node_changes + way_changes
-        read = {'node': node_read, 'way': way_read} 
+        node_changes, way_changes, relation_changes = map(int, changes.split("|"))
+        node_read, way_read, relation_read = map(int, read.split("|"))
+        changes = node_changes + way_changes + relation_changes
+        read = {'node': node_read, 'way': way_read, 'relation': relation_read} 
         if changes > 1000:
             print "You are changing more than 1000 objects. Ask crschmidt for the special password."
             pw = raw_input("Secret Phrase: ")
@@ -488,21 +513,21 @@
             print "Stopping due to Exception: \n" % E 
         failed = True    
 
-    print "Total Read: %s nodes, %s ways, %s tags"  % (
-           osmParser.read['node'], osmParser.read['way'], osmParser.read['tag'])
+    print "Total Read: %s nodes, %s ways, %s relations, %s tags"  % (
+           osmParser.read['node'], osmParser.read['way'], osmParser.read['relation'], osmParser.read['tag'])
     
-    print "Total Changed: %s nodes, %s ways"  % (
-           osmParser.changes['node'], osmParser.changes['way'])
+    print "Total Changed: %s nodes, %s ways, %s relations"  % (
+           osmParser.changes['node'], osmParser.changes['way'], osmParser.changes['relation'])
 
-    print "Previously Changed: %s nodes, %s ways"  % \
-          (osmParser.already_changed['node'], osmParser.already_changed['way'])
+    print "Previously Changed: %s nodes, %s ways, %s relations"  % \
+          (osmParser.already_changed['node'], osmParser.already_changed['way'], osmParser.already_changed['relation'])
     
     if not failed and options.dry_run and options.file:
         f = open("%s.dry_run" % options.file, "w")
-        f.write("%s\n|||\n%i|%i\n|||\n%i|%i" % \
+        f.write("%s\n|||\n%i|%i|%i\n|||\n%i|%i|%i" % \
                (converter.func_code.co_code, 
-                osmParser.changes['node'], osmParser.changes['way'], 
-                osmParser.read['node'], osmParser.read['way']
+                osmParser.changes['node'], osmParser.changes['way'], osmParser.changes['relation'],
+                osmParser.read['node'], osmParser.read['way'], osmParser.read['relation']
                 ))
         f.close()
     
# -*- coding: utf-8 -*-
__rulesetname__ = "iceland"
__version__ = "0.1"
__author__ = "Ævar Arnfjörð Bjarmason <[email protected]>"

def created_by():
    return "change_tags.py: %s %s" % (__rulesetname__, __version__)

def test(tags, type):
    """Rename the road route relation "Vatnsnesvegur" to "Fleh" as a
       test of relation support."""

    if (type == 'relation' and tags.get('type') == 'route'
            and tags.get('route') == 'road'
            and tags.get('name') == 'Vatnsnesvegur'):
        tags['name'] = 'Fleh'
        return True

    return False

# vim: ts=4 sw=4 et