[OSM-dev] Moving to stricter multipolygon parsing, again
Hi Paul Norman,

Not so long ago you triggered a discussion with your post "Moving to stricter multipolygon parsing". The discussion lasted two or three days and was intense and rather divergent (demonstrating that the multipolygon issue is still not only a beginners' problem). Since then I have not seen or heard anything related to the issue on this list. So my question is: what happened?

As I understand it, your suggestion is to allow tags only at the level of the multipolygon relation (MPR), and these tags should be non-conflicting (the wiki documentation shows that this was the original intention with MPRs). The suggestion could have a large positive impact on the quality of area-class data. I therefore fully support it, but I am not yet sure about the implementation. To shorten the discussion, I will mention just a few arguments behind my dilemma.

1. Your implementation is based on the assumption that mappers will check their edits in a map that uses osm2pgsql as a parser. What if the conversion of MPRs to geometries does not use osm2pgsql? In that case mappers will probably still see their edits no matter where they put the tags. So the restrictions should come much earlier, probably in the editors.

2. At the same time, implementing the suggested restrictions in editors would contradict the fundamental OSM documents. The wiki sections defining and illustrating relations and MPRs not only allow but even suggest putting tags on the members (even on border segments and on holes). So, in my opinion, the restrictions should first be introduced in the OSM wiki documentation by refining and correcting the related sections.

3. Finally, the assumed do-ocracy (someone, at some point in the future, will detect and correct the error) does not work very well. There are many reasons for that; let me mention two. First, there is a huge number of errors (significant and systematic ones, not counting those that are POI-related or semantic in nature), so it is perhaps illusory to assume that do-ocracy can cope with so many of them. Second, many of these errors are never visible in the raster maps that mappers mostly use, so do-ocracy will probably not even detect them. But the errors are there, and in layered vector mapping they will probably be visible immediately. Just take the large number of river sections tagged as lakes (or the reverse), the replicated or almost replicated areas/MPRs with different structures, or the thousands of closed river lines (waterway=river).

Now, if these dilemmas are not only mine, I would suggest an alternative implementation model:

1. A voluntary OSM expert team should go through the wiki documentation, refine the MPR-related notions and introduce the restrictions there.

2. The editors used by mappers should implement the restrictions accordingly.

3. The expert team should use programs to detect the MPR-related systematic (as opposed to random) errors and correct them programmatically in the source data.

Thanks for your attention,
Sandor

___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev
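As an illustration of the kind of programmatic detection suggested in the message above, the closed river line case (waterway=river on a way whose first and last node coincide) could be checked roughly as follows. This is only a minimal sketch: the in-memory `ways` structure and the function name `closed_river_ways` are assumptions made for the example, and reading real OSM data is left out.

```python
# Hypothetical in-memory representation: way id -> (list of node ids, tag dict).
# Real data would come from an OSM file reader, which is not shown here.

def closed_river_ways(ways):
    """Return sorted ids of ways tagged waterway=river whose first and last node match."""
    flagged = []
    for way_id, (node_refs, tags) in ways.items():
        is_closed = len(node_refs) > 2 and node_refs[0] == node_refs[-1]
        if is_closed and tags.get("waterway") == "river":
            flagged.append(way_id)
    return sorted(flagged)


ways = {
    101: ([1, 2, 3, 1], {"waterway": "river"}),  # closed river line: suspicious
    102: ([4, 5, 6], {"waterway": "river"}),     # ordinary open river line
    103: ([7, 8, 9, 7], {"natural": "water"}),   # closed, but not a river line
}
print(closed_river_ways(ways))  # [101]
```

A check like this only produces candidates; whether each flagged way is really an error would still need review.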
Re: [OSM-dev] Moving to stricter multipolygon parsing
On Fri, 13 Jun 2014, Peter Wendorff wrote:

> IMHO, and this is what bothers me most about the old interpretation of multipolygons, any tag that belongs to a closed way should be valid for that closed way. We don't inherit names from streets to bus route relations, so why should we inherit names from polygons to multipolygons?
>
> Therefore my interpretation is, and IMHO it's the most intuitive one, that the multipolygon relation describes its own geometry by referring to other objects' (!) geometries, where these other objects may or may not be features in their own right.
>
> That means:
>
> Tags used on an outer closed way of a multipolygon relation should hold for the area enclosed by that way.
>
> Tags used on an outer non-closed way of a multipolygon relation should hold for that way, which is not closed and doesn't become any more closed because other ways are related to it through a multipolygon.
>
> For inner member ways it's basically the same.

I would agree with this. I recently mapped an Open Space and put the name= tag on the outer way of its MP. The MP itself has one outer and two inner ways to form landuse=forest, but both inner ways are also landuse=grass. I've always thought that this is the right way to do things.

cheers,
Derick
Re: [OSM-dev] Moving to stricter multipolygon parsing
Komяpa <me at komzpa.net> writes:

> +1, spent a lot of time debugging issues when a tag from outer leaks into multipolygon itself.

How would you handle tags such as created_by that are automatically removed by editors when ways are changed?

--
Andrew
Re: [OSM-dev] Moving to stricter multipolygon parsing
Paul Norman <penorman at mac.com> writes:

> 251k of these have entirely consistent tags on outers,

How many of them have only one outer member?

--
Andrew
Re: [OSM-dev] Moving to stricter multipolygon parsing
+1 for consistency. MPs would be easier to learn from examples if a single method 'works'.

Yves

On 13 June 2014 01:25:42 UTC+02:00, Paul Norman <penor...@mac.com> wrote:

> I think we need to move to a more strict parsing of MPs, accepting only new-style MPs and old-style MPs where all outers have identical non-deleted[2] tags and the relation itself has no non-deleted tags.
Re: [OSM-dev] Moving to stricter multipolygon parsing
Of course, what is really needed is an area primitive type that incorporates the generic multipolygon structure. Then editing tools would always generate the correct tagging. Relations would then be left to describe associations between objects and not geometries as well.
Re: [OSM-dev] Moving to stricter multipolygon parsing
+10

If we could enforce strict usage in multipolygon relations, this might as well be a step toward a future area datatype: it would tighten the definition of how areas are currently defined, starting with a less ambiguous definition for the subset of areas described by multipolygon relations.

regards
Peter

On 13.06.2014 01:25, Paul Norman wrote:

> I think we need to move to a more strict parsing of MPs, accepting only new-style MPs and old-style MPs where all outers have identical non-deleted[2] tags and the relation itself has no non-deleted tags.
Re: [OSM-dev] Moving to stricter multipolygon parsing
+1, spent a lot of time debugging issues when a tag from an outer leaks into the multipolygon itself.

Also, I'd prefer to compare not the non-deleted tags but the whole set of tags, as I'm currently using a stylesheet with a large deletion list. This would make geometry interpretation stylesheet-independent.

2014-06-13 2:25 GMT+03:00 Paul Norman <penor...@mac.com>:

> I think we need to move to a more strict parsing of MPs, accepting only new-style MPs and old-style MPs where all outers have identical non-deleted[2] tags and the relation itself has no non-deleted tags.

--
Darafei Komяpa Praliaskouski
OSM BY Team - http://openstreetmap.by/
xmpp:m...@komzpa.net mailto:m...@komzpa.net
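The stylesheet dependence raised in the message above can be illustrated with a small sketch. It is not osm2pgsql code: the function name `outers_consistent` and the sample tags are assumptions made for the example. The point is only that the same outer ways can count as consistent or inconsistent depending on which keys the deletion list removes before comparison.

```python
def outers_consistent(outer_way_tags, deleted_keys=frozenset()):
    """True if all outer ways carry the same tags once deleted_keys are ignored."""
    remaining = {
        frozenset((k, v) for k, v in tags.items() if k not in deleted_keys)
        for tags in outer_way_tags
    }
    return len(remaining) <= 1


outers = [
    {"natural": "water", "source": "survey"},
    {"natural": "water", "source": "imagery"},
]

# Comparing the whole tag set: the two outers differ in their source tag.
print(outers_consistent(outers))                            # False
# Comparing only non-deleted tags, with source on the deletion list.
print(outers_consistent(outers, deleted_keys={"source"}))   # True
```

Comparing whole tag sets gives one answer for every stylesheet, at the price of rejecting relations whose outers differ only in tags such as source.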
Re: [OSM-dev] Moving to stricter multipolygon parsing
2014-06-13 1:25 GMT+02:00 Paul Norman <penor...@mac.com>:

> I think we need to move to a more strict parsing of MPs, accepting only new-style MPs and old-style MPs where all outers have identical non-deleted[2] tags and the relation itself has no non-deleted tags.

+1

There is really only one use case where I abuse the fuzziness of the old style: urban squares. While you often can't walk on all of their surface (e.g. there might be a fountain, a sculpture, buildings, greenery, etc. to exclude from highway=pedestrian), the name will usually apply to all of it. Adding only a name also doesn't solve it, because then it is not clear which kind of name it is (typology). This isn't really solved with old-style MPs either, of course, but at least it is less obvious and might be interpreted correctly by a human ;-)
Re: [OSM-dev] Moving to stricter multipolygon parsing
Paul Norman wrote on 13.06.2014 01:25:

> To support this, I looked for some numbers. Using a shortened deleted tags list, there are 1 million new-style and 261k old-style MPs. Of the old-style, 256k have a member with role outer. 251k of these have entirely consistent tags on outers, while 2.3k have two sets of tags among the ways. About 180 have three or more.[3] An old-style MP without entirely consistent tags on outers is ambiguous and in error.

To support this change it would be nice to set up a list on the web with the buggy relations. A few tens of thousands of broken Wikipedia tags were corrected with such a list.

Perhaps the change could be enforced later, after a lot of MPs have been corrected.

--
regards
Holger
Re: [OSM-dev] Moving to stricter multipolygon parsing
Thing is, it may be easier to find a consensus on -dev than elsewhere. So a fixing list would indeed be a good thing.

Yves

On 13 June 2014 16:20:09 UTC+02:00, Holger Jeromin <mailgm...@katur.de> wrote:

> To support this change it would be nice to setup a list on the web with the buggy relations. A few ten thousands broken Wikipedia tags were corrected with such a list.
>
> Perhaps the change could be enforced later, after a lot of MPs are corrected.
Re: [OSM-dev] Moving to stricter multipolygon parsing
Holger Jeromin wrote:

> To support this change it would be nice to setup a list on the web with the buggy relations.

(Apologies for asking what might be the bleeding obvious, but) do any of the existing QA tools flag multipolygon outers with conflicting tags?

Alternatively, could osm2pgsql write a list of conflicts as it encounters them (or does it already)?

Cheers,
Andy
Re: [OSM-dev] Moving to stricter multipolygon parsing
On 2014-06-13 8:17 AM, Serge Wroclawski wrote:

> Paul,
>
> I don't have anything technical to add but I have a suggestion or two:
>
> 1. If this is an area where the old multipolygons could be converted entirely to the new style, do you propose an automated edit to OSM?

No. There are about 0.25 million of them.

> 2. If not, are there instructions we could give to OSM editors? If so, then perhaps this would be a good Advanced MapRoulette challenge. Most of the MR challenges are for beginners, but this might be a nice one for our more advanced user community.

For the incorrect ones, yes. I've built a list of the ones with 3+ distinct sets of tags:

http://paulnorman.ca/files/3_plus.txt

I'll fix the formatting in a bit, but there are 222 relations there. The other column is the number of distinct tag sets.
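A list like the one described above (relation id plus number of distinct tag sets among its outer ways) could be produced along these lines. This is a hedged sketch, not the script that generated 3_plus.txt: the `relations` input structure, the relation ids and the function names are all invented for the example.

```python
def count_distinct_tag_sets(outer_way_tags):
    """Number of different tag sets found among a relation's outer ways."""
    return len({frozenset(tags.items()) for tags in outer_way_tags})


def worklist(relations, minimum=3):
    """Return sorted (relation_id, distinct_count) rows with at least `minimum` tag sets."""
    rows = [
        (rel_id, count_distinct_tag_sets(outers))
        for rel_id, outers in relations.items()
    ]
    return sorted(row for row in rows if row[1] >= minimum)


# Hypothetical input: relation id -> tag dicts of its outer ways.
relations = {
    1001: [{"natural": "water"}, {"natural": "water"}],
    1002: [{"landuse": "forest"}, {"landuse": "grass"}, {"natural": "wood"}],
    1003: [{"landuse": "forest"}, {"natural": "water"}],
}

for rel_id, distinct in worklist(relations):
    print(rel_id, distinct)  # prints: 1002 3
```

Lowering `minimum` to 2 would also list the relations with exactly two tag sets, which the original post counts separately.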
Re: [OSM-dev] Moving to stricter multipolygon parsing
IMHO, and this is what bothers me most about the old interpretation of multipolygons, any tag that belongs to a closed way should be valid for that closed way. We don't inherit names from streets to bus route relations, so why should we inherit names from polygons to multipolygons?

Therefore my interpretation is, and IMHO it's the most intuitive one, that the multipolygon relation describes its own geometry by referring to other objects' (!) geometries, where these other objects may or may not be features in their own right.

That means:

Tags used on an outer closed way of a multipolygon relation should hold for the area enclosed by that way.

Tags used on an outer non-closed way of a multipolygon relation should hold for that way, which is not closed and doesn't become any more closed because other ways are related to it through a multipolygon.

For inner member ways it's basically the same.

This should IMHO not depend on any kind of tags (besides area=yes for closed ways where the closedness is not clearly implied by other tags).

regards
Peter

On 13.06.2014 16:10, Martin Koppenhoefer wrote:

> 2014-06-13 1:25 GMT+02:00 Paul Norman <penor...@mac.com>:
>
> > I think we need to move to a more strict parsing of MPs, accepting only new-style MPs and old-style MPs where all outers have identical non-deleted[2] tags and the relation itself has no non-deleted tags.
>
> +1
>
> There is really only one usecase where I abuse the fuzziness of the old style: urban squares. While you often can't walk on all of their surface (e.g. there might be a fountain, a sculpture, buildings, green, etc. to exclude from highway pedestrian) the name will usually be for all of it. Adding only a name also doesn't solve it, because then it is not clear which kind of name it is (typology). This isn't really solved with old style MPs neither, of course, but at least this is less obvious and might be interpreted correctly by a human ;-)
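The interpretation described in the message above can be written down as a tiny rule. The sketch below is an assumption-laden illustration, not an agreed specification: the function names are invented, a way is treated as closed when it has at least four node references and its first and last node coincide, and the area=yes caveat mentioned in the message is deliberately not modelled.

```python
def is_closed(node_refs):
    """A way is treated as closed if it forms a ring: first node equals last node."""
    return len(node_refs) >= 4 and node_refs[0] == node_refs[-1]


def tag_scope(node_refs, tags):
    """Say what a member way's own tags describe under this interpretation.

    Returns "area" for a tagged closed way (the area it encloses),
    "line" for a tagged non-closed way (the way itself),
    and None for an untagged way, which only contributes geometry.
    """
    if not tags:
        return None
    return "area" if is_closed(node_refs) else "line"


print(tag_scope([1, 2, 3, 1], {"name": "Some Square"}))  # area
print(tag_scope([1, 2, 3], {"barrier": "fence"}))        # line
print(tag_scope([1, 2, 3, 1], {}))                       # None
```

Under this rule the multipolygon relation never inherits anything from its members; its own tags describe the combined geometry.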
[OSM-dev] Moving to stricter multipolygon parsing
Osm2pgsql currently tries *very* hard to turn multipolygon relations into geometries. It currently detects two types of MP relations, new-style and old-style. A new-style MP has tags on the relation, while an old-style MP only has type=multipolygon on the relation and relies on the ways for the tags. It then tries to deal with odd tagging in various ways.

MP handling is one of the biggest sources of osm2pgsql bug reports[1] and a big time-sink. One of the bigger issues is moving tags from ways to MPs that are falsely detected as old-style. This is an attempt to interpret flawed tagging.

I think we need to move to a more strict parsing of MPs, accepting only new-style MPs and old-style MPs where all outers have identical non-deleted[2] tags and the relation itself has no non-deleted tags.

Osm2pgsql is not just a consumer of data; it is one of the main feedback tools and is strongly integrated into the feedback cycle. If osm2pgsql doesn't process a multipolygon, a mapper will likely correct the tagging. Parsing strictly will therefore make things easier for those interpreting raw OSM data.

To support this, I looked for some numbers. Using a shortened deleted tags list, there are 1 million new-style and 261k old-style MPs. Of the old-style, 256k have a member with role outer. 251k of these have entirely consistent tags on outers, while 2.3k have two sets of tags among the ways. About 180 have three or more.[3]

An old-style MP without entirely consistent tags on outers is ambiguous and in error.

[1]: https://github.com/openstreetmap/osm2pgsql/search?q=multipolygon&type=Issues
[2]: A deleted tag is one, such as source, that osm2pgsql is dropping
[3]: https://gist.github.com/pnorman/ebd41f5a1759916a48b5
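The proposed acceptance rule can be sketched in a few lines. This is not osm2pgsql's implementation, only an illustration under stated assumptions: the deletion list `DELETED_KEYS` is a made-up short example, the type tag on the relation is not counted as a "real" tag, an old-style MP whose outers carry no tags at all is rejected, and the function names are hypothetical.

```python
# Assumed, shortened deletion list; the real list depends on the style in use.
DELETED_KEYS = {"source", "created_by"}


def kept_tags(tags):
    """Drop tags whose keys are on the assumed deletion list."""
    return {k: v for k, v in tags.items() if k not in DELETED_KEYS}


def classify(relation_tags, outer_way_tags):
    """Return (status, tags) for a multipolygon under the strict rule.

    status is "new-style", "old-style" or "rejected".
    """
    rel = kept_tags(relation_tags)
    rel.pop("type", None)  # assumption: type=multipolygon does not count as a tag
    if rel:
        return ("new-style", rel)

    outer_sets = [kept_tags(tags) for tags in outer_way_tags]
    if not outer_sets:
        return ("rejected", None)  # no member with role outer
    first = outer_sets[0]
    if first and all(tags == first for tags in outer_sets):
        return ("old-style", first)
    return ("rejected", None)  # untagged or inconsistent outers: ambiguous


print(classify({"type": "multipolygon", "landuse": "forest"}, [{}]))
# ('new-style', {'landuse': 'forest'})
print(classify({"type": "multipolygon"},
               [{"natural": "water", "source": "survey"}, {"natural": "water"}]))
# ('old-style', {'natural': 'water'})
print(classify({"type": "multipolygon"},
               [{"natural": "water"}, {"landuse": "forest"}]))
# ('rejected', None)
```

The second call shows why the deletion list matters: the outers differ only in source, which is dropped before comparison, so the relation still counts as a consistent old-style MP.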