Hello everyone, Original Poster here. Thanks for your remarks. A lot of people active in the Belgian community are following this discussion with interest and some bewilderment. Like the first message, this is a reply that has been written by several members together. We did have a hard time filtering the actual discussion about our import from the thread. We would kindly invite everyone to take the discussion of other imports and the more philosophical points to a more appropriate place (e.g. the talk-list) and stick to the topic of the Flemish buildings import.
We have been working on this for two years and don't mind working on it for a few more months before running the actual integration. We are, in the first place, looking for advice and guidance so as to further improve our methods, together with a go-ahead once all issues are resolved.

In case this was not clear before: this is _not_ a flat import where all the data is dumped into OSM automatically. This is an effort of the Belgian community, where mappers select a subset of the data they choose to integrate piecemeal, usually one street at a time. The data is loaded in JOSM and checked against the aerial imagery, with the mapper applying common sense. As an example, this is a screenshot of the tool in action: https://matrix.org/_matrix/media/v1/download/matrix.org/WmemnfTQZOSfoKoIlByFBHmv . You can see that the tool reuses tags from OSM and offers the mapper the choice of which geometry should be used (by not saving the import). In combination with aerial imagery, this should cause few to no problems with regard to the original OSM data.

Mateusz and Frederik, your points regarding documentation are being addressed as we speak.

# External IDs

The most discussed point seems to be the IDs we wanted to include. We believe the IDs will significantly ease updates to our buildings when the GRB is updated, and make the whole process more robust. They provide a way to update and cross-reference the data now and in the future. Precisely because we keep these IDs, updating data down the road will be smoother, and later changes by OSM contributors will not be overwritten. As an added benefit, keeping IDs makes the import tool-agnostic. It will also make it much easier to flag errors in the source data.

We aren't afraid that people will refrain from editing source-tagged objects: new users (using the iD editor) will probably not notice those tags in the first place, advanced users will know about them, and intermediate users will probably look them up.
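To make the cross-referencing idea concrete, here is a minimal sketch in Python. It is not the code of the actual tool. It assumes that the ID of the GRB record is stored in a tag on the OSM building and that a record which has been replaced in the GRB no longer appears under its old ID. The tag key `source:geometry:uidn`, the function `review_status` and the set `current_uidns` are names made up for this illustration.

```python
from typing import AbstractSet, Mapping

# Hypothetical tag key holding the GRB record ID (an assumption of this sketch).
UIDN_TAG = "source:geometry:uidn"


def review_status(osm_tags: Mapping[str, str], current_uidns: AbstractSet[str]) -> str:
    """Classify a single OSM building against the IDs in a current GRB extract.

    osm_tags      -- the tags of one OSM building
    current_uidns -- the record IDs present in the current GRB extract
    """
    uidn = osm_tags.get(UIDN_TAG)
    if uidn is None:
        # The building carries no external reference, so there is nothing to compare.
        return "unlinked"
    if uidn in current_uidns:
        # The referenced record still exists under the same ID.
        return "current"
    # The ID is unknown in the extract: the record was replaced, or the tag was edited.
    # Either way, a mapper should take a look.
    return "needs-review"
```

For example, `review_status({"building": "house"}, {"1"})` gives `"unlinked"`, while a building tagged with an ID that is missing from the extract gives `"needs-review"`. The check looks at one building at a time and only raises a flag; the decision about what to do stays with the mapper.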
Merging and splitting are rather rare operations; once the geometry is accurate, they should become unnecessary. Finally, as the data set is available for free under an open license, anyone can verify the data independently (though not on the ground, of course).

We will address the worries about the external IDs on a point-by-point basis below:

From *Frederik Ramm:*
- *If you delete a building that has such an ID, how will you ensure it isn't brought in again through a later "update import"? Etc.*

There will not be "an update import". The tool is built for continuous use. It will improve the geometry of existing buildings and already looks at the source tags. The tool is built for heavy mappers who will be monitored by our closely knit community. Importing and updating will be done on a street-by-street or block-by-block basis. We believe we can trust these mappers to analyze situations like this on a case-by-case basis. In this case, whether or not the deleted building had an ID does not matter much.

The following points, by Frederik Ramm, Christoph Hormann and Mateusz Konieczny respectively, are quite similar. *We'll answer them together.*
- *The idea of having an "audit trail" for every single geometry by way of an Id for that individual geometry is interesting, but I think that it is totally sufficient if a changeset carries the information that this changeset has been imported from XYZ data source at time stamp T; everything else can be researched down the line if the need should arise.*
- *For this purpose it is completely unnecessary to bother the OSM community with external IDs.
If you want to check if the data has been unchanged since you added it then do exactly that - check if there are any newer versions of the objects that have originally been added in the import.*
- *What is the point of adding tags like source:geometry:entity given that like any other tags they may be edited once added to OSM?*

It is not, and has never been, the intention to fully automate this. As we are trying to make clear, this info is needed and used all the time, and not at some vague "maybe we do an update sometime". We are not using the IDs to make a direct (full database) comparison to see, on a gigantic scale, what exists in one and not in the other. We are not looking for an 'auto update OSM with all new buildings added to GRB'. We use the IDs on a SINGLE BUILDING comparison basis only, to see what has changed (geometry touch-ups, or entirely replacing a building).

"THEY WON'T BE STABLE ANYWAY"
If the external references are edited, the tool will flag the building as needing an update. The only thing a (deliberate or accidental) unneeded change of the ID would do is alert the tool's users that something doesn't add up. Either the building has been replaced in reality (and thus has a new UIDN), in which case you can improve the geometry of the new building; or, if no apparent change to the physical building can be traced, the UIDN can be restored. The number of 'false positives' due to unintended edits of the IDs is expected not to come remotely close to the number of useful flags.

Without the tags, it's hard to tell which buildings have been imported: you would need complicated spatial heuristics, because we don't blindly copy buildings; we improve them through other sources. Once a building is mapped, its geometry would change more often than its tags, so relying on geometry would give us a lot more false positives.

"WHY NOT JUST CREATE A DATABASE OF LINKS EXTERNALLY?"
In theory it would be possible to have the tool keep a register of which OSM ID maps to which ID in the GRB, or to evaluate changesets to get similar info. Some issues with this approach:
- Because we aren't doing an automatic import but are manually adding the buildings through JOSM, this would require us to copy the OSM ID of each building manually, or rely on heuristics.
- We don't want to centralize the link between OSM and GRB. There isn't one single person doing the import; it's something several people in the community work on. Anyone could host the tool if its current maintainer disappears.
- Pushing the analysis of what exactly has happened towards a changeset is an unnecessary burden on a mapper. The changeset will not just contain "here be new buildings", but also "in this case, we just used part of the geometry of the GRB building, but we left part of the building geometry intact because that was better in OSM". After having analysed the changeset, the mapper would still have to look up in an import database what exactly the relationship between the OSM and GRB objects was at the time of import.
- An external database providing the link between OSM and GRB objects would be outdated after a day and almost impossible to update with changes on the OSM side.

"WHY NOT JUST COMPARE GEOMETRIES?"
Doing a geometrical analysis to find differences would be impractical, not just because it would be computationally heavy, but also because it would lead to too many false positives. These would come from tiny changes, for example, but also from the fact that we do not blindly use source geometries, since we first address overnoding.

I hope that all confusion is cleared up now and that we can move this issue forward.

With Regards,
The Belgian Mappers,
Pieter Vander Vennet
_______________________________________________ Imports mailing list [email protected] https://lists.openstreetmap.org/listinfo/imports
