On Tue, Mar 20, 2012 at 2:38 PM, [email protected] <[email protected]> wrote:
> Daniel,
>
> In parallel with the excellent work you are doing for these imports, I
> suggest that we pursue the discussion on other aspects, since the discussion
> was going in all directions in 2009-2010. We must reflect on
> how to import this data properly.

I think we should start by changing the way we think about it.

"Imports" break too often. We've seen imports fail in many different ways: stomping on user data, duplicate imports, disconnected imports, nodes without ways, duplicate nodes, poor conversion from the original format, low-quality source data, import script user error. The list is, I fear, endless.

I know that we've made improvements. The code keeps getting better. But it is still hard to "do it right". And we have a more difficult situation now than we did in 2009-2010: there is existing OSM data in more places than there was back then. Almost everything now will have to be deconflicted with existing data.

I'd like to suggest that we stop "importing" and start "referring to external sources". Here's the key difference: when you refer to external sources, you map one object at a time, so you can give each object the attention it needs. Imports drop large numbers of objects into the DB at one time, and each object in that group gets only a small share of your attention.

I recently added municipal boundaries from Waterloo Region to OSM and wrote up the experience.

http://opendataexpert.com/2012/using-waterloo-region-open-data/

It was a relatively long process, even though there were no existing boundaries for the region. Each boundary had to be reconciled with those of the surrounding regions.

It's great to have external sources. We have several of them to consider, including aerial imagery, NRCan data, perhaps city data and more. We also have our local knowledge, survey track files, notes and photographs. And none of them agree 100% with each other. :-)

All of our sources are lying to us, and we have to make an educated judgment about what the best answer might be, given conflicting sources. It makes no sense to discard our good judgment merely to accurately duplicate the errors of a single source. :-)

So let's stop "importing" and start using external sources in smarter ways.

_______________________________________________
Talk-ca mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/talk-ca

