On Thu, Aug 18, 2011 at 6:23 AM, Jaak Laineste <[email protected]> wrote:
> Hello,
>
> Based on my own long-time thinking and small talk in WhereCamp Berlin
> I created request for comments on kind of different approach to
> imports called meta-mapping.
Since this proposal is nearly (exactly) identical to a thought I had about a year ago, I feel pretty qualified to speak about it.

The objective of a tool like this would be to allow someone to run a database of geographic data and isolate it from other datasets - that is, by keeping the databases separate, one gains more flexibility in changing the data in the non-OSM datasets. For example, if a city government's dataset added or removed listings of libraries, it would be easier to keep that information in its own isolated database linked to OSM than to conflate it into OSM directly. Simple, right?

Sadly, the solution has flaws when the rubber meets the road.

1. By moving objects out of the OSM database, you move the complexity out of the OSM database and into the conflation database.

Moving the problem doesn't solve it. It just hides it (and you'll see why in the next few points).

2. This approach implies that external datasets are correct.

Underlying this approach is an assumption that we can rely on other datasets' accuracy. Sadly, this is not the case. As I work with more datasets and compare them to on-the-ground surveying, I find that many government datasets are either wrong or out of date. Take TIGER as an example. I'm going through TIGER 2010 as we speak. Most of what I've found indicates that when OSM is active in an area, our maps are more accurate than TIGER, even TIGER 2010, which is more accurate than TIGER 2005 (what was imported in the US). We therefore need to encourage more mappers to map, not to rely on these external datasets. This project would do the opposite.

3. Data in the aggregated map won't be collected by on-the-ground mappers.

Some data, like road data, will appear in both OSM and external datasets, but there's other data which may simply never get collected by the community if the map appears to already be complete.
And then, since there's less on-the-ground mapping, the problems I mentioned earlier regarding flawed external datasets don't get noticed and corrected.

4. It assumes OSM object IDs remain constant.

OSM object IDs change. They don't change a lot, but they do change, and you can't force users to jump through hoops to preserve them (as we've seen people propose).

5. It assumes external datasets' IDs remain constant.

One of the whole points of this project seems to be keeping up to date with external datasets, such as those put out by local governments every quarter. Since most of these external datasets will be delivered in Shapefile format, there will need to be a conversion process. You can't be assured that the ID numbers on objects will remain constant from Q1 to Q2. Heck, I bet you'd find that even their own internal IDs won't remain constant, at least not for every single ID on every single object in every single external database, of which there may be dozens or more. So you're constantly in a race to conflate changing object IDs.

6. License nightmare.

This is a powder keg ready to explode, but I'll just say this: incompatible licenses will not allow this.

7. Tremendous work.

The conflation process would be very hard to do and, frankly, not a lot of fun. You'll end up writing programs to do most of it, I'm sure, but no program will be perfect. So people will have to finish it by hand, and, frankly, it's not fun work.

These are the reasons I never went forward with this project.

- Serge

_______________________________________________
Imports mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/imports
