On Tue, Oct 12, 2010 at 4:36 AM, Peter Budny <[email protected]> wrote:
> If route relations are not required, then what are
> http://wiki.openstreetmap.org/wiki/Relation:route#Road_Routes for?

Not required and "don't exist" aren't quite the same things. One major issue with relations in general is that very little software knows how to handle them, and that's especially true for things like routing software. But that's not at the core of my concerns, which I'll elaborate on later in the mail.

> They /are/ required, because roads may be discontiguous in various ways:
> a road may change names (e.g. Main Street North becomes Main Street
> South, but to a driver or pedestrian, both are just one continuous Main
> Street), or even be physically discontiguous (some state and even US
> Highways do this).

I'm a little confused by this example. "Main Street North becomes Main Street" - how would you handle this? What specifically would you do? Add a relation? What tags would you add to, or remove from, the individual ways?

> Using TIGER data, we can automate the
> process, but the bot's work will not be perfect; humans will still have
> to check it and make a few corrections. Still, if it does 95% of the
> work for them correctly, this is pretty good IMO. (After all, TIGER data
> itself is not even close to 95% correct.)

You've identified several issues in this paragraph, and I'd like to flesh them out:

1) Your data source, TIGER, is by your own admission not accurate. I don't want to get into a discussion about TIGER (that may be best left for osm-us), but when you start with a dataset as, let's say, "controversial" as TIGER, you can expect a lot of concerns from the community.

2) You say that humans will have to check it and make corrections. What mechanism do you propose to build into your mass edits to provide human validation? In other words, how do you plan on accomplishing the human validation step before modifying the database?

Now some concerns that the community probably has, but isn't articulating.
3) The road to hell in OSM is paved with bot intentions.

OSM has a long, negative history with bots. We have a very small number of good imports, and dozens (if not more) of bad imports. Bad imports are so commonplace in OSM that within the OSM community, bots of any sort are discouraged, but especially any imports, and especially (as you appear to be proposing) merging existing data with imported data. This is a recipe for disaster.

4) How well do you know OSM?

Elaborating on my previous point, OSM is a very attractive project, and very smart folks come to it all the time with a great idea about how a bot or an import could be very beneficial. Unfortunately, while these people may understand the data, and maybe the representation, unless they're familiar with OSM, they don't know the pitfalls that come in and cause the most problems.

Here's a small but real example: Let's say your import is chugging along, and then it comes across an area where someone's already done the work. How will it react? Would it overwrite the contributor's work? Would it stop? If it stops, would it know which segments have been committed to the DB and which haven't (i.e. would it be able to prevent duplicates)? Would your bot handle tags which users may have added to the way or relation? And so on...

This is why our rule of thumb about bots is that no one who has been with the project for less than a year should run them. And most people who suggest making bots have been with the project for less than 6 months.

5) Academic Research

I think that it's great that academics are interested in using OSM for their research. But at the same time, I've worked in academic computing for most of my professional career, surrounded by some of the smartest people in their field, at both NIH and NASA. These are the best of the best. And my view of much of what's produced by academics who write software is that it's poo-poo.
My specific concern here is that your research is focused in a very narrow way, which is understandable for a school project, but it has implications for the larger project which might not work out.

My suggestion to you is that you take the planet.osm and write your code for school using your own OSM sandbox. Then publish the results, and work with the community later on about applying your research to the live map.

- Serge

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev

