On Fri, 26 Jun 2009, William Lachance wrote:

> Hi all,
>
> I've been needing to use geobase2osm in a project of mine (not
> directly related to the geobase import), and noticed (as other people
> have) that it's generating a lot of duplicated nodes. Aside from being
> unclean and wasteful, this makes the generated osm file non-routable.
> Well, today I decided to do something about it. Looking at the source,
> I noticed quite a bit of complicated code to "merge" identical OSM
> nodes at junctions (trusting geobase to provide the right hints of
> where junctions are). Evidently this wasn't working quite as expected.
>
> I decided to try something simpler: just keep a hashtable of node
> "keys" (latitude and longitude) as you go along, and reuse nodes that
> have occurred before. :) At first glance, this approach seems to work
> fairly well. It seems intuitively "right" to me that if two ways have
> a common lat/lng point in common, they should be connected. A quick
> count of running the two algorithms on Edmonton, Alberta, reveals that
> where the old script resulted in 77242 (!) overlapping nodes
> (sometimes up to 8 on a single lat/lng position!), my code resulted in
> 0.

Is this because the geobase data doesn't define Junction objects for
those positions, or for some other reason? Can you post a few examples
of these? I wasn't aware that recent versions of the script were still
doing this. (I probably won't get to look into the details for a few
days, though.) If the junction data turns out to be incomplete, then
maybe we are better off merging all common lat/lon pairs.

>
> Incidentally, this technique (if it's a good one) could be applied in
> automatic fashion on existing portions of the geobase import,
> eliminating the need for tedious manual work.

What does your hash do to the memory usage of the script? On the
systems I run it on, RAM tends to be more of a limiting factor than
anything else.

>
> Anyway, I don't have commit access to the OpenStreetMap subversion
> repository holding geobase2osm, so I decided to fork the repository
> using git and put up the result here:
>
> http://github.com/wlach/geobase2osm
>
> Oh, I also patched it up to not do coordinate transforms between the
> NAD83 and WGS84 systems, as apparently
> (http://sci.tech-archive.net/Archive/sci.geo.satellite-nav/2006-09/msg00307.html)
> there is no difference when it comes to positioning. Empirically, this
> seems to be true: the output without this transform seems 100% fine.
> Dropping these transforms allows us to drop the osgeo dependency,
> which makes the whole thing run on OpenSUSE in the first place.

According to that link, there is a 1.5m offset between the two
coordinate systems. That doesn't seem insignificant; maybe someone with
a stronger GIS background can comment further on how important the
correction is.

>
> More exciting geobase2osm work coming later. Maybe.
>
> Questions/comments? Let me know!
>
> --
> William Lachance
> [email protected]
>
> _______________________________________________
> Talk-ca mailing list
> [email protected]
> http://lists.openstreetmap.org/listinfo/talk-ca
>
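
For reference, here is a minimal sketch of the lat/lon-keyed node reuse
described in the quoted message. It is not the code from either version
of geobase2osm: the function name `dedupe_nodes`, the input layout (each
way as a list of (lat, lon) tuples) and the sequential placeholder ids
are all assumptions made for illustration.

```python
def dedupe_nodes(ways):
    """Assign one node id per distinct (lat, lon) pair.

    `ways` is assumed to be a list of ways, each a list of (lat, lon)
    tuples. Returns (nodes, way_refs): `nodes` is a list of
    (node_id, lat, lon) and `way_refs` holds, per way, the node ids it
    passes through.
    """
    node_ids = {}   # (lat, lon) -> node id; one entry per distinct position
    nodes = []
    way_refs = []
    for way in ways:
        refs = []
        for lat, lon in way:
            key = (lat, lon)
            node_id = node_ids.get(key)
            if node_id is None:
                # First time this position is seen: create a new node.
                node_id = len(nodes) + 1   # placeholder id, hypothetical scheme
                node_ids[key] = node_id
                nodes.append((node_id, lat, lon))
            # Otherwise reuse the existing node, which connects the ways.
            refs.append(node_id)
        way_refs.append(refs)
    return nodes, way_refs


# Two ways that share the position (53.6, -113.5).
example_ways = [
    [(53.5, -113.5), (53.6, -113.5)],
    [(53.6, -113.5), (53.7, -113.4)],
]
example_nodes, example_refs = dedupe_nodes(example_ways)
print(len(example_nodes), example_refs)
```

The lookup table holds one entry per distinct coordinate pair, so its
size grows with the number of unique nodes in the output, which is the
memory cost asked about above. The keys are compared exactly, so two
positions that differ only in the last decimal place are treated as
different nodes; building the key from the coordinates as read from the
source data avoids surprises from floating-point arithmetic.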

