2009/6/26 Steve Singer <[email protected]>:
> On Fri, 26 Jun 2009, William Lachance wrote:
>
>> I decided to try something simpler: just keep a hashtable of node
>> "keys" (latitude and longitude) as you go along, and reuse nodes that
>> have occurred before. :) At first glance, this approach seems to work
>> fairly well. It seems intuitively "right" to me that if two ways have
>> a common lat/lng point in common, they should be connected. A quick
>> count of running the two algorithms on Edmonton, Alberta, reveals that
>> where the old script resulted in 77242 (!) overlapping nodes
>> (sometimes up to 8 on a single lat/lng position!), my code resulted in
>> 0.
>
> Is this because the geobase data doesn't define Junction objects for those
> positions or some other reasons? Can you post a few examples of these?
> I wasn't aware that recent versions of the script were still doing this. (I
> probably won't get to look into the details for a few days though)
I'm not sure exactly why; I'd need to check.

> If the junction data turns out to be incomplete then maybe we are better
> merging all common lat/lon pairs.

Is there any legitimate reason to have two nodes at the same position?

>> Incidentally, this technique (if it's a good one) could be applied in
>> automatic fashion on existing portions of the geobase import,
>> eliminating the need for tedious manual work.
>
> What does your hash do to the memory usage for the script. On the systems I
> run it on RAM tends to be a more limiting factor over other things.

You're basically taking the additional hit of storing a key (about 24
characters) for each node, so multiply that by the number of nodes
you're using. That's not an enormous amount of extra RAM, and possibly
less overall if there's a lot of duplication in the dataset.

It seems like geobase2osm could generally be much more efficient in
the way it uses memory. I can take a closer look at that if this sort
of thing is bothering people.

>> Oh, I also patched it up to not do coordinate transforms between the
>> NAD83 and WGS84 systems, as apparently
>>
>> (http://sci.tech-archive.net/Archive/sci.geo.satellite-nav/2006-09/msg00307.html)
>> there is no difference when it comes to positioning. Empirically, this
>> seems to be true: the output without this transform seems 100% fine.
>> Dropping these transforms allows us to drop the osgeo dependancy,
>> which makes the whole thing run on OpenSUSE in the first place.
>
>
> According to that link the there is a 1.5m offset between the two coordinate
> systems. That doesn't seem insignificant, maybe someone with a stronger GIS
> background can comment further on how important the correction is.

Yes, but if you follow the conversation it sounds like transforming
between one coordinate system and another is a very location-specific
thing, and that there's no general transform that is guaranteed to get
you improved accuracy.
I just tested the basis of the osgeo transform code
(http://trac.osgeo.org/proj/) on my Mac, and doing a "transform" between
NAD83 and WGS84 on some sample data in Halifax (using the cs2cs tool)
produced exactly the same output.

Anyway, my background is definitely not coordinate system
transformations in GIS, so an expert opinion would indeed be welcome.

-- 
William Lachance
[email protected]
_______________________________________________
Talk-ca mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/talk-ca
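
For anyone following along, the node-reuse idea discussed in this message (a hashtable keyed on latitude and longitude) can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual geobase2osm patch: the names node_key and build_nodes, the key format, the 7-decimal rounding and the counter-based node ids are all hypothetical.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (lat, lon)


def node_key(lat: float, lon: float, precision: int = 7) -> str:
    """Build a string key from a coordinate pair (hypothetical key format)."""
    return f"{lat:.{precision}f},{lon:.{precision}f}"


def build_nodes(ways: List[List[Coord]]) -> Tuple[Dict[str, int], List[List[int]]]:
    """Assign node ids to way points, reusing the id for repeated positions.

    Returns (nodes, way_refs): nodes maps key -> node id, and way_refs
    holds, for each way, the list of node ids it passes through.
    """
    nodes: Dict[str, int] = {}
    way_refs: List[List[int]] = []
    next_id = 1  # arbitrary illustrative ids, not real OSM ids
    for way in ways:
        refs = []
        for lat, lon in way:
            key = node_key(lat, lon)
            if key not in nodes:
                # First time this position is seen: create a new node.
                nodes[key] = next_id
                next_id += 1
            # Otherwise the existing node is reused, which connects the ways.
            refs.append(nodes[key])
        way_refs.append(refs)
    return nodes, way_refs


# Two ways that share one end point.
ways = [
    [(53.5461, -113.4938), (53.5470, -113.4938)],
    [(53.5470, -113.4938), (53.5470, -113.4900)],
]
nodes, way_refs = build_nodes(ways)
print(len(nodes), way_refs)  # 3 [[1, 2], [2, 3]]
```

The extra memory cost is one key string per distinct position, which matches the per-node overhead mentioned above. Note that this sketch only merges points whose coordinates agree after rounding; positions that differ by more than that stay as separate nodes.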

