On Thu, Mar 29, 2012 at 9:10 AM, Josh Doe wrote:
> On Thu, Mar 29, 2012 at 9:45 AM, Ian Dees wrote:
> > After loading Cook County TIGER road features and OSM linear features into
> > PostGIS, I ran a simple query to find how well the roads matched:
> >
> > SELECT a.name, b.fullname, ST_HausdorffDistance(a.geom, b.geom) as dist
> > FROM cook_tiger a, cook_osm b
> > WHERE (a.geom && b.geom) AND ST_HausdorffDistance(a.geom, b.geom) < 0.0005
> > LIMIT 50
> >
> > This returned results that made sense (the names matched in all 50 results).
> >
> > I removed the LIMIT clause and let it run before going to work to see how
> > many of the TIGER records match existing OSM features.
> >
> > Next up is building a table of TIGER -> OSM matches and using that to find
> > TIGER rows that don't have a corresponding OSM feature.
> >
> > If anyone has any ideas for speeding this up I'd love to hear it. It took
> > well over a couple hours to run one county. There are a lot of counties in
> > the US.
>
> Very cool! To speed this up perhaps try limiting the number of times
> ST_HausdorffDistance is executed. First only run it for ways which are
> "close", such as falling inside a buffer, or even faster inside a
> bounding box. For a trivial speedup generate a table with distances
> first, then use the WHERE clause. However I have no idea how to form
> such queries!

The bounding-box overlap check (a.geom && b.geom) speeds things up drastically, but because Cook County contains Chicago (which is very road-dense), I imagine there are still tons of ST_HausdorffDistance calls that don't need to happen.

If I thought I was going to run this many times I could generate a table of all possible Hausdorff distances, but there would be a lot of rows (if I remember my high school stats, it would be len(cook_tiger) * len(cook_osm) rows).

I may try switching to one of PostGIS's "overlap" or "touching" calls to cut the number of calls even further, but I think I'd miss lots of possible matches that way (if the roads are offset enough that they never touch).
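
One possible way to skip more of those calls without losing offset roads is a tighter bounding-box test. If the Hausdorff distance between two lines is under some threshold, every vertex of each line lies within that threshold of the other line, so each line's bounding box must fit inside the other's bounding box grown by the threshold. The query below is only a sketch of that idea and has not been run against the Cook County tables. It assumes geom is stored in degrees (so 0.0005 means the same thing as in the original query) and that both tables have a spatial index on geom. It uses the PostGIS `@` operator (bounding box contained by) and ST_Expand.

```sql
-- Sketch only: a stricter bounding-box prefilter in front of ST_HausdorffDistance.
-- Assumes geom is in degrees and that cook_tiger.geom and cook_osm.geom are indexed.
SELECT a.name, b.fullname, ST_HausdorffDistance(a.geom, b.geom) AS dist
FROM cook_tiger a, cook_osm b
WHERE a.geom @ ST_Expand(b.geom, 0.0005)   -- a's bbox fits inside b's bbox grown by the threshold
  AND b.geom @ ST_Expand(a.geom, 0.0005)   -- and b's bbox fits inside a's grown bbox
  AND ST_HausdorffDistance(a.geom, b.geom) < 0.0005;
```

Unlike a plain && overlap, this should reject most cross streets (a long north-south road and a long east-west road overlap but neither box contains the other), and unlike an "overlap" or "touching" test it should still keep parallel roads that are slightly offset and never touch.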

