Steve, > > I don't see where the topology issue is relevant here. > Everything should be able to be done within SQL without > reliance upon topology, when TIGER's primary keys are used. >
The only reason on my radar for wanting a normalized tiger dataset is for breaking edges at the local government level and to keep track of what edges were edited and so forth as well as fitting my existing geometries within the topology of TIGER. Your vision of normalization doesn't satisfy that need because a geometry is inherently a denormalized vision of a normalized topology. > Perhaps the geocoder as is has gone too far down the path > you've described to be easily adapted? > What benefit are you striving to get out of this aside? It would help if I understood that more clearly. I can forsee it would be faster if TLID and tfid were made primary keys, but I dismissed it for the time being since I didn't have time to benchmark the difference and also write routines to fix data of people who have already loaded their data. As far as making tlid a primary key. On further inspection of our loader, I realized we load into staging tables anyway so that we could rename all the fields that TIGER found the need to tack years on. So as part of that routine we drop the generated gid anyway, rename some others, add others, and drop others. So that probably wouldn't take too much effort. The easiest would be to use an EXCEPT clause since each county is loaded into staging and then added to the core state table. It would probably be slower than an append skip, but append skip is not an option since the data is already in the database by the time we get to that point. It won't change the queries though because we would still need the state joins to take advantage of constraint exclusion unless you have other thoughts on that. So that alone violates perfect normalization at the TLID level. Again keep in mind our priorities are: 1) Build a geocoder that is easy for everyone to load and use on any platform PostGIS runs on 2) User Maintainability - that means being able to drop state tables etc and load newer ones piecemeal or migrate a set of states to another database without having to change the underlying code. 3) Speed 4) Developer Maintainability -- I put this as last because it's too open to debate how you should structure your code so easy to some is not necessarily easy to others without knowing the developers invovled. For example I would never use a NATURAL JOIN because its too prone to all your code breaking if you happen to have two columns with same name but different meeting, but that's just me. Thanks, Regina http://www.postgis.us _______________________________________________ postgis-users mailing list [email protected] http://postgis.refractions.net/mailman/listinfo/postgis-users
