On May 29, 2007, at 14:50 , Oliver Elphick wrote:

On Tue, 2007-05-29 at 13:49 -0500, Michael Glaesemann wrote:

If you're handling more than one country, you'll most likely want to
associate the states with their respective countries.

-- Listing 4
CREATE TABLE countries
(
     country_id INTEGER PRIMARY KEY
);

CREATE TABLE states
(
     state_id INTEGER PRIMARY KEY
     , state_name TEXT NOT NULL
     , country_id INTEGER NOT NULL
         REFERENCES countries (country_id)
);

Note that there's no UNIQUE constraint on state_name. You may have
more than one state with the same state_name around the world so you
may want to make sure that for each country, each state_name is
unique:

You have assumed that state codes are unique integers, but for a
worldwide database that is probably a bad design.

Actually, my intent was to use state_id as a surrogate key for state_name. I assumed unique state_names per country. If one wanted state codes, (such as ISO 3166-2), you'd add columns for that.

Furthermore, these codes are
not going to be unique. For instance MH is the US abbreviation for the
Marshall Islands [US Post Office] and also the Indian abbreviation for
Maharashtra [Wikipedia]. In such a case I would always make the country
code part of the primary key and not just an attribute.  Again this
saves your having to invent a new set of codes when one exists already.

This trends towards the discussion on whether or not to use surrogate keys. In the schema I suggested, the natural key is (state_name, country_id). If you wanted to use ISO 3166-1 codes in both countries and states tables rather than the country_id surrogate key, that'd work, too. Surrogate keys are tangential to the normalization issues I was addressing.

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Reply via email to