Re: Enforcing uniqueness on [real estate/postal] addresses

Paul Jungwirth Mon, 11 May 2020 11:34:47 -0700

On 5/11/20 9:55 AM, Peter Devoy wrote:

Of course, if any of the fields are NULL (which they often are) I end
up with duplicates.


One solution may be to add NOT NULL constraints and use empty strings
instead of NULL values but, until asking around today, I thought this was
generally considered bad practice.

If you don't want to store empty strings (which I agree is a littleyucky), you could replace NULLs with an empty string *only when checkingfor uniqueness*. To do this, first replace your unique constraint with aunique index, which gives you some additional features (e.g. indexingexpressions, indexing only part of the table with a WHERE clause,building it concurrently, etc.). In this case we only care aboutindexing expressions.


So you can say:

    CREATE UNIQUE INDEX is_unique_address ON properties (
      COALESCE(description, ''),
      COALESCE(address_identifier_general, ''),
      COALESCE(street, ''),
      COALESCE(postcode, ''));

Another approach, which I don't think is really a serious suggestion butis sort of interesting to think about: you could define an operator, say===, that does the same thing as `IS NOT DISTINCT FROM`. Then you couldcreate an exclusion constraint using that operator on all four columns.I've never tried that before but it seems like it would work.

Maybe that's too much effort for something like this. I just think it'sinteresting because it feels like a use case for exclusion constraintsthat goes in the "opposite direction" of how they are usually used:instead of being less restrictive than =, it is more restrictive.


Regards,

--
Paul              ~{:-)
[email protected]

Re: Enforcing uniqueness on [real estate/postal] addresses

Reply via email to