I am using a query like this to try to normalize a table (update people
that already exist, matched by email, and insert the rest):

WITH nd as (
    select * from sales order by id limit 100
),
people_update as (
    update people p
       set first_name = nd.first_name
      from nd
     where p.email = nd.email
    returning nd.id
)
insert into people (first_name, email, created_at, updated_at)
select first_name, email, now(), now()
  from nd
  left join people_update using (id)
 where people_update.id is null;

This works pretty well except when the top 100 records contain duplicate
email addresses (two sales rows for the same email address).

I am wondering what the best strategy is for dealing with this
scenario.  Processing the records one at a time would work, but it would
obviously be much slower.  There are no other columns I can rely on to
make the records more unique, either.
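
One idea I have been considering is to de-duplicate the batch with
DISTINCT ON before the update/insert CTEs run, so each email shows up
only once.  This is just a sketch against the query above, and the
"latest sale per email wins" rule (order by id desc) is my own
assumption, not something the data dictates:

WITH batch as (
    select * from sales order by id limit 100
),
nd as (
    -- collapse duplicates: keep one row per email
    -- ("latest sale wins" via id desc is an assumed rule)
    select distinct on (email) *
      from batch
     order by email, id desc
),
people_update as (
    update people p
       set first_name = nd.first_name
      from nd
     where p.email = nd.email
    returning nd.id
)
insert into people (first_name, email, created_at, updated_at)
select first_name, email, now(), now()
  from nd
  left join people_update using (id)
 where people_update.id is null;

I have also wondered whether, on 9.5 or later with a unique index on
people(email), a plain insert ... on conflict (email) do update could
replace the two-CTE approach entirely, but I do not know if either of
those assumptions holds here.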

