On Feb 11, 2013, at 2:23, Tim Uckun <timuc...@gmail.com> wrote:

> This works pretty good except for when the top 100 records have
> duplicated email address (two sales for the same email address).
> 
> I am wondering what the best strategy is for dealing with this
> scenario.  Doing the records one at a time would work but obviously it
> would be much slower.  There are no other columns I can rely on to
> make the record more unique either.

The best strategy is to fix your data model so that you have a unique key. As 
you have already found, e-mail addresses aren't very suitable as unique keys for 
people. For this particular case I'd suggest adding a surrogate key.

Alternatively, you might try using (first_name, email) as your key. You'll 
probably still get some duplicates, but there should be fewer of them, and 
perhaps few enough for your case.
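
To judge whether (first_name, email) is "few enough", you could count how many 
key values occur more than once under each candidate key. The following is only 
a sketch: the sales table, its columns, and the sample rows are made up for 
illustration, and Python's sqlite3 module stands in for PostgreSQL. The 
duplicate_groups helper is hypothetical as well.

```python
import sqlite3

# Made-up sample data; sqlite3 is only a stand-in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (first_name TEXT, email TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Ann", "shared@example.com", 10.0),
        ("Bob", "shared@example.com", 20.0),
        ("Cy", "cy@example.com", 5.0),
        ("Cy", "cy@example.com", 7.0),
    ],
)

def duplicate_groups(columns):
    """Count the key values that appear on more than one row.

    `columns` is a hard-coded, trusted column list, never user input.
    """
    sql = (
        "SELECT COUNT(*) FROM "
        f"(SELECT 1 FROM sales GROUP BY {columns} HAVING COUNT(*) > 1)"
    )
    return conn.execute(sql).fetchone()[0]

by_email = duplicate_groups("email")
by_name_and_email = duplicate_groups("first_name, email")
```

With this sample data the composite key separates the two people who share an 
address, but the same person buying twice still produces a duplicate.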


Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
