Of course, the first question one might ask is why not try first to prevent
the duplicates. The annoyance of users pressing the enter key before an
edit/insert is completed is of course a long-standing problem for web
developers. There are many articles in various magazines offering ideas
about how to solve it.

One simple approach is to simply detect before doing the insert if the
insert about to be done would create a duplicate. Stop the problem before it
happens. It's not as elegant as some others, that involve putting fields on
forms and testing the state of the form versus the database, but it will
work--and should be less bother (and perhaps use less resources) than trying
to handle it in the reporting.

But as for finding duplicates, here's one example of how I might find
records that have duplicate passwords in a table of contacts, showing the
names and passwords of folks with duplicate passwords.

SELECT password, ContactName
FROM Contacts
WHERE password In (SELECT password FROM Contacts As Tmp GROUP BY password
HAVING Count(*)>1)
ORDER BY password

I haven't tested to see if it's optimal code, and how it would work against
a large database, but it works.

In fact, some may recognize the code as coming from MS Access and its
built-in "find duplicates query wizard", reached with the menu command
Insert>Query in the Access interface.  Let's not start an Access bashing
thread. I only point out this wizard because it's a little-used feature in
Access that can be helpful regardless of the database you're using, since
the code it creates is reasonably standard across dbms's. I only had to
remove some redundant parentheses and Access' use of square brackets around
some column names. Many people have the tool installed even if they don't
use it as their DBMS for a web app. It's just nice to use when you forget
how to do a find duplicate query.

There are two other possibly useful query wizards offered there, as well:
crosstabs (which are worth learning about if you've never heard of them) and
"find unmatched", which will find records in one table that have no related
records in another. It just does a left join and finds records where a
foreign key is null in one table, but for folks who don't do left joins
often but need to solve the problem, it's a handy reminder.

/charlie

-----Original Message-----
From: Ben Densmore [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 03, 2002 10:37 AM
To: SQL
Subject: Stuck on Query


Hi Everyone,
 I have a DB with about 500,000 records in it. This DB is from a sales
lead form that is on our website. Occasionally we have customers submit
the form twice or even more than that. I want to only display one of
these leads to our sales dept if there is a duplicate, or I want to
delete all the duplicates in there. I have tried several things but
haevn't had as much success as I would like, I was using a query that
would select just the top 1 id if there were duplicates which worked but
it slowed down the query dramatically. I'm using CF to display my data
and at times I would get an error that the process was deadlocked. Can
someone give me some ideas on the best way to either display only one
record if there are duplicates that won't cut performance time down or
how I can delete all but one of the duplicates?

Thank You,
Ben Densmore



______________________________________________________________________
Macromedia ColdFusion 5 Training from the Source
  Step by Step ColdFusion
  http://www.amazon.com/exec/obidos/ASIN/0201758474/houseoffusion
Archives: http://www.mail-archive.com/[email protected]/
Unsubscribe: http://www.houseoffusion.com/index.cfm?sidebar=lists

Reply via email to