--- Klint Gore <[EMAIL PROTECTED]> wrote:

> On Wed, 26 Jan 2005 21:25:19 -0500, "D. Richard Hipp"
> <[EMAIL PROTECTED]> wrote:
> > On Wed, 2005-01-26 at 17:45 -0800, Clark Christensen
> wrote:
> > > I'm new to SQL, and SQLite, and I find myself needing
> to
> > > identify duplicate records in a SQLite table (there
> are
> > > about 2K duplicates in a 36K row table).  Any
> suggestions
> > > or magic SQL queries appreciated :-)
> > > 
> > 
> > SELECT * FROM table EXCEPT SELECT DISTINCT * FROM
> table;
> 
> Does that actually work?
> 
> sqlite> create table firstnames (fname varchar(5));
> sqlite> insert into firstnames values ('bob');
> sqlite> insert into firstnames values ('bob');
> sqlite> insert into firstnames values ('bill');
> sqlite> select * from firstnames except select distinct *
> from firstnames;
> sqlite>
> sqlite> select fname from firstnames group by fname
> having count(*) > 1;
> bob
> sqlite>
> 
> klint.
> 
>
+---------------------------------------+-----------------+
> : Klint Gore                            : "Non rhyming   
> :
> : EMail   : [EMAIL PROTECTED]           :  slang - the   
> :
> : Snail   : A.B.R.I.                    :  possibilities 
> :
> : Mail      University of New England   :  are useless"  
> :
> :           Armidale NSW 2351 Australia :     L.J.J.     
> :
> : Fax     : +61 2 6772 5376             :                
> :
>
+---------------------------------------+-----------------+
> 

Many thanks to all who replied with help.  As has been
pointed out, the EXCEPT method wasn't successful :-( 
Seemed almost too good to be true.

I managed to find and eliminate the duplicates using a
combination of the GROUP BY method Klint suggested, and a
temp table.

While possibly not clear without having the schema for
table DMI_SKU, here's what I finally was able to do after
lots of trial and error (gotta love transactions).

create temp table TEMP_DUPES (MPN text, DMI text, MDESC
text, RID integer, DCOUNT integer);
-- Get all duplicates that are dupes by mpn, dmi, and
modeldesc into temp table
begin transaction;
insert into temp_dupes
        select ModelPartNumber MPN, dmistring DMI, modeldesc
MDESC, rowid RID, count(1) DCOUNT
          from dmi_sku
        group by MPN, DMI, MDESC
        HAVING count(*) >1;

--examine all duplicates
select * from dmi_sku where modelpartnumber in (select mpn
from temp_dupes);

delete from dmi_sku
where
        modelpartnumber in (select mpn from temp_dupes) and
        rowid not in (select rid from temp_dupes);

--inspect the results again
select ModelPartNumber, ModelDesc, DMIString, rowid
  from dmi_sku
  where modelpartnumber in (select mpn from temp_dupes);

--and, of course...
commit;

Fortunately, this is a one-time build-up of a dataset for
another project, so fast, elegant code wasn't necessary.

Also, many thanks to DRH for an excellent product in
SQLite.

 -Clark

Reply via email to