subject:"\[sqlite\] Finding duplicate records"

Re: [sqlite] finding duplicate records i.e. records with same values across 4 colums...

2012-11-08 Thread Simon Davies

On 8 November 2012 11:23, e-mail mgbg25171  wrote:
> I'm using SQLIte.
> All columns can have duplicate values but it would be helpful to report
> on those rows which are identical across all columns.
> More specifically I'm looking for matching itm values where the first 3
> cols ALSO match but am not sure of the sqlite select query to do this.
>
> epic, yr, statement, itm
> ==
> mcro, 2002, income, revs
> mcro, 2002, income, cogs
> mcro, 2002, income, sg&a
> mcro, 2002, income, cogs
> mcro, 2003, balance, gdwil
> etc
>
> in the example I'm looking for a query that would return
> mcro, 2002, income, cogs
> mcro, 2002, income, cogs

select * from t natural join ( select epic, yr, statement, itm from t
group by epic, yr, statement, itm having count(*) > 1 ) t1;

>
> any help much appreciated
>
> BTW no field can be unique in the createtable statement


Regards,
Simon
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] finding duplicate records i.e. records with same values across 4 colums...

2012-11-08 Thread e-mail mgbg25171

I'm using SQLIte.
All columns can have duplicate values but it would be helpful to report
on those rows which are identical across all columns.
More specifically I'm looking for matching itm values where the first 3
cols ALSO match but am not sure of the sqlite select query to do this.

epic, yr, statement, itm
==
mcro, 2002, income, revs
mcro, 2002, income, cogs
mcro, 2002, income, sg&a
mcro, 2002, income, cogs
mcro, 2003, balance, gdwil
etc

in the example I'm looking for a query that would return
mcro, 2002, income, cogs
mcro, 2002, income, cogs

any help much appreciated

BTW no field can be unique in the createtable statement
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Finding duplicate records

2005-02-05 Thread Stefano Barbato

On Wed, 2005-01-26 at 17:45 -0800, Clark Christensen wrote:
 > Hello all,
 >
 > I'm new to SQL, and SQLite, and I find myself needing to
 > identify duplicate records in a SQLite table (there are
 > about 2K duplicates in a 36K row table).  Any suggestions
 > or magic SQL queries appreciated :-)
 >
I've used these queries to delete rows with the same GROUPING_FIELDS:
1) keep first inserted item:
delete from TABLE where rowid not in (select min(rowid) from TABLE
group by GROUPING_FIELDS);
2) keep last inserted item:
delete from TABLE where rowid not in (select max(rowid) from TABLE
group by GROUPING_FIELDS);
they *seem* to work as expected
bye,
stefano

Re: [sqlite] Finding duplicate records

2005-02-05 Thread Stefano Barbato

On Wed, 2005-01-26 at 17:45 -0800, Clark Christensen wrote:
> Hello all,
>
> I'm new to SQL, and SQLite, and I find myself needing to
> identify duplicate records in a SQLite table (there are
> about 2K duplicates in a 36K row table).  Any suggestions
> or magic SQL queries appreciated :-)
>
I've used these queries to delete rows with the same GROUPING_FIELDS:
1) keep first inserted item:
delete from TABLE where rowid not in (select min(rowid) from TABLE 
group by GROUPING_FIELDS);

2) keep last inserted item:
delete from TABLE where rowid not in (select max(rowid) from TABLE 
group by GROUPING_FIELDS);

they *seem* to work as expected
bye,
stefano

Re: [sqlite] Finding duplicate records

2005-01-27 Thread Clark Christensen


--- Brass Tilde <[EMAIL PROTECTED]> wrote:

> > I'm new to SQL, and SQLite, and I find myself needing
> to
> > identify duplicate records in a SQLite table (there are
> > about 2K duplicates in a 36K row table).  Any
> suggestions
> > or magic SQL queries appreciated :-)
> 
> Here are a couple of strategies I've used successfully
> for finding and
> potentially removing duplicate records.  It assumes a
> primary key field that
> is *separate* from the fields being checked for
> duplication.  This example
> also assumes that I only want to check for duplication on
> two of the fields
> in the table: field1 and field2.
> 
> Assume a table such as follows, that might included
> duplicate records.
> 
> create table checkedtable
> (
> priKeyinteger primary key,
> field1char(10),
> field2char(10),
> 
> );
> 
> and a temporary table to hold the duplicates
> 
> create temp table dups
> (
> field1,
> field2,
> reccount,
> keyvalue
> );
> 
> The first method is probably the simplest:
> 
> insert into dups (field1, field2, keyvalue)
> select field1, field2, max(priKey)
> from   checkedtable
> group by field1, field2
> 
> 
> I don't know if this one works on SQLite, as I've never
> had reason to try
> it.  (I've learned a fair bit about DB design in the last
> couple of years
> before starting with SQLite, so duplication that needs
> correction is less of
> a problem now :)  It does work on SQL Server 2000,
> though.
> 
> The second method is one that I used before I understood
> that GROUP BY could
> be used for more than counting. :-)
> 
> Get all the records that have duplicate information and
> put them into the
> temp table:
> 
> insert into dups (field1, field2, reccount)
> select field1, field2, count(priKey)
> from checkedtable
> group by field1, field2
> having count(priKey) > 1
> 
> Now, get the key for one of them:
> 
> update  dups
> set keyvalue = checkedtable.priKey
> fromcheckedtable
> where   checkedtable.field1 = checkedtable.field1
> and checkedtable.field2 = checkedtable.field2
> 
> 
> 
> Using either of these methods, the temporary table now
> contains key values
> that point to one of each set of duplicated records in
> the original table.
> It can be used to delete them, archive them, change them,
> whatever, by
> joining back to the original.
> 
> Note that the code presented here is off the top of my
> head, and not tested,
> so there may be syntactic errors that I didn't catch
> while writing it.  With
> any luck, there aren't any logic errors.  Hopefully, it
> will give you a
> starting point.
> 
> Brad
> 
> 
Basically, it worked in SQLite.  Thanks for your help.

 -Clark

Re: [sqlite] Finding duplicate records

2005-01-27 Thread Clark Christensen

--- Klint Gore <[EMAIL PROTECTED]> wrote:

> On Wed, 26 Jan 2005 21:25:19 -0500, "D. Richard Hipp"
> <[EMAIL PROTECTED]> wrote:
> > On Wed, 2005-01-26 at 17:45 -0800, Clark Christensen
> wrote:
> > > I'm new to SQL, and SQLite, and I find myself needing
> to
> > > identify duplicate records in a SQLite table (there
> are
> > > about 2K duplicates in a 36K row table).  Any
> suggestions
> > > or magic SQL queries appreciated :-)
> > > 
> > 
> > SELECT * FROM table EXCEPT SELECT DISTINCT * FROM
> table;
> 
> Does that actually work?
> 
> sqlite> create table firstnames (fname varchar(5));
> sqlite> insert into firstnames values ('bob');
> sqlite> insert into firstnames values ('bob');
> sqlite> insert into firstnames values ('bill');
> sqlite> select * from firstnames except select distinct *
> from firstnames;
> sqlite>
> sqlite> select fname from firstnames group by fname
> having count(*) > 1;
> bob
> sqlite>
> 
> klint.
> 
>
+---+-+
> : Klint Gore: "Non rhyming   
> :
> : EMail   : [EMAIL PROTECTED]   :  slang - the   
> :
> : Snail   : A.B.R.I.:  possibilities 
> :
> : Mail  University of New England   :  are useless"  
> :
> :   Armidale NSW 2351 Australia : L.J.J. 
> :
> : Fax : +61 2 6772 5376 :
> :
>
+---+-+
> 

Many thanks to all who replied with help.  As has been
pointed out, the EXCEPT method wasn't successful :-( 
Seemed almost too good to be true.

I managed to find and eliminate the duplicates using a
combination of the GROUP BY method Klint suggested, and a
temp table.

While possibly not clear without having the schema for
table DMI_SKU, here's what I finally was able to do after
lots of trial and error (gotta love transactions).

create temp table TEMP_DUPES (MPN text, DMI text, MDESC
text, RID integer, DCOUNT integer);
-- Get all duplicates that are dupes by mpn, dmi, and
modeldesc into temp table
begin transaction;
insert into temp_dupes
select ModelPartNumber MPN, dmistring DMI, modeldesc
MDESC, rowid RID, count(1) DCOUNT
  from dmi_sku
group by MPN, DMI, MDESC
HAVING count(*) >1;

--examine all duplicates
select * from dmi_sku where modelpartnumber in (select mpn
from temp_dupes);

delete from dmi_sku
where
modelpartnumber in (select mpn from temp_dupes) and
rowid not in (select rid from temp_dupes);

--inspect the results again
select ModelPartNumber, ModelDesc, DMIString, rowid
  from dmi_sku
  where modelpartnumber in (select mpn from temp_dupes);

--and, of course...
commit;

Fortunately, this is a one-time build-up of a dataset for
another project, so fast, elegant code wasn't necessary.

Also, many thanks to DRH for an excellent product in
SQLite.

 -Clark

Re: [sqlite] Finding duplicate records

2005-01-27 Thread Brass Tilde

> I'm new to SQL, and SQLite, and I find myself needing to
> identify duplicate records in a SQLite table (there are
> about 2K duplicates in a 36K row table).  Any suggestions
> or magic SQL queries appreciated :-)

Here are a couple of strategies I've used successfully for finding and
potentially removing duplicate records.  It assumes a primary key field that
is *separate* from the fields being checked for duplication.  This example
also assumes that I only want to check for duplication on two of the fields
in the table: field1 and field2.

Assume a table such as follows, that might included duplicate records.

create table checkedtable
(
priKeyinteger primary key,
field1char(10),
field2char(10),

);

and a temporary table to hold the duplicates

create temp table dups
(
field1,
field2,
reccount,
keyvalue
);

The first method is probably the simplest:

insert into dups (field1, field2, keyvalue)
select field1, field2, max(priKey)
from   checkedtable
group by field1, field2


I don't know if this one works on SQLite, as I've never had reason to try
it.  (I've learned a fair bit about DB design in the last couple of years
before starting with SQLite, so duplication that needs correction is less of
a problem now :)  It does work on SQL Server 2000, though.

The second method is one that I used before I understood that GROUP BY could
be used for more than counting. :-)

Get all the records that have duplicate information and put them into the
temp table:

insert into dups (field1, field2, reccount)
select field1, field2, count(priKey)
from checkedtable
group by field1, field2
having count(priKey) > 1

Now, get the key for one of them:

update  dups
set keyvalue = checkedtable.priKey
fromcheckedtable
where   checkedtable.field1 = checkedtable.field1
and checkedtable.field2 = checkedtable.field2



Using either of these methods, the temporary table now contains key values
that point to one of each set of duplicated records in the original table.
It can be used to delete them, archive them, change them, whatever, by
joining back to the original.

Note that the code presented here is off the top of my head, and not tested,
so there may be syntactic errors that I didn't catch while writing it.  With
any luck, there aren't any logic errors.  Hopefully, it will give you a
starting point.

Brad

Re: [sqlite] Finding duplicate records

2005-01-26 Thread Darren Duncan

At 10:43 PM -0500 1/26/05, D. Richard Hipp wrote:
You are right - my original suggestion does not work.  I forgot
that the EXCEPT operator does a DISTINCT on the result set of
both operands.  Your approach works much better.
This could be fixed by updating SQLite to accept the standard SQL 
syntax where a "set quantifier" can be used with not only a standard 
query but a compound query.

Meaning:
  SELECT  ...
And:
  SELECT ...
  [ ]
  SELECT ...
Where:
   ::= DISTINCT|ALL
   ::= UNION|EXCEPT|INTERCEPT
This way, users can specify what behaviour they want, and get 
flexability.  it should be very simple to implement.  Since the  is optional, the default behaviour is akin to ALL for 
regular selects and DISTINCT for compound selects, as SQLite and 
other databases already do.

See SQL:2003, 7.13 "" (p351) for the definition.
A slightly modified version of your suggestion would then produce the 
desired result, a duplicates list:

SELECT * FROM table EXCEPT ALL SELECT DISTINCT * FROM table;
-- Darren Duncan

Re: [sqlite] Finding duplicate records

2005-01-26 Thread D. Richard Hipp

On Thu, 2005-01-27 at 14:40 +1100, Klint Gore wrote:
> On Wed, 26 Jan 2005 21:25:19 -0500, "D. Richard Hipp" <[EMAIL PROTECTED]> 
> wrote:
> > On Wed, 2005-01-26 at 17:45 -0800, Clark Christensen wrote:
> > > I'm new to SQL, and SQLite, and I find myself needing to
> > > identify duplicate records in a SQLite table (there are
> > > about 2K duplicates in a 36K row table).  Any suggestions
> > > or magic SQL queries appreciated :-)
> > > 
> > 
> > SELECT * FROM table EXCEPT SELECT DISTINCT * FROM table;
> 
> Does that actually work?
> 
> sqlite> create table firstnames (fname varchar(5));
> sqlite> insert into firstnames values ('bob');
> sqlite> insert into firstnames values ('bob');
> sqlite> insert into firstnames values ('bill');
> sqlite> select * from firstnames except select distinct * from firstnames;
> sqlite>
> sqlite> select fname from firstnames group by fname having count(*) > 1;
> bob
> sqlite>
> 

You are right - my original suggestion does not work.  I forgot
that the EXCEPT operator does a DISTINCT on the result set of
both operands.  Your approach works much better.
-- 
D. Richard Hipp <[EMAIL PROTECTED]>

Re: [sqlite] Finding duplicate records

2005-01-26 Thread Klint Gore

On Wed, 26 Jan 2005 21:25:19 -0500, "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
> On Wed, 2005-01-26 at 17:45 -0800, Clark Christensen wrote:
> > I'm new to SQL, and SQLite, and I find myself needing to
> > identify duplicate records in a SQLite table (there are
> > about 2K duplicates in a 36K row table).  Any suggestions
> > or magic SQL queries appreciated :-)
> > 
> 
> SELECT * FROM table EXCEPT SELECT DISTINCT * FROM table;

Does that actually work?

sqlite> create table firstnames (fname varchar(5));
sqlite> insert into firstnames values ('bob');
sqlite> insert into firstnames values ('bob');
sqlite> insert into firstnames values ('bill');
sqlite> select * from firstnames except select distinct * from firstnames;
sqlite>
sqlite> select fname from firstnames group by fname having count(*) > 1;
bob
sqlite>

klint.

+---+-+
: Klint Gore: "Non rhyming:
: EMail   : [EMAIL PROTECTED]   :  slang - the:
: Snail   : A.B.R.I.:  possibilities  :
: Mail  University of New England   :  are useless"   :
:   Armidale NSW 2351 Australia : L.J.J.  :
: Fax : +61 2 6772 5376 : :
+---+-+

Re: [sqlite] Finding duplicate records

2005-01-26 Thread D. Richard Hipp

On Wed, 2005-01-26 at 17:45 -0800, Clark Christensen wrote:
> Hello all,
> 
> I'm new to SQL, and SQLite, and I find myself needing to
> identify duplicate records in a SQLite table (there are
> about 2K duplicates in a 36K row table).  Any suggestions
> or magic SQL queries appreciated :-)
> 

SELECT * FROM table EXCEPT SELECT DISTINCT * FROM table;

[sqlite] Finding duplicate records

2005-01-26 Thread Clark Christensen

Hello all,

I'm new to SQL, and SQLite, and I find myself needing to
identify duplicate records in a SQLite table (there are
about 2K duplicates in a 36K row table).  Any suggestions
or magic SQL queries appreciated :-)

Thanks!

 -Clark

Re: [sqlite] finding duplicate records i.e. records with same values across 4 colums...

[sqlite] finding duplicate records i.e. records with same values across 4 colums...

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

Re: [sqlite] Finding duplicate records

[sqlite] Finding duplicate records

12 matches

Site Navigation

Mail list logo

Footer information