Hi Tom,

Possible example:

DROP TABLE IF EXISTS #TestData;
GO

CREATE TABLE #TestData
(
    ID int NOT NULL,
    [Name] varchar(20) NOT NULL,
    [Description] varchar(30) NOT NULL,
    SomeDate date NOT NULL,
    Etc varchar(20) NOT NULL
);
GO

INSERT #TestData
(
    ID, [Name], [Description], SomeDate, Etc
)
SELECT ID, [Name], [Description], SomeDate, Etc
FROM (VALUES
      (1,'abc','abc abc','20221117','a'),
      (2,'abc','abc abc','20221117','a'),
      (5,'def','def def','20221117','a'),
      (4,'abc','abc abc','20221117','a'),
      (3,'def','def def','20221117','a'),
      (6,'xyz','def def','20221117','a')
     ) AS v(ID, [Name], [Description], SomeDate, Etc);
GO

SELECT * FROM #TestData;

WITH OrderedRows
AS
(
    SELECT ID, [Name], [Description], SomeDate, Etc,
           ROW_NUMBER() OVER (PARTITION BY [Name], [Description], SomeDate, Etc
                              ORDER BY ID DESC) AS CopyNumber
    FROM #TestData
)
SELECT ID, [Name], [Description], SomeDate, Etc
FROM OrderedRows
WHERE CopyNumber > 1
ORDER BY ID;
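
If the aim is to remove the duplicates rather than just list them, the same CTE can be the target of a DELETE. Sketch only, against the test table here; you'd point it at the real table instead:

```sql
WITH OrderedRows
AS
(
    SELECT ID, [Name], [Description], SomeDate, Etc,
           ROW_NUMBER() OVER (PARTITION BY [Name], [Description], SomeDate, Etc
                              ORDER BY ID DESC) AS CopyNumber
    FROM #TestData
)
-- Deleting through the CTE removes the underlying rows where the
-- row number is greater than 1, i.e. everything but the highest ID
-- in each duplicate group.
DELETE FROM OrderedRows
WHERE CopyNumber > 1;
```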

DROP TABLE IF EXISTS #TestData;
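
With many millions of rows, the main cost is the sort for ROW_NUMBER(). An index keyed on the partitioning columns plus ID (descending) should let the rows stream in already-sorted order. Hypothetical names; adjust for the real table:

```sql
-- Hypothetical supporting index on the real table (dbo.YourTable is a
-- placeholder). Matches the PARTITION BY columns, then ID DESC to match
-- the ORDER BY, so ROW_NUMBER() can avoid an explicit sort.
CREATE INDEX IX_YourTable_DupeCheck
    ON dbo.YourTable ([Name], [Description], SomeDate, Etc, ID DESC);
```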

Regards,

Greg

Dr Greg Low

1300SQLSQL (1300 775 775) office | +61 419201410 mobile
SQL Down Under | Web: https://sqldownunder.com |
About Greg:  https://about.me/greg.low

From: Tom P via ozdotnet <ozdotnet@ozdotnet.com>
Sent: Thursday, 17 November 2022 5:01 PM
To: ozDotNet <ozdotnet@ozdotnet.com>
Cc: Tom P <tompbi...@gmail.com>
Subject: [OT] Finding duplicate rows in Sql Server

Apologies if this is probably basic for most of you, but I just can't get my
head around it.

I have a flat table in SQL Server which contains lots of duplicates, differing
only by one column.

Id,Name,Desc,Date,Etc
1,abc,abc abc,2022-11-17,a
2,abc,abc abc,2022-11-17,a
5,def,def def,2022-11-17,a
4,abc,abc abc,2022-11-17,a
3,def,def def,2022-11-17,a
6,xyz,def def,2022-11-17,a

I'm trying to write a query that finds all duplicates excluding the ones with 
the highest Id. So for the above example it would return the following:

Id,Name,Desc,Date,Etc
1,abc,abc abc,2022-11-17,a
2,abc,abc abc,2022-11-17,a
3,def,def def,2022-11-17,a

There are many millions of rows to process so looking for something efficient. 
Any advice would be appreciated.

Regards
Tom
