Re: [OT] Finding duplicate rows in Sql Server

2022-11-17 Thread Tony McGee
Also be aware that there's a gotcha when using WHERE abc NOT IN (SELECT 
xyz FROM ...) that has bitten me a few times in the past... 
Perhaps not an issue in this scenario because it's an ID and likely NOT 
NULL, but if the select subquery ever returns a NULL value for xyz the 
condition evaluates to false and may give you unexpected results.  WHERE 
abc IN (SELECT xyz FROM ...) doesn't have the same problem, it 
effectively ignores the NULLs.


cheers,
Tony


On 17/11/2022 16:24, Alan Ingleby via ozdotnet wrote:

If the ID is unique across all records,

SELECT * FROM  WHERE ID NOT IN (SELECT MAX(ID) FROM 
GROUP BY NAme,Desc,Date,Etc)


On Thu, 17 Nov 2022 at 16:02, Tom P via ozdotnet 
 wrote:


Apologies if this is basic for probably most of you but I just
can't get my head around it.

I have a flat table in sql server which contains lots of
duplicates, differing only by one column.

Id,Name,Desc,Date,Etc
1,abc,abc abc,2022-11-17,a
2,abc,abc abc,2022-11-17,a
5,def,def def,2022-11-17,a
4,abc,abc abc,2022-11-17,a
3,def,def def,2022-11-17,a
6,xyz,def def,2022-11-17,a

I'm trying to write a query that finds all duplicates _excluding
the ones with the highest Id_. So for the above example it would
return the following:

Id,Name,Desc,Date,Etc
1,abc,abc abc,2022-11-17,a
2,abc,abc abc,2022-11-17,a
3,def,def def,2022-11-17,a

There are many millions of rows to process so looking for
something efficient. Any advice would be appreciated.

Regards
Tom

-- 
ozdotnet mailing list

To manage your subscription, access archives:
https://codify.mailman3.com/ 




--
Alan Ingleby



Re: [OT] Finding duplicate rows in Sql Server

2022-11-17 Thread Tom P
Yikes looks so simple once the answer is here haha. I’ll check in the
morning if the Id is unique across the board. Thanks I appreciate the help.

Regards
Tom

On Thu, 17 Nov 2022 at 17:27, Alan Ingleby via ozdotnet <
ozdotnet@ozdotnet.com> wrote:

> If the ID is unique across all records,
>
> SELECT * FROM  WHERE ID NOT IN (SELECT MAX(ID) FROM
> GROUP BY NAme,Desc,Date,Etc)
>
> On Thu, 17 Nov 2022 at 16:02, Tom P via ozdotnet 
> wrote:
>
>> Apologies if this is basic for probably most of you but I just can't get
>> my head around it.
>>
>> I have a flat table in sql server which contains lots of duplicates,
>> differing only by one column.
>>
>> Id,Name,Desc,Date,Etc
>> 1,abc,abc abc,2022-11-17,a
>> 2,abc,abc abc,2022-11-17,a
>> 5,def,def def,2022-11-17,a
>> 4,abc,abc abc,2022-11-17,a
>> 3,def,def def,2022-11-17,a
>> 6,xyz,def def,2022-11-17,a
>>
>> I'm trying to write a query that finds all duplicates *excluding the
>> ones with the highest Id*. So for the above example it would return the
>> following:
>>
>> Id,Name,Desc,Date,Etc
>> 1,abc,abc abc,2022-11-17,a
>> 2,abc,abc abc,2022-11-17,a
>> 3,def,def def,2022-11-17,a
>>
>> There are many millions of rows to process so looking for something
>> efficient. Any advice would be appreciated.
>>
>> Regards
>> Tom
>>
>> --
>> ozdotnet mailing list
>> To manage your subscription, access archives:
>> https://codify.mailman3.com/
>
>
>
> --
> Alan Ingleby
> --
> ozdotnet mailing list
> To manage your subscription, access archives: https://codify.mailman3.com/

-- 
Thanks
Tom