On 2018/10/11 5:59 PM, Dominique Devienne wrote:
I can find duplicates fine:
select xmd.partId, parts.title, xmd.name,
count(*) "#dupplicates",
group_concat(xmd.value) "values",
group_concat(xmd.idx) "indexes"
from extra_meta_data xmd
join parts on parts.id = xmd.partId
group by partId, name
having "#dupplicates" > 1;
but most actual duplicates have the same value, so are harmless.
so I'd like to select only the xmd.name rows which have differing values.
Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
enough I think.
Any hints on how to go about this problem please? Thanks, --DD
2 Ways to solve this that comes to mind initially...
Option 1 - Self join on the Index field and dissimilar second field, and
group,
Option 2 - Use the fancy new Window functions to distill the real
duplicates from the don't-matter duplicates and then count the
non-distinct items.
These two methods shown below:
-- SQLite version 3.25.1 [ Release: 2018-09-18 ] on SQLitespeed
version 2.1.1.37.
--
================================================================================================
CREATE TABLE t(a,b);
INSERT INTO t(a,b) VALUES
(1,300),
(1,100), -- This is a real duplicate
(1,300),
(1,300),
(2,500),
(2,500), -- This duplicate does not matter
(3,400),
(3,500), -- This is again a real duplicate
(3,400)
;
-- This means we need to be told about Index 1 and 3 which contain
real duplicates.
-- Option 1:
SELECT DISTINCT t1.a FROM t AS t1 JOIN t AS t2 ON t2.a=t1.a AND t2.b<>t1.b
;
-- a
-- ------------
-- 1
-- 3
-- Option 2:
SELECT a FROM (SELECT DISTINCT a,MAX(b) OVER (PARTITION BY a,b) FROM t)
GROUP BY a HAVING COUNT(*)>1
;
-- a
-- ------------
-- 1
-- 3
HTH,
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users