On 2018/10/11 5:59 PM, Dominique Devienne wrote:
I can find duplicates fine:

select xmd.partId, parts.title, xmd.name,
        count(*) "#dupplicates",
        group_concat(xmd.value) "values",
        group_concat(xmd.idx) "indexes"
   from extra_meta_data xmd
   join parts on parts.id = xmd.partId
   group by partId, name
  having "#dupplicates" > 1;

but most actual duplicates have the same value, so are harmless.
so I'd like to select only the xmd.name rows which have differing values.

Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
enough I think.
Any hints on how to go about this problem please? Thanks, --DD

2 Ways to solve this that comes to mind initially...

Option 1 - Self join on the Index field and dissimilar second field, and group, Option 2 - Use the fancy new Window functions to distill the real duplicates from the don't-matter duplicates and then count the non-distinct items.
These two methods shown below:


  -- SQLite version 3.25.1  [ Release: 2018-09-18 ]  on SQLitespeed version 2.1.1.37.   -- ================================================================================================

CREATE TABLE t(a,b);

INSERT INTO t(a,b) VALUES
(1,300),
(1,100),  -- This is a real duplicate
(1,300),
(1,300),
(2,500),
(2,500),  -- This duplicate does not matter
(3,400),
(3,500),  -- This is again a real duplicate
(3,400)
;
  -- This means we need to be told about Index 1 and 3 which contain real duplicates.


-- Option 1:
SELECT DISTINCT t1.a FROM t AS t1 JOIN t AS t2 ON t2.a=t1.a AND t2.b<>t1.b
;

  --       a
  -- ------------
  --       1
  --       3

-- Option 2:
SELECT a FROM (SELECT DISTINCT a,MAX(b) OVER (PARTITION BY a,b) FROM t) GROUP BY a HAVING COUNT(*)>1
;

  --       a
  -- ------------
  --       1
  --       3


HTH,
Ryan

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to