Hi,

Thank you.

In that case, is there a way to create a table which contains only the
records that have duplicate F1 values?

Thanks



> -----Original Message-----
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of R Smith
> Sent: Friday, February 19, 2016 5:41 PM
> To: sqlite-users at mailinglists.sqlite.org
> Subject: Re: [sqlite] Process duplicate field values
> 
> 
> 
> On 2016/02/19 8:00 AM, admin at shuling.net wrote:
> > Hi,
> >
> >
> >
> > I create a table as follows:
> >
> >
> >
> > CREATE TABLE MyTable (F1 INTEGER, F2 INTEGER);
> >
> >
> >
> > Then add the following records:
> >
> >
> >
> > INSERT INTO MyTable (F1, F2) Values (1, 2);
> >
> > INSERT INTO MyTable (F1, F2) Values (1, 3);
> >
> > INSERT INTO MyTable (F1, F2) Values (2, 4);
> >
> > INSERT INTO MyTable (F1, F2) Values (2, 5);
> >
> > INSERT INTO MyTable (F1, F2) Values (3, 6);
> >
> > INSERT INTO MyTable (F1, F2) Values (3, 7);
> >
> > INSERT INTO MyTable (F1, F2) Values (4, 2);
> >
> >
> >
> > Now if two records have the same value of F1, then I will define them
> > as conflict records.
> >
> >
> >
> > Now I need to perform the following tasks:
> >
> >
> >
> >      1. For all conflict records, get the total count of distinct F1
values.
> > In the above sample, record 1, 2, 3, 4, 5, 6 are conflict records, but
> > the distinct values are only 1, 2, 3 so the total count should be 3.
> >      2. Get the total count of all the conflict records. In the above
> > sample, it should be 6.
> >      3. Set the F2 value of all the conflict records to 9. Keep all
> > other records intact.
> >
> >
> >
> > How to do that? Can task 2 and 3 be implemented in one SQL query to
> > improve the performance?
> 
> I think you meant to use the word "Duplicate" where you said "Distinct".
> You need the number of Duplicate F1 records, which is 3.
> 
> Then you need the total count of records that might be duplicated for F1,
> which is 6 in the above case.
> 
> This all is easy and Hick's solution will work.
> 
> As to the question of doing 2 and 3 together, how can that ever be? 3 does
> an update, and 2 expects a return value. There is no mix and match updates
> and selects (Though Postgres has a great way of doing a bit of
> both) - but apart from some convenience to the query creator, it has no
real
> efficiency advantage, which is why it's not really done.
> 
> Best efficiency would be a temp table to avoid multiple walking of the
original
> table and be very fast for even large datasets - I would advise this way
> perhaps:
> 
> 
> ------------------------------------------------
>    -- Processing SQL in: E:\Documents\SQLiteAutoScript.sql
>    -- SQLite version 3.9.2  [ Release: 2015-11-02 ]  on SQLitespeed
version
> 2.0.2.4.
> 
>    -- Script Items: 10         Parameter Count: 0
>    -- 2016-02-19 11:35:58.596  |  [Info]       Script Initialized,
> Started executing...
>    --
> ==========================================================
> ======================================
> 
> 
> CREATE TABLE MyTable (F1 INTEGER, F2 INTEGER);
> 
> INSERT INTO MyTable (F1, F2) Values
>   (1, 2)
> ,(1, 3)
> ,(2, 4)
> ,(2, 5)
> ,(3, 6)
> ,(3, 7)
> ,(4, 2)
> ;
> 
> CREATE TEMP TABLE F1Dups (F1_ID INTEGER PRIMARY KEY, F1_Count INT);
> 
> INSERT INTO F1Dups SELECT DISTINCT B.F1, 0
>    FROM MyTable AS A
>   INNER JOIN MyTable AS B ON B.F1 = A.F1
>   WHERE  B.rowid <> A.rowid;
> 
> UPDATE F1Dups SET F1_Count = (SELECT COUNT(*) FROM MyTable WHERE
> MyTable.F1=F1Dups.F1_ID);
> 
> 
> SELECT COUNT(*) AS Dist_Confl FROM F1Dups;
> 
>    --  Dist_Confl
>    -- ------------
>    --       3
> 
> SELECT SUM(F1_Count) AS All_Conf FROM F1Dups;
> 
>    --   All_Conf
>    -- ------------
>    --       6
> 
> 
> UPDATE MyTable SET F2=9 WHERE F1 IN (SELECT F1_ID FROM F1Dups);
> 
> SELECT * FROM MyTable;
> 
>    --  F1 |  F2
>    -- --- | ---
>    --  1  |  9
>    --  1  |  9
>    --  2  |  9
>    --  2  |  9
>    --  3  |  9
>    --  3  |  9
>    --  4  |  2
> 
> DROP TABLE F1Dups;
> 
> 
>    --   Script Stats: Total Script Execution Time:     0d 00h 00m and
> 00.031s
>    --                 Total Script Query Time:         -- --- --- ---
> --.----
>    --                 Total Database Rows Changed:     19
>    --                 Total Virtual-Machine Steps:     717
>    --                 Last executed Item Index:        10
>    --                 Last Script Error:
>    --
>
----------------------------------------------------------------------------
------------------
> --
> 
> 
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to