On 2016/06/28 10:52 PM, j...@fenway.com wrote:
Thank you, Ryan. I'm embarassed to say I don't know how to reply to your 
response. When I posted my original question, I simply sent email to the 
mailing list. I viewed your response on Nabble, but it doesn't seem to allow me 
to reply. How can I view your response and reply somewhere other than Nabble?

My pleasure

I'm now also embarrassed to say, I have no idea how Nabble works, I simply responded to the mail from the SQLite-users mailing list - sometimes GMail filters my mails, so some people do not see it on the mailing list. I will send this reply to the SQLite-users mailing list for completeness.



To the point, your query to find duplicate records showed there were some.

I removed the duplicates and the SQL as shown in my original post worked. I
don't know what you mean about the insert statement being incorrect. It
explicitly sets all fields in the new table except rowid. My understanding
was that since rowid is INTEGER PRIMARY KEY, inserts to the new table would
get new rowid values, in the order inserted, which is what I want, and it
seems to be doing that correctly.

Yes, perfect - that field set was my mistake, not sue how I looked, but at the moment I was looking I did not notice all the fields were in the prototype for the Insert. It is however and should work perfectly - my apologies.

The query I was using to find duplicates is one that I found on
StackOverflow. I'm very new to SQL. Can you tell me why this query returns
no records while yours does? If I change the last clause to having
count(*)=1, the result has the same number of rows as the table. I thought
that meant it was doing what I wanted, but obviously it doesn't.

select udatetime,device,localtime,code,type,text,0 as status, count(*) as
the_count
from eventlog
group by udatetime,device,code,type,status
having count(*) > 1


This Query is a valiant effort from someone on StackOverflow, but it assumes homogeneous data with respect to character sets encoding and casing - i.e. what qualifies as a Duplicate in constraint terms doesn't qualify as a grouped item necessarily. This is why I volunteered that SQL. To demonstrate, please consider this script:

-- SQLite version 3.9.2 [ Release: 2015-11-02 ] on SQLitespeed version 2.0.2.4.

  -- Script Items: 5          Parameter Count: 0
-- 2016-06-28 23:09:52.846 | [Info] Script Initialized, Started executing... -- ================================================================================================

CREATE TABLE t(A TEXT COLLATE NOCASE, BTEXT COLLATE NOCASE);

INSERT INTO t VALUES
('John','Smith'),
('john','Smith'),
('JOHN','Smith'),
('Johnson','Smithers'),
('Babe','Ruth');


SELECT * FROM t;

  -- A         | B
  -- --------- | ----------
  -- John      | Smith
  -- john      | Smith
  -- JOHN      | Smith
  -- Johnson   | Smithers
  -- Babe      | Ruth


SELECT A, MAX(B) FROM t GROUP BY A ;

  -- A         | MAX(B)
  -- --------- | ----------
  -- Babe      | Ruth
  -- JOHN      | Smith
  -- John      | Smith
  -- Johnson   | Smithers
  -- john      | Smith


SELECT MAX(A), B FROM t GROUP BY B

  -- MAX(A)    | B
  -- --------- | ----------
  -- Babe      | Ruth
  -- john      | Smith
  -- Johnson   | Smithers


And that Table will obviously fail a UNIQUE constraint, but taht is not obvious from the grouping.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to