>>>>> Jim Morris writes: >>>>> On 10/6/2011 10:43 PM, Ivan Shmakov wrote: >>>>> Jim Morris writes:
[…] >>> INSERT INTO fts3_table (a,b,c) >>> SELECT 'an A','a B','a C' >>> WHERE NOT EXISTS >>> (SELECT DISTINCT a,b,c >>> FROM fts3_table >>> WHERE a='an A' AND b='a B' AND c='a C'); >> However, I wonder, would the following (slightly more concise) query >> imply any performance loss in comparison to the one above? >> INSERT INTO fts3_table (a, b, c) >> SELECT 'an A', 'a B', 'a C' >> EXCEPT SELECT DISTINCT a, b, c >> FROM fts3_table; >> Also, I'm curious if DISTINCT may cause any performance loss in the >> case that the columns in question are constrained by an UNIQUE >> index? Like: >> CREATE UNIQUE INDEX "foo-unique" >> ON "foo" (a, b, c); > I don't know for sure. You would need to do some testing to > determine performance issues. I wouldn't use distinct if the values > are already guaranteed to be unique As per EXPLAIN, SELECT DISTINCT indeed doesn't take the uniqueness constraint into consideration. > or in a not exits clause since it may have additional overhead. Surprisingly enough, when used in the EXCEPT part of the query, DISTINCT has no effect on the VM code (as shown by EXPLAIN)! At a first glance, the code for the NOT EXISTS variant (sans INSERT; hopefully it doesn't matter) seems simpler (one operation less in total, one less Next, has no OpenEphemeral's, and IdxGE instead of IdxInsert and IdxDelete): addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Trace 0 0 0 00 1 Goto 0 21 0 00 2 OpenRead 0 2 1 1 00 3 Rewind 0 19 0 00 4 Integer 0 2 0 00 5 Integer 1 3 0 00 6 OpenRead 2 3 0 keyinfo(1,BINARY) 00 7 Column 0 0 4 00 8 IsNull 4 14 0 00 9 SeekGe 2 14 4 1 00 10 IdxGE 2 14 4 1 01 11 Integer 1 2 0 00 12 IfZero 3 14 -1 00 13 Next 2 10 0 00 14 Close 2 0 0 00 15 If 2 18 1 00 16 Column 0 0 7 00 17 ResultRow 7 1 0 00 18 Next 0 4 0 01 19 Close 0 0 0 00 20 Halt 0 0 0 00 21 Transaction 0 0 0 00 22 VerifyCookie 0 2 0 00 23 Transaction 1 0 0 00 24 VerifyCookie 1 1 0 00 25 TableLock 1 2 0 temp.staged 00 26 TableLock 0 2 0 foo 00 27 Goto 0 2 0 00 The code for the EXCEPT variant is: addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Trace 0 0 0 00 1 OpenEphemeral 2 1 0 keyinfo(1,BINARY) 00 2 Goto 0 22 0 00 3 OpenRead 1 2 1 1 00 4 Rewind 1 9 0 00 5 Column 1 0 1 00 6 MakeRecord 1 1 2 00 7 IdxInsert 2 2 0 00 8 Next 1 5 0 01 9 Close 1 0 0 00 10 OpenRead 0 2 0 2 00 11 Rewind 0 15 0 00 12 Column 0 1 1 00 13 IdxDelete 2 1 1 00 14 Next 0 12 0 01 15 Close 0 0 0 00 16 Rewind 2 20 0 00 17 Column 2 0 3 00 18 ResultRow 3 1 0 00 19 Next 2 17 0 00 20 Close 2 0 0 00 21 Halt 0 0 0 00 22 Transaction 0 0 0 00 23 VerifyCookie 0 2 0 00 24 Transaction 1 0 0 00 25 VerifyCookie 1 1 0 00 26 TableLock 1 2 0 temp.staged 00 27 TableLock 0 2 0 foo 00 28 Goto 0 3 0 00 The SQL code used for testing is as follows. BEGIN; CREATE TABLE "foo" ( "id" INTEGER PRIMARY KEY, "bar" TEXT NOT NULL); CREATE UNIQUE INDEX "foo-unique" ON "foo" ("bar"); INSERT INTO "foo" ("bar") VALUES ('one'); INSERT INTO "foo" ("bar") VALUES ('two'); INSERT INTO "foo" ("bar") VALUES ('three'); CREATE TEMP TABLE "temp.staged" ("bar" TEXT NOT NULL); INSERT INTO "temp.staged" ("bar") VALUES ('one'); INSERT INTO "temp.staged" ("bar") VALUES ('four'); .explain ON EXPLAIN SELECT DISTINCT "bar" FROM "foo"; EXPLAIN SELECT "bar" FROM "foo"; -- INSERT INTO "foo" ("bar") EXPLAIN SELECT "bar" FROM "temp.staged" EXCEPT SELECT DISTINCT "bar" FROM "foo"; -- produces exactly the same output as the one above: EXPLAIN SELECT "bar" FROM "temp.staged" EXCEPT SELECT "bar" FROM "foo"; EXPLAIN SELECT s."bar" FROM "temp.staged" s WHERE NOT EXISTS (SELECT 1 FROM "foo" f WHERE f."bar" = s."bar"); -- SELECT * -- FROM "foo"; END; PS. As of SQLite version 3.7.3 debian 1. -- FSF associate member #7257 _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users