Jamie Norrish wrote:
> Is there any optimisation done for the case of a symmetric EXCEPT, to
> avoid performing duplicate SELECT statements?
Some subqueries can be flattened by SQLite, but otherwise, subqueries
aren't really optimized at all.
> (SELECT A... EXCEPT SELECT B...) UNION (SELECT B... EXCEPT SELECT A...)
>
> would seem to be a poor approach to getting the set of rows in either A
> or B. Are there potentially better generic approaches, or other tricks
> I've missed?
The description "the set of rows in A or B, except those that are in
both A and B" leads to
(SELECT A... UNION SELECT B...) EXCEPT (SELECT A... INTERSECT SELECT B...)
There doesn't seem to be much of a difference in practice; performance
probably depends on the size of the subquery results:
sqlite> explain query plan select * from a except select * from b union all
select * from (select * from b except select * from a);
sele order from deta
---- ------------- ---- ----
2 0 0 SCAN TABLE a (~1000000 rows)
3 0 0 SCAN TABLE b (~1000000 rows)
1 0 0 COMPOUND SUBQUERIES 2 AND 3 USING TEMP B-TREE
(EXCEPT)
6 0 0 SCAN TABLE b (~1000000 rows)
7 0 0 SCAN TABLE a (~1000000 rows)
5 0 0 COMPOUND SUBQUERIES 6 AND 7 USING TEMP B-TREE
(EXCEPT)
4 0 0 SCAN SUBQUERY 5 (~1000000 rows)
0 0 0 COMPOUND SUBQUERIES 1 AND 4 (UNION ALL)
sqlite> explain query plan select * from a union all select * from b except
select * from (select * from a intersect select * from b);
sele order from deta
---- ------------- ---- ----
2 0 0 SCAN TABLE a (~1000000 rows)
3 0 0 SCAN TABLE b (~1000000 rows)
1 0 0 COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
6 0 0 SCAN TABLE a (~1000000 rows)
7 0 0 SCAN TABLE b (~1000000 rows)
5 0 0 COMPOUND SUBQUERIES 6 AND 7 USING TEMP B-TREE
(INTERSECT)
4 0 0 SCAN SUBQUERY 5 (~1000000 rows)
0 0 0 COMPOUND SUBQUERIES 1 AND 4 USING TEMP B-TREE
(EXCEPT)
If the rows can be represented by some key column(s), the lookups could
be implemented with indexes that could be created in advance, instead of
with temporary B-trees:
sqlite> explain query plan select * from a where not exists (select 1 from b
where b.id = a.id)
union all select * from b where not exists (select 1 from a
where a.id = b.id);
sele order from deta
---- ------------- ---- ----
1 0 0 SCAN TABLE a (~500000 rows)
1 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 2
2 0 0 SEARCH TABLE b USING AUTOMATIC COVERING INDEX (id=?)
(~7 rows)
3 0 0 SCAN TABLE b (~500000 rows)
3 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 4
4 0 0 SEARCH TABLE a USING AUTOMATIC COVERING INDEX (id=?)
(~7 rows)
0 0 0 COMPOUND SUBQUERIES 1 AND 3 (UNION ALL)
Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users