[sqlite] Ticket 3128: DISTINCT -> GROUP BY can lead to performance degradation

Bradley A. Town Tue, 20 May 2008 00:11:39 -0700

I've only found one reference to slower queries with the DISTINCT/GROUP 
BY optimization that went in back in November for 3.5.3 and later.  I 
would have expected more given the number of our queries causing problems.


The problem as I wrote in ticket 3128 
<http://www.sqlite.org/cvstrac/tktview?tn=3128> appears to be with 
queries that use DISTINCT and LIMIT.  If the query is somewhat slow (in 
my example, it's joining several large tables), the new DISTINCT 
(implemented in the code as GROUP BY) can be much, much slower because 
it (apparently) collects all or most of the rows before applying GROUP 
BY, even when there's a LIMIT.  Before the change (3.5.2 and earlier), 
DISTINCT was aided by the LIMIT quite a bit.  In my example, the query 
takes around 0.02 seconds in 3.5.2 and earlier, 3.1 seconds in 3.5.3 and 
later.

Has anyone else seen similar behavior?  If so, have you found a 
workaround?  My workaround is actually a patch to the source to disable 
the optimization when a LIMIT is given, though there might be cases 
where this is undesirable (like maybe cases where the table has indices 
that can be used by GROUP BY and the LIMIT is sufficiently high).

Brad Town

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Ticket 3128: DISTINCT -> GROUP BY can lead to performance degradation

Reply via email to