Re: [sqlite] Slow query, with correlated sub-sub-query

Keith Medcalf Fri, 07 Jul 2017 12:57:51 -0700

Well of course.  You are aware that a correlated subquery means "for each 
candidate result execute the query"?


So as you have formulated the query it means:

for each row in a
    compute the result count which
         for each ab candidate row
             calculate whether it is the minimum
    
which means that the you have requested that the same result be computed many 
times over.  You have requested exampination of count(a) * count(ab) * 
count(ab) rows. 

Instead you should be computing the min(size) for each group of a once, and 
using that value in the correlated subquery

select a.a,
       (
           select count(*)
             from ab
            where a == a.a
              and size == b.size
       ) as acount
  from a,
       (
           select a,
                  min(size) as size
             from ab
         group by a
        ) as b
where a.a == b.a;

This will result in scanning count(ab) + count(a) * count(ab) rows.  Which is 
significantly less.  On my computer it reduces the execution time of the 
original query you posited from 400 ticks to less than 1 tick (ie, from 500 ms 
to <8 ms)

I do not know if any optimizer can flatten you original query to any 
significant degree.  Some optimizers may arrive at my fixed up query because 
they are capable of doing a hash table lookup on the result of the innermost 
correlate.  SQLite does not do that, and without that capability I do not think 
there is a relational database query optimizer on the planet that can help you.

-- 
˙uʍop-ǝpısdn sı ɹoʇıuoɯ ɹnoʎ 'sıɥʇ pɐǝɹ uɐɔ noʎ ɟı

> -----Original Message-----
> From: sqlite-users [mailto:[email protected]]
> On Behalf Of E.Pasma
> Sent: Friday, 7 July, 2017 07:47
> To: SQLite mailing list
> Subject: [sqlite] Slow query, with correlated sub-sub-query
> 
> Hello, below is a theoretical query that becomes slow when the number
> of rows increases. What it does is:
> - scan input cases in table a
> - for each input case:
> -- determine the smallest value of attribute size of elements in table
> ab
> -- count the number of elements having this smallest size
> With 3 rows in table a and 3*1000 in ab this takes already several
> seconds.
> I'm not so much interested in an alternative solution, though
> interesting, and merely want to show an inefficient construction. That
> is a sub-sub-query correlated directly to the main query.
> Thanks, E. Pasma
> 
> .version
> SQLite 3.19.3 2017-06-08 14:26:17 ...
> 
> create table a (a, primary key (a))
> ;
> create table ab (a, b, size, primary key (a,b))
> ;
> insert into a
> with i as (select 1 as i union all select i+1 from i where i<3)
> select i from i
> ;
> insert into ab
> with i as (select 1 as i union all select i+1 from i where i<1000)
> select a, i as b, random()%10 as size from a, i
> ;
> .eqp on
> .timer on
> select  a,
>         (
>             select  count(*)
>             from    ab
>             where   a=a.a
>             and     size=(select min(size) from ab where a=a.a)
>         )
> from    a
> ;
> --EQP-- 0,0,0,SCAN TABLE a
> --EQP-- 0,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 1
> --EQP-- 1,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?)
> --EQP-- 1,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 2
> --EQP-- 2,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?)
> 1|56
> 2|53
> 3|49
> Run Time: real 2.678 user 2.597794 sys 0.008801
> 
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Slow query, with correlated sub-sub-query

Reply via email to