[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-27 Thread Kirill Müller
On 27.11.2015 10:38, Clemens Ladisch wrote:
> Kirill M?ller wrote:
>> I see no reason why the following two queries can't be executed with the 
>> same plans:
>>
>> ... t1 LEFT JOIN t2 ...
>> ... t1 LEFT JOIN (SELECT * FROM t2) ...
> In this case, the queries are identical.  But SQLite's query optimizer
> does not try to optimize this because such trivial subqueries are
> (thought to be) unlikely to occur in practice.
Thanks. It seems to work better for inner joins, though. The practical 
application is a query generator that relies on the SQL engine to be 
able to optimize this. Is there any chance that SQLite will be able to 
treat such queries more efficiently?


-Kirill


[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-27 Thread Kirill Müller
Exactly. And I'd pretty much like SQLite to figure that out for me ;-)


-Kirill


On 27.11.2015 03:19, Keith Medcalf wrote:
> Would it not be more efficient to say:
>
> select 1 from t1 limit 1;
>
> ?
>
>> -Original Message-
>> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
>> bounces at mailinglists.sqlite.org] On Behalf Of Kirill M?ller
>> Sent: Thursday, 26 November, 2015 15:03
>> To: SQLite mailing list
>> Subject: Re: [sqlite] Query flattening for left joins involving subqueries
>> on the right-hand side
>>
>> On 26.11.2015 21:12, Clemens Ladisch wrote:
>>> Kirill M?ller wrote:
>>>> On 25.11.2015 16:32, Clemens Ladisch wrote:
>>>>> Kirill M?ller wrote:
>>>>>> For a left join with a subquery on the right-hand side, that subquery
>>>>>> doesn't seem to be flattened.
>>>>> This is rule 3 of <http://www.sqlite.org/optoverview.html#flattening>.
>>>> I wonder if this rule might be relaxed a bit.
>>> Only if you relax your requirement that the results must be correct.
>>>
>>>
>>> In the general case, a left outer join can be rewritten like this:
>>>
>>> SELECT ... FROM A JOIN B ON ...
>>> UNION ALL
>>> SELECT ... FROM A WHERE NOT EXISTS (look up in B)
>>>
>>> This query would be more likely to be flattenable, but also be slower.
>>>
>> Thanks. Let's not focus on terminology -- I thought "flattening" was the
>> right word to use, but it probably isn't. Of course I'm looking for
>> correct results.
>>
>> Originally, I attached a script but it seems that it's been stripped.
>> I've pasted it below. I see no reason why the following two queries (1
>> and 3 in the script) can't be executed with the same plans:
>>
>> SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN t2 USING (a) LIMIT 1)
>> SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) zzz2
>> USING (a) LIMIT 1)
>>
>> This is for two tables t1 and t2 with a single column "a". The script
>> creates them and populates them with 20 rows each.
>>
>>
>> -Kirill
>>
>>
>> #!/bin/bash
>>
>> db=test.sqlite
>>
>> #if false; then
>> rm -f $db
>>
>> n=20
>>
>> sqlite3 $db "CREATE TABLE t1 (a int primary key)"
>> seq 1 $n | sqlite3 $db ".import /dev/stdin t1"
>>
>> sqlite3 $db "CREATE TABLE t2 (a int primary key)"
>> seq 1 $n | sqlite3 $db ".import /dev/stdin t2"
>> #fi
>>
>> q() {
>>   sqlite3 $db "EXPLAIN QUERY PLAN $1"
>>   time sqlite3 $db "$1"
>> }
>>
>> q "SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN t2 USING (a) LIMIT 1)"
>> q "SELECT count(*) FROM (SELECT * FROM (SELECT * FROM t1) zzz1 LEFT JOIN
>> t2 USING (a) LIMIT 1)"
>> q "SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2)
>> zzz2 USING (a) LIMIT 1)"
>> q "SELECT count(*) FROM (SELECT * FROM t1 INNER JOIN (SELECT * FROM t2)
>> zzz2 USING (a) LIMIT 1)"
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-27 Thread Clemens Ladisch
Kirill M?ller wrote:
> I see no reason why the following two queries can't be executed with the same 
> plans:
>
> ... t1 LEFT JOIN t2 ...
> ... t1 LEFT JOIN (SELECT * FROM t2) ...

In this case, the queries are identical.  But SQLite's query optimizer
does not try to optimize this because such trivial subqueries are
(thought to be) unlikely to occur in practice.


Regards,
Clemens


[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-26 Thread Kirill Müller
On 26.11.2015 21:12, Clemens Ladisch wrote:
> Kirill M?ller wrote:
>> On 25.11.2015 16:32, Clemens Ladisch wrote:
>>> Kirill M?ller wrote:
 For a left join with a subquery on the right-hand side, that subquery
 doesn't seem to be flattened.
>>> This is rule 3 of .
>> I wonder if this rule might be relaxed a bit.
> Only if you relax your requirement that the results must be correct.
>
>
> In the general case, a left outer join can be rewritten like this:
>
>SELECT ... FROM A JOIN B ON ...
>UNION ALL
>SELECT ... FROM A WHERE NOT EXISTS (look up in B)
>
> This query would be more likely to be flattenable, but also be slower.
>
Thanks. Let's not focus on terminology -- I thought "flattening" was the 
right word to use, but it probably isn't. Of course I'm looking for 
correct results.

Originally, I attached a script but it seems that it's been stripped. 
I've pasted it below. I see no reason why the following two queries (1 
and 3 in the script) can't be executed with the same plans:

SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN t2 USING (a) LIMIT 1)
SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) zzz2 
USING (a) LIMIT 1)

This is for two tables t1 and t2 with a single column "a". The script 
creates them and populates them with 20 rows each.


-Kirill


#!/bin/bash

db=test.sqlite

#if false; then
rm -f $db

n=20

sqlite3 $db "CREATE TABLE t1 (a int primary key)"
seq 1 $n | sqlite3 $db ".import /dev/stdin t1"

sqlite3 $db "CREATE TABLE t2 (a int primary key)"
seq 1 $n | sqlite3 $db ".import /dev/stdin t2"
#fi

q() {
 sqlite3 $db "EXPLAIN QUERY PLAN $1"
 time sqlite3 $db "$1"
}

q "SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN t2 USING (a) LIMIT 1)"
q "SELECT count(*) FROM (SELECT * FROM (SELECT * FROM t1) zzz1 LEFT JOIN 
t2 USING (a) LIMIT 1)"
q "SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) 
zzz2 USING (a) LIMIT 1)"
q "SELECT count(*) FROM (SELECT * FROM t1 INNER JOIN (SELECT * FROM t2) 
zzz2 USING (a) LIMIT 1)"



[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-26 Thread Clemens Ladisch
Kirill M?ller wrote:
> On 25.11.2015 16:32, Clemens Ladisch wrote:
>> Kirill M?ller wrote:
>>> For a left join with a subquery on the right-hand side, that subquery
>>> doesn't seem to be flattened.
>>
>> This is rule 3 of .
>
> I wonder if this rule might be relaxed a bit.

Only if you relax your requirement that the results must be correct.


In the general case, a left outer join can be rewritten like this:

  SELECT ... FROM A JOIN B ON ...
  UNION ALL
  SELECT ... FROM A WHERE NOT EXISTS (look up in B)

This query would be more likely to be flattenable, but also be slower.


Regards,
Clemens


[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-26 Thread Keith Medcalf

Would it not be more efficient to say:

select 1 from t1 limit 1;

?

> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Kirill M?ller
> Sent: Thursday, 26 November, 2015 15:03
> To: SQLite mailing list
> Subject: Re: [sqlite] Query flattening for left joins involving subqueries
> on the right-hand side
> 
> On 26.11.2015 21:12, Clemens Ladisch wrote:
> > Kirill M?ller wrote:
> >> On 25.11.2015 16:32, Clemens Ladisch wrote:
> >>> Kirill M?ller wrote:
> >>>> For a left join with a subquery on the right-hand side, that subquery
> >>>> doesn't seem to be flattened.
> >>> This is rule 3 of <http://www.sqlite.org/optoverview.html#flattening>.
> >> I wonder if this rule might be relaxed a bit.
> > Only if you relax your requirement that the results must be correct.
> >
> >
> > In the general case, a left outer join can be rewritten like this:
> >
> >SELECT ... FROM A JOIN B ON ...
> >UNION ALL
> >SELECT ... FROM A WHERE NOT EXISTS (look up in B)
> >
> > This query would be more likely to be flattenable, but also be slower.
> >
> Thanks. Let's not focus on terminology -- I thought "flattening" was the
> right word to use, but it probably isn't. Of course I'm looking for
> correct results.
> 
> Originally, I attached a script but it seems that it's been stripped.
> I've pasted it below. I see no reason why the following two queries (1
> and 3 in the script) can't be executed with the same plans:
> 
> SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN t2 USING (a) LIMIT 1)
> SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) zzz2
> USING (a) LIMIT 1)
> 
> This is for two tables t1 and t2 with a single column "a". The script
> creates them and populates them with 20 rows each.
> 
> 
> -Kirill
> 
> 
> #!/bin/bash
> 
> db=test.sqlite
> 
> #if false; then
> rm -f $db
> 
> n=20
> 
> sqlite3 $db "CREATE TABLE t1 (a int primary key)"
> seq 1 $n | sqlite3 $db ".import /dev/stdin t1"
> 
> sqlite3 $db "CREATE TABLE t2 (a int primary key)"
> seq 1 $n | sqlite3 $db ".import /dev/stdin t2"
> #fi
> 
> q() {
>  sqlite3 $db "EXPLAIN QUERY PLAN $1"
>  time sqlite3 $db "$1"
> }
> 
> q "SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN t2 USING (a) LIMIT 1)"
> q "SELECT count(*) FROM (SELECT * FROM (SELECT * FROM t1) zzz1 LEFT JOIN
> t2 USING (a) LIMIT 1)"
> q "SELECT count(*) FROM (SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2)
> zzz2 USING (a) LIMIT 1)"
> q "SELECT count(*) FROM (SELECT * FROM t1 INNER JOIN (SELECT * FROM t2)
> zzz2 USING (a) LIMIT 1)"
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users





[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-26 Thread Kirill Müller
On 25.11.2015 16:32, Clemens Ladisch wrote:
> Kirill M?ller wrote:
>> For a left join with a subquery on the right-hand side, that subquery
>> doesn't seem to be flattened.
> This is rule 3 of .
Thanks, missed that. While true, I wonder if this rule might be relaxed 
a bit. The SQL is produced by a very generic SQL query generator, which 
just uses subqueries for everything table-like.


-Kirill


[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-25 Thread Clemens Ladisch
Kirill M?ller wrote:
> For a left join with a subquery on the right-hand side, that subquery
> doesn't seem to be flattened.

This is rule 3 of .


Regards,
Clemens


[sqlite] Query flattening for left joins involving subqueries on the right-hand side

2015-11-24 Thread Kirill Müller
Hi


For a left join with a subquery on the right-hand side, that subquery 
doesn't seem to be flattened. This seems to work well with an inner join.

I have attached a reprex. It creates two tables with $n rows and one ID 
column each (200k is enough to show substantial slowdown), and joins 
them with and without subqueries. The third example seems to create a 
suboptimal query plan and takes much longer than necessary to run. The 
output on my system is below the message.

Thanks for your attention.


Best regards

Kirill



1|0|0|SCAN TABLE t1
0|0|0|SCAN SUBQUERY 1
1

real0m0.003s
user0m0.000s
sys0m0.000s
1|0|0|SCAN TABLE t1
0|0|0|SCAN SUBQUERY 1
1

real0m0.003s
user0m0.000s
sys0m0.000s
2|0|0|SCAN TABLE t2
1|0|0|SCAN TABLE t1
1|1|1|SEARCH SUBQUERY 2 AS zzz2 USING AUTOMATIC COVERING INDEX (a=?)
0|0|0|SCAN SUBQUERY 1
1

real0m0.277s
user0m0.264s
sys0m0.012s
1|0|0|SCAN TABLE t1
1|1|1|SEARCH TABLE t2 USING COVERING INDEX sqlite_autoindex_t2_1 (a=?)
0|0|0|SCAN SUBQUERY 1
1

real0m0.003s
user0m0.000s
sys0m0.000s