Re: [sqlite] Returning NULL or empty values when the SELECT does not satisfy all of the query

2019-04-06 Thread James K. Lowden
On Fri, 5 Apr 2019 18:54:18 +
Jose Isaias Cabrera  wrote:

> Why does this work

I don't know what "work" means, but I can explain the difference.  With
an outer join, JOIN and WHERE are not the same.  In analyzing the
query, we consider JOIN before WHERE.  

> select
>  a.*, b.* from t as a LEFT JOIN z as b on a.a = b.f
> AND
>   a.idate = (select max(idate) from t where a = a.a)
> AND
>   b.idate = (select max(idate) from z where f = a.a)
> where a.a = 'p006'

In an outer join, the outer table -- think "outer" as Venn diagram --
is the "preserved" table.  All rows match, join criteria
notwithstanding. This component has no effect: 

>   a.idate = (select max(idate) from t where a = a.a)

because "a" is the outer table, and so all rows match, join criteria
notwithstanding.  

The inner table is a little different, and also different from WHERE in
an inner join.  On the inner table, JOIN restrictions can cause a row
not to match that otherwise would.  In your query: 

>   b.idate = (select max(idate) from z where f = a.a)

"b" rows that don't pass that test are rejected from the join.  The "a"
columns will appear (because all "a" rows do) and the "b" columns will
be NULL.  

Only once the join is completed do we consider WHERE: 

> where a.a = 'p006'

This restricts rows in the outer table.  

Now let's look at your other query.  

> select
>  a.*, b.* from t as a LEFT JOIN z as b on a.a = b.f

The *only* join criterion is a.a = b.f.  All rows passing that test are
subjected to WHERE: all "a" rows (because outer table) and "b"
rows with a matching "f".  

> where a.a = 'p006'

OK, same as #1.  

> AND
>   a.idate = (select max(idate) from t where a = a.a)

This further restricts the "a" table rows, unlike #1. 

> AND
>   b.idate = (select max(idate) from z where f = a.a)

This restricts produced rows to those passing the test.  Different from
#1, no row passes this test for which b.idate is NULL.  

Whenever you disallow NULL on the inner table of an outer join, you
effectively convert the join from outer to inner.  You're asking for:

1.  all rows in "a", whether or not they match a "b" row, provided
2.  they do match a "b" row (because b.idate cannot be NULL)

You can vote for anyone in either party from this list of Democrats.  

Because both AND clauses restrict the output, we can expect the 2nd
query to produce fewer rows.  If it doesn't, there's no need for an
outer query for the data in their current state.  

By the way, the distinction of JOIN and WHERE is not a relational
concept.  It was added to SQL-92 in part to provide exactly the
separation your question illustrates: how to express an outer join.  

Relationally, that's not needed.  An outer join is nothing but an inner
join + a union with the outer table.  That's cumbersome to express
in SQL, and anything cumbersome to express is hard to optimize, and
query optimization was and is the unsolved technical challenge of SQL.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Reporting two more interesting queries with performance regression

2019-04-06 Thread Jinho Jung
Hello,

We are submitting the third report with two more interesting cases with
bisecting result using "SQL-Perf-Fuzzer".

This is the timeline of our activities:

  [Apr 1, 2019] Submitted 1st report (3 queries, 1 bisected commits)
  (
http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083863.html
)
==> confirmed the problem with correct bisect result (Apr/1)

  [Apr 2, 2019] Submitted 2nd report: 2 queries, 2 unique bisected commits
  (
http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083868.html
)
==> not yet confirmed

  [Apr 5, 2019] Submitted 3rd report (2 queries, 2 unique bisected commits)
==> this report


Here are the steps for reproducing our observations. All steps are same
except for the link to download new test-cases:

[Our test environment]
* Ubuntu 16.04 machine "Linux sludge 4.4.0-116-generic #140-Ubuntu SMP Mon
Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux"
* Database: TPC-C benchmark

[Setup Test Environment]

1. build SQLite 3.27.2 (verion of Feb 2019)
  $ wget https://www.sqlite.org/2019/sqlite-src-3270200.zip
  $ unzip sqlite-src-3270200.zip
  $ mv sqlite-src-3270200 sqlite327
  $ cd sqlite327
  $ ./configure
  $ make
  $ cd ..

2. build SQLite 3.23.0 (verion of Apr 2018)
  $ wget https://www.sqlite.org/2018/sqlite-src-323.zip
  $ unzip sqlite-src-323.zip
  $ mv sqlite-src-323 sqlite323
  $ cd sqlite323
  $ ./configure
  $ make
  $ cd ..

3. download tpc-c for sqlite3 (scale-factor of 1)
  $ mkdir testcase
  $ cd testcase

  $ wget https://gts3.org/~/jjung/sqlite/tpcc_sqlite.tar.gz
  $ tar xzvf tpcc_sqlite.tar.gz

; download regression queries
  $ wget https://gts3.org/~/jjung/sqlite/report3.tar.gz
  $ tar xzvf report1.tar.gz
  $ cd ..

4. launch two SQLites
  - start
$ sqlite327/sqlite3 testcase/test.db
$ sqlite323/sqlite3 testcase/test.db

  - for each DB, set up timer
sqlite> .timer on

 - copy and paste extracted queries


Here’s the time taken to execute four SQL queries on old (v3.23) and newer
version (v3.27.2) of SQLite (in milliseconds). We also try bisecting to
know which commit activate the regression.

+--++
| Query|   Time |
+--++
| 10002.sql (v3.23)|789 |
| 10002.sql (v3.27.2)  |   1270 |
+--++
| 10052.sql (v3.23)|   3094 |
| 10052.sql (v3.27.2)  |   4478 |
+--++

1) 10002.sql shows 60% performance regression
 - bisect fossil commit:
  === 2018-12-31 ===
  [9fb646f29c] *MERGE* Merge enhancements and bug fixes from trunk. (user:
drh tags: reuse-
   subqueries)

2) 10052.sql shows 40% performance regression
 - bisect fossil commit:
  === 2018-12-24 ===
  [7153552bac] Improvements to EXPLAIN QUERY PLAN formatting. The
MULTI-INDEX OR now
   shows a separate "INDEX" subtree for each index. SCALAR SUBQUERY entries
provide a
   subquery number that is related back to the .selecttrace output. (user:
drh tags: reuse-
   subqueries)

Thanks for your checking and support.

Best regards,
Jinho Jung
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users