Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-11 Thread J. King
Doh. I meant the linked document on the error log. Silly me. 



On September 11, 2017 9:41:39 PM EDT, "J. King"  wrote:
>There's an extra word in the first paragraph of Section 4 of that
>document, by the way:
>
>" The error logger callback has also proven useful in catching errors
>occasional errors that the application misses..."
>
>On September 11, 2017 11:22:50 AM EDT, Dan Kennedy
> wrote:
>>On 09/10/2017 08:30 PM, R Smith wrote:
>>> Well yes but the documentation suggests that one could expect a
>>slight 
>>> degradation. The words "works best with" does not seem to imbue an 
>>> idea of "give WITHOUT ROWID tables a wide berth when your tables are
>
>>> more than few columns wide", and I don't think the Devs intended
>that
>>
>>> either.
>>>
>>> I can still roll with the idea that the WITHOUT ROWID tables with
>151
>>
>>> columns and lots of data actually performed FASTER than the same
>>ROWID 
>>> tables for the other queries but slower with the JOIN... but that
>>much 
>>> slower?
>>
>>
>>
>>The difference seems to be that, currently, SQLite never creates 
>>automatic indexes on WITHOUT ROWID tables.
>>
>>   http://sqlite.org/optoverview.html#autoindex
>>
>>I don't think there is a good reason that it cannot do so. It just 
>>doesn't yet.
>>
>>Dan.
>>
>>
>>
>>
>>
>>>
>>> I mean we are not talking a "little" slower, we are talking 50ms vs.
>
>>> ~70,000ms, that's a difference of a whopping 3 orders of magnitude
>>and 
>>> change. And we are not talking huge tables, another query that
>simply
>>
>>> walks the tables can do so several million times in the same time
>the
>>
>>> JOIN query does.
>>>
>>> Put another way, I can create 100 new tables and populate them each 
>>> with the rows from one test table, then delete the rows from each
>new
>>
>>> table that doesn't satisfy an EXISTS() check in the other test table
>
>>> (essentially mimicking the JOIN query) and then output each full new
>
>>> table, 100 of them in turn, and then DROP them all. SQlite can do
>ALL
>>
>>> of that in a fraction of the time that the normal JOIN query
>(between
>>
>>> those same two test tables) takes to complete.
>>>
>>> We are talking a formula 1 car suddenly going at max speed of 1 
>>> mile-per-weekend, barely keeping up with a semi-athletic snail, and 
>>> people suggest checking the fuel octane rating. I'm saying there is 
>>> something wrong under the hood.
>>>
>>> Cheers,
>>> Ryan
>>>
>>>
>>> On 2017/09/10 11:28 AM, Clemens Ladisch wrote:
 R Smith wrote:
> I am using 151 columns for both tests. The only thing that changes
> between the two scripts are the words "WITHOUT ROWID" being added
  says:
 | WITHOUT ROWID tables will work correctly ... for tables with a
>>single
 | INTEGER PRIMARY KEY. However, ordinary rowid tables will run
>>faster in
 | that case.
 and:
 | WITHOUT ROWID tables work best when individual rows are not too
>>large.

 So this is pretty much the documented worst case for WITHOUT ROWID 
 tables.

 If the query is executed by making a copy of all the table data
>into
 a temporary B-tree (index), which is what INTERSECT and probably
>>IN(...)
 are doing, then the WITHOUT-ROWID-ness of the table does not
>matter.


 Regards,
 Clemens
 ___
 sqlite-users mailing list
 sqlite-users@mailinglists.sqlite.org

>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
>>___
>>sqlite-users mailing list
>>sqlite-users@mailinglists.sqlite.org
>>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>-- 
>Sent from my Android device with K-9 Mail. Please excuse my brevity.
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-11 Thread J. King
There's an extra word in the first paragraph of Section 4 of that document, by 
the way:

" The error logger callback has also proven useful in catching errors 
occasional errors that the application misses..."

On September 11, 2017 11:22:50 AM EDT, Dan Kennedy  
wrote:
>On 09/10/2017 08:30 PM, R Smith wrote:
>> Well yes but the documentation suggests that one could expect a
>slight 
>> degradation. The words "works best with" does not seem to imbue an 
>> idea of "give WITHOUT ROWID tables a wide berth when your tables are 
>> more than few columns wide", and I don't think the Devs intended that
>
>> either.
>>
>> I can still roll with the idea that the WITHOUT ROWID tables with 151
>
>> columns and lots of data actually performed FASTER than the same
>ROWID 
>> tables for the other queries but slower with the JOIN... but that
>much 
>> slower?
>
>
>
>The difference seems to be that, currently, SQLite never creates 
>automatic indexes on WITHOUT ROWID tables.
>
>   http://sqlite.org/optoverview.html#autoindex
>
>I don't think there is a good reason that it cannot do so. It just 
>doesn't yet.
>
>Dan.
>
>
>
>
>
>>
>> I mean we are not talking a "little" slower, we are talking 50ms vs. 
>> ~70,000ms, that's a difference of a whopping 3 orders of magnitude
>and 
>> change. And we are not talking huge tables, another query that simply
>
>> walks the tables can do so several million times in the same time the
>
>> JOIN query does.
>>
>> Put another way, I can create 100 new tables and populate them each 
>> with the rows from one test table, then delete the rows from each new
>
>> table that doesn't satisfy an EXISTS() check in the other test table 
>> (essentially mimicking the JOIN query) and then output each full new 
>> table, 100 of them in turn, and then DROP them all. SQlite can do ALL
>
>> of that in a fraction of the time that the normal JOIN query (between
>
>> those same two test tables) takes to complete.
>>
>> We are talking a formula 1 car suddenly going at max speed of 1 
>> mile-per-weekend, barely keeping up with a semi-athletic snail, and 
>> people suggest checking the fuel octane rating. I'm saying there is 
>> something wrong under the hood.
>>
>> Cheers,
>> Ryan
>>
>>
>> On 2017/09/10 11:28 AM, Clemens Ladisch wrote:
>>> R Smith wrote:
 I am using 151 columns for both tests. The only thing that changes
 between the two scripts are the words "WITHOUT ROWID" being added
>>>  says:
>>> | WITHOUT ROWID tables will work correctly ... for tables with a
>single
>>> | INTEGER PRIMARY KEY. However, ordinary rowid tables will run
>faster in
>>> | that case.
>>> and:
>>> | WITHOUT ROWID tables work best when individual rows are not too
>large.
>>>
>>> So this is pretty much the documented worst case for WITHOUT ROWID 
>>> tables.
>>>
>>> If the query is executed by making a copy of all the table data into
>>> a temporary B-tree (index), which is what INTERSECT and probably
>IN(...)
>>> are doing, then the WITHOUT-ROWID-ness of the table does not matter.
>>>
>>>
>>> Regards,
>>> Clemens
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-11 Thread Dan Kennedy

On 09/10/2017 08:30 PM, R Smith wrote:
Well yes but the documentation suggests that one could expect a slight 
degradation. The words "works best with" does not seem to imbue an 
idea of "give WITHOUT ROWID tables a wide berth when your tables are 
more than few columns wide", and I don't think the Devs intended that 
either.


I can still roll with the idea that the WITHOUT ROWID tables with 151 
columns and lots of data actually performed FASTER than the same ROWID 
tables for the other queries but slower with the JOIN... but that much 
slower?




The difference seems to be that, currently, SQLite never creates 
automatic indexes on WITHOUT ROWID tables.


  http://sqlite.org/optoverview.html#autoindex

I don't think there is a good reason that it cannot do so. It just 
doesn't yet.


Dan.







I mean we are not talking a "little" slower, we are talking 50ms vs. 
~70,000ms, that's a difference of a whopping 3 orders of magnitude and 
change. And we are not talking huge tables, another query that simply 
walks the tables can do so several million times in the same time the 
JOIN query does.


Put another way, I can create 100 new tables and populate them each 
with the rows from one test table, then delete the rows from each new 
table that doesn't satisfy an EXISTS() check in the other test table 
(essentially mimicking the JOIN query) and then output each full new 
table, 100 of them in turn, and then DROP them all. SQlite can do ALL 
of that in a fraction of the time that the normal JOIN query (between 
those same two test tables) takes to complete.


We are talking a formula 1 car suddenly going at max speed of 1 
mile-per-weekend, barely keeping up with a semi-athletic snail, and 
people suggest checking the fuel octane rating. I'm saying there is 
something wrong under the hood.


Cheers,
Ryan


On 2017/09/10 11:28 AM, Clemens Ladisch wrote:

R Smith wrote:

I am using 151 columns for both tests. The only thing that changes
between the two scripts are the words "WITHOUT ROWID" being added

 says:
| WITHOUT ROWID tables will work correctly ... for tables with a single
| INTEGER PRIMARY KEY. However, ordinary rowid tables will run faster in
| that case.
and:
| WITHOUT ROWID tables work best when individual rows are not too large.

So this is pretty much the documented worst case for WITHOUT ROWID 
tables.


If the query is executed by making a copy of all the table data into
a temporary B-tree (index), which is what INTERSECT and probably IN(...)
are doing, then the WITHOUT-ROWID-ness of the table does not matter.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-10 Thread R Smith
Well yes but the documentation suggests that one could expect a slight 
degradation. The words "works best with" does not seem to imbue an idea 
of "give WITHOUT ROWID tables a wide berth when your tables are more 
than few columns wide", and I don't think the Devs intended that either.


I can still roll with the idea that the WITHOUT ROWID tables with 151 
columns and lots of data actually performed FASTER than the same ROWID 
tables for the other queries but slower with the JOIN... but that much 
slower?


I mean we are not talking a "little" slower, we are talking 50ms vs. 
~70,000ms, that's a difference of a whopping 3 orders of magnitude and 
change. And we are not talking huge tables, another query that simply 
walks the tables can do so several million times in the same time the 
JOIN query does.


Put another way, I can create 100 new tables and populate them each with 
the rows from one test table, then delete the rows from each new table 
that doesn't satisfy an EXISTS() check in the other test table 
(essentially mimicking the JOIN query) and then output each full new 
table, 100 of them in turn, and then DROP them all. SQlite can do ALL of 
that in a fraction of the time that the normal JOIN query (between those 
same two test tables) takes to complete.


We are talking a formula 1 car suddenly going at max speed of 1 
mile-per-weekend, barely keeping up with a semi-athletic snail, and 
people suggest checking the fuel octane rating. I'm saying there is 
something wrong under the hood.


Cheers,
Ryan


On 2017/09/10 11:28 AM, Clemens Ladisch wrote:

R Smith wrote:

I am using 151 columns for both tests. The only thing that changes
between the two scripts are the words "WITHOUT ROWID" being added

 says:
| WITHOUT ROWID tables will work correctly ... for tables with a single
| INTEGER PRIMARY KEY. However, ordinary rowid tables will run faster in
| that case.
and:
| WITHOUT ROWID tables work best when individual rows are not too large.

So this is pretty much the documented worst case for WITHOUT ROWID tables.

If the query is executed by making a copy of all the table data into
a temporary B-tree (index), which is what INTERSECT and probably IN(...)
are doing, then the WITHOUT-ROWID-ness of the table does not matter.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-10 Thread Clemens Ladisch
R Smith wrote:
> I am using 151 columns for both tests. The only thing that changes
> between the two scripts are the words "WITHOUT ROWID" being added

 says:
| WITHOUT ROWID tables will work correctly ... for tables with a single
| INTEGER PRIMARY KEY. However, ordinary rowid tables will run faster in
| that case.
and:
| WITHOUT ROWID tables work best when individual rows are not too large.

So this is pretty much the documented worst case for WITHOUT ROWID tables.

If the query is executed by making a copy of all the table data into
a temporary B-tree (index), which is what INTERSECT and probably IN(...)
are doing, then the WITHOUT-ROWID-ness of the table does not matter.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-09 Thread Nico Williams
On Sat, Sep 09, 2017 at 11:26:35PM +0200, R Smith wrote:
> I think you are missing something or my explanation was not clear.
> When I say "first test" I mean of THIS test suite, not the previous set from
> 3 days ago.

I meant the opposite.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-09 Thread R Smith



On 2017/09/09 9:20 PM, Nico Williams wrote:

On Sat, Sep 09, 2017 at 05:56:03PM +0200, R Smith wrote:

*Findings* pertaining to JOIN, INTERSECT and WHERE IN (...) type queries on
tables *WITHOUT Row_ids*:
(This is the full test posted below because it is the one that matters most)
INTERSECT AND WHERE IN (...) queries posted similar times here with WHERE IN
(...) being the slightly faster (similar to the above findings), but BOTH
were a good 10 - 15% faster than on the ROW-ID tables, so WITHOUT ROWID
tables seem to have a definite advantage here (it is currently unclear to me
why this is).

A troubling test is the JOIN on WITHOUT ROWID tables - it took several
orders of magnitude longer than any other test in the entire experiment.

In your first test you were ordering by PK, now you're not, and you
don't have an [covering] index on the columns you're ordering by, so,
yeah, "orders of magnitude" slower is to be expected.  You're comparing
apples and oranges. /// etc 


I think you are missing something or my explanation was not clear.
When I say "first test" I mean of THIS test suite, not the previous set 
from 3 days ago.


I am in no way ordering by PK in this set of tests, anywhere. I am not 
using 1 column, I am using 151 columns for both tests. The only thing 
that changes between the two scripts are the words "WITHOUT ROWID" being 
added or removed, NOTHING else. The problem also happens on the 
un-ordered tests. The ordering is of no consequence.


To be clear, there are two tests, both do not order by PK, both do not 
have covering indexes, both have 151 columns and lots of data... the 
only difference between them is the words "WITHOUT ROWID" at the end of 
the test tables. One query runs in ~50 milliseconds, the other runs for 
minutes. The results are exactly the same.


You can test this by taking the script I provided, run it, it will run 
for longer than 2 minutes (depending on your machine speed) because of 
the JOIN queries taking over a minute each, then remove the words 
"WITHOUT ROWID" from the two test tables and run it again. The entire 
script will finish in under  3 seconds with the JOIN queries weighing in 
at circa 50ms each.


That cannot possibly be an expected circumstance and it isn't an 
apples-oranges thing.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-09 Thread Nico Williams
On Sat, Sep 09, 2017 at 05:56:03PM +0200, R Smith wrote:
> *Findings* pertaining to JOIN, INTERSECT and WHERE IN (...) type queries on
> tables *WITHOUT Row_ids*:
> (This is the full test posted below because it is the one that matters most)
> INTERSECT AND WHERE IN (...) queries posted similar times here with WHERE IN
> (...) being the slightly faster (similar to the above findings), but BOTH
> were a good 10 - 15% faster than on the ROW-ID tables, so WITHOUT ROWID
> tables seem to have a definite advantage here (it is currently unclear to me
> why this is).
> 
> A troubling test is the JOIN on WITHOUT ROWID tables - it took several
> orders of magnitude longer than any other test in the entire experiment.

In your first test you were ordering by PK, now you're not, and you
don't have an [covering] index on the columns you're ordering by, so,
yeah, "orders of magnitude" slower is to be expected.  You're comparing
apples and oranges.

Also, in your first test you have one column.  Now you have lots.
That's no way to isolate the performance numbers you're trying to get.

The only way to have an ORDER BY speed up a query is when there's an
index on the columns in question that can be scanned to produce the
order you're looking for without additional sorting, or when the engine
can create such an index as a temporary index (this, of course, has a
cost, so it won't always work well)

Nico
-- 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-09 Thread R Smith
Full tests completed with findings ranging from less interesting to 
exposing a rather significant inefficiency in SQLite.


I won't post all the tests because that would take far too much space, 
in stead I will simply discuss the experiment and findings and post the 
test script so that anyone can verify the findings.



*Problem*:
Earlier tests suggested that, in a query where the common rows between 
two tables need to be found, JOIN is faster than INTERSECT when there 
are very few matches with efficiencies evening out as the number of 
matches increase. It suggested, more pertinently, that an ORDER BY 
clause when added to both types of queries improved all-round 
performance by a significant margin (moreso in JOIN's case).
After examination of the tests, comments suggested that a lot of the 
efficiency differences may be due to:

- row_id involvement,
- checks only done on Primary key values as opposed to common row 
values, and
- test tables having one or two columns only not matching any real-world 
scenario which made the findings irrelevant.
It was also pointed out that apart from JOIN and INTERSECT, a WHERE IN 
(SELECT FROM ...) clause could achieve the same query result set and 
should form part of the tests.



*Purpose*:
This experiment and tests aimed to establish the differences in 
efficiency for queries with the same results that could be posed in 
different ways to the Query Planner in specifically SQLite.
The possible ways to ask the same question that was tested included 
similar queries (producing the same result set) when finding the common 
set of rows between two tables using JOIN, INTERSECT and WHERE IN (...), 
and secondly similar queries finding the exclusive set of rows between 
two tables using OUTER JOIN, EXCEPT and WHERE NOT IN (...).
[SQLite doesn't implement a full outer join so that specific test was 
restricted to comparing EXCEPT vs. WHERE NOT IN (...)]



*Method*:
A simple SQL script that first establishes a table "p" with parameters 
that dictate the data shape added to the tests. Next a random value 
table "rnd" is set up with the number of rows and randomness scope 
dictated by table "p" and then two test tables "t1" and "t2" with each 
151 columns (1 Key and 30 each of 5 data types - INT, REAL, NUMERIC 
(Dates), BLOB and TEXT) which gets populated with randomly selected rows 
from the "rnd" table so that many rows appear only in either table, and 
some rows appear in both (coincidence).


The total number of rows to  choose from is dictated by the "p" table's 
"rndCount" field and the approximate percentage of rows that appear in 
both tables is dictated by the "rndAddFreq" (random row adding 
frequency) field.
Up to here the tests are all similar, except that for the EXCEPT / JOIN 
WHERE NOT IN (...) a much higher frequency of coincidence between the 
table rows were selected since the output count is opposite to the other 
tests.
The queries designed to produce the exact same output using the 
different methods discussed above are run in turn. The queries are then 
repeated with an added ORDER BY clause.

Timings are measured for all.


*Test parameters* to ensure all potential problems mentioned above were 
accounted for:
- Pre-setup of the random table ensures all the queries operate on the 
exact same randomized data.
- Tables contain 150 data columns of all typical use types and with 
varying data and data lengths (in the case of BLOB and TEXT at least)

- Tables have a Primary Key, but it does not alias the Row_ID
- Tests were repeated for both normal and WITHOUT ROWID tables
- Query field comparisons involved a full set of field types (except 
BLOB columns) and not the primary key
- Test-bed using older/slower notebook and sqlitespeed on Windows 10 
64bit prof with updated latest sqlite3 DLL file - version 3.20.1 Release 
2017-08-24.



*General Notes*:
- Queries were tested in different orders to avoid caching effects, but 
this made no difference - it seems there is no mystery caching when the 
queries are so different.
- The EXCEPT vs. WHERE NOT IN (...) queries yielded consistently similar 
times through a range of parameters, with or without ORDER BY clauses, 
with or without ROWIDs, so their tests provided no real insight, is 
probably both well-optimized already, and I won't refer to them again in 
this document.



*Findings* pertaining to JOIN, INTERSECT and WHERE IN (...) type queries 
on tables *with Row_ids*:
First important thing to note is that when not involving the primary key 
and using these larger (wider) tables, the ORDER BY clause addition made 
very little difference (sometimes completely zero difference). JOIN and 
INTERSECT posted similar times for these larger (wider) tables with 
INTERSECT consistently winning by a very slight margin.
The surprise here was that WHERE IN (...) queries were consistently 
faster than both JOIN and INTERSECT (again with the added ORDER BY being 
inconsequential). Not by a large margin though, 4-5% 

Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-07 Thread Keith Medcalf

>Other "am I right in thinking this" question: INTERSECT is only going
>to be viable when comparing full records, correct? If you're looking
>to filter table A by whether its primary key is also a primary key
>for table B, but ignoring the other fields in both, then INTERSECT
>becomes not an option, or at least starts making the query more
>complex/ugly... correct?

No.  The data must merely be the same shape (order, number of columns):

SELECT c1, c2, c3, c8 from t1
INTERSECT
SELECT x5, g4, q7, b3 from t2;

each of t1 and t2 may have fields named a1 a2 a4 ... a26 b1 b2 b3 ... b26 ... 
z1 z2 z3 ... z26 (for a total of 676 fields per table).

The comparison is of resulting matrices, not underlying tables.  Of course, if 
you do "SELECT * from table1;" the * is merely "syntactic sugar" (a short way 
of spelling) the complete, in declaration order, list of explicit columns.  
(So, if and only if the ROWID is an "explicit column" is it used in the 
intersection, otherwise it is not -- and concomitantly if you explicitly list 
the columns to intersect, then the rowid participates if and only if you have 
included it in the list of data to intersect.)

INTERSECT / EXCEPT / UNION are matrix operations.  The RHS (select statement 
before the operator) and LHS (select statement after the operator) provide the 
two matrices on which the operation is performed.  Matrixes do not have column 
names, merely ordinal positions (column 1, column 2, column 3 and so forth).  
Similarly these operations do not care about column names, merely that the 
order (number of columns in each matrix) is the same.  Comparisons are done by 
ordinal position of the item in the row.  The output is a matrix.  It only 
appears "row at a time" because of the primitive method of operation of (no 
matter how advanced) computers which can only perform operations as a serial 
sequence of steps.  For "user convenience" the output column names are set to 
the RHS column names.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of David Raymond
>Sent: Thursday, 7 September, 2017 10:31
>To: SQLite mailing list
>Subject: Re: [sqlite] JOIN vs. INTERSECT speed and efficiency
>
>Although it may not translate as well to the more complex examples,
>would you also consider adding the IN operator to your tests? I found
>for example that "select v from t1 where v in t2;" did even better
>than the join or the intersect.
>
>
>
>
>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of R Smith
>Sent: Thursday, September 07, 2017 8:06 AM
>To: sqlite-users@mailinglists.sqlite.org
>Subject: Re: [sqlite] JOIN vs. INTERSECT speed and efficiency
>
>
>On 2017/09/07 3:43 AM, Keith Medcalf wrote:
>> Try the same test using 147 columns in each table.
>
>Exactly the plan for this weekend :)
>
>> 1 column is rather trivial.  Even a kindergarten kid could do it in
>no time using crayons and the wall.
>
>So? That is non-sequitur, I am sure given enough crayons, wall-space
>and
>time, a kindergarten kid can do it with 147 columns too. That says
>exactly nothing about the possible efficiencies of different methods.
>If
>however the 1-columness of the test gets somehow advantaged by being
>the
>PK (as Nico pointed out) or real world data such as TEXT entries sort
>slower than INTs, then it might affect it, so the 147 column tests
>will
>tell.
>
>
>> In other words except in very trivial cases (like having only one
>column that is not nullable) it will be very difficult to write a
>"correct" JOIN or correlated subquery that emulates an INTERSECT.
>
>Well I agree, but it is those trivial cases that are of interest
>here,
>and if there is a general JOIN optimization to be had. The INTERSECT
>test merely served as the catalyst to put us on the trail of the
>possible JOIN optimization, if there is even an optimization to be
>had
>(it might yet be a wild goose chase, which you seem to have your
>money
>on, so watch this space, I'll graciously accept your "told ya!" later
>after testing).
>
>
>Cheers,
>Ryan
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-07 Thread David Raymond
Expanding things for when you get bored, in addition to  JOIN vs 
INTERSECT vs IN I'd also be interested in  JOIN vs EXCEPT vs NOT IN, as 
I tend to do more exclusion rather than intersection.

The straight up "IN tablename" may be SQLite only, but it also supports IN 
(subquery) so "select v from t1 where v in (select v from t2)" also works, 
although the "IN tablename" is so much cleaner to read in my opinion.

create table statesToSkip (
  state text primary key collate nocase
) without rowid;

select foo from bar where state not in statesToSkip.

vs

select foo from bar where state not in (select state from statesToSkip)

vs

select bar.foo from bar left outer join statesToSkip
on (bar.state = statesToSkip.state)
where statesToSkip.state is null;


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of R Smith
Sent: Thursday, September 07, 2017 3:51 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

On 2017/09/07 6:31 PM, David Raymond wrote:
> Although it may not translate as well to the more complex examples, would you 
> also consider adding the IN operator to your tests? I found for example that 
> "select v from t1 where v in t2;" did even better than the join or the 
> intersect.

Will do. The only thing I have somewhat against that specific query form 
is that it doesn't work in other engines (seems to not be standard). But 
for optimization in SQLite specifically that's perfect.

> Other "am I right in thinking this" question: INTERSECT is only going to be 
> viable when comparing full records, correct? If you're looking to filter 
> table A by whether its primary key is also a primary key for table B, but 
> ignoring the other fields in both, then INTERSECT becomes not an option, or 
> at least starts making the query more complex/ugly... correct?

INTERSECT will happily match however many columns you desire (and 
specify), there is no need to match full records or single keys 
specifically.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-07 Thread Nico Williams
On Thu, Sep 07, 2017 at 09:51:07PM +0200, R Smith wrote:
> INTERSECT will happily match however many columns you desire (and specify),
> there is no need to match full records or single keys specifically.

But the two queries on either side of the set operator must have the
same number of columns (and in strongly-typed RDBMSes, the same types).

That's a huge constraint.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-07 Thread R Smith

On 2017/09/07 6:31 PM, David Raymond wrote:

Although it may not translate as well to the more complex examples, would you also 
consider adding the IN operator to your tests? I found for example that "select v 
from t1 where v in t2;" did even better than the join or the intersect.


Will do. The only thing I have somewhat against that specific query form 
is that it doesn't work in other engines (seems to not be standard). But 
for optimization in SQLite specifically that's perfect.



Other "am I right in thinking this" question: INTERSECT is only going to be 
viable when comparing full records, correct? If you're looking to filter table A by 
whether its primary key is also a primary key for table B, but ignoring the other fields 
in both, then INTERSECT becomes not an option, or at least starts making the query more 
complex/ugly... correct?


INTERSECT will happily match however many columns you desire (and 
specify), there is no need to match full records or single keys 
specifically.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-07 Thread David Raymond
Although it may not translate as well to the more complex examples, would you 
also consider adding the IN operator to your tests? I found for example that 
"select v from t1 where v in t2;" did even better than the join or the 
intersect.

Other "am I right in thinking this" question: INTERSECT is only going to be 
viable when comparing full records, correct? If you're looking to filter table 
A by whether its primary key is also a primary key for table B, but ignoring 
the other fields in both, then INTERSECT becomes not an option, or at least 
starts making the query more complex/ugly... correct?



-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of R Smith
Sent: Thursday, September 07, 2017 8:06 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] JOIN vs. INTERSECT speed and efficiency


On 2017/09/07 3:43 AM, Keith Medcalf wrote:
> Try the same test using 147 columns in each table.

Exactly the plan for this weekend :)

> 1 column is rather trivial.  Even a kindergarten kid could do it in no time 
> using crayons and the wall.

So? That is non-sequitur, I am sure given enough crayons, wall-space and 
time, a kindergarten kid can do it with 147 columns too. That says 
exactly nothing about the possible efficiencies of different methods. If 
however the 1-columness of the test gets somehow advantaged by being the 
PK (as Nico pointed out) or real world data such as TEXT entries sort 
slower than INTs, then it might affect it, so the 147 column tests will 
tell.


> In other words except in very trivial cases (like having only one column that 
> is not nullable) it will be very difficult to write a "correct" JOIN or 
> correlated subquery that emulates an INTERSECT.

Well I agree, but it is those trivial cases that are of interest here, 
and if there is a general JOIN optimization to be had. The INTERSECT 
test merely served as the catalyst to put us on the trail of the 
possible JOIN optimization, if there is even an optimization to be had 
(it might yet be a wild goose chase, which you seem to have your money 
on, so watch this space, I'll graciously accept your "told ya!" later 
after testing).


Cheers,
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-07 Thread Nico Williams
On Wed, Sep 06, 2017 at 07:43:07PM -0600, Keith Medcalf wrote:
> Try the same test using 147 columns in each table.
> 
> 1 column is rather trivial.  Even a kindergarten kid could do it in no
> time using crayons and the wall.
> 
> [...]
> 
> In other words except in very trivial cases (like having only one
> column that is not nullable) it will be very difficult to write a
> "correct" JOIN or correlated subquery that emulates an INTERSECT.

Yup.  But that doesn't mean that the engine couldn't internally build a
result-set from the query without some filtering JOIN, then implement
the same strategy as an INTERSECT.  You can't do this in SQL if the
filter table has different shape than the result set, but the engine
might be able to do it.

On the other hand, building a complete result set first is... not online
behavior.  If the result set size is enormous, then the INTERSECT
approach is going to make the user very unhappy!

I do think OP's tests point out a case where SQLite3 is pessimally
picking table scan over covering index scan...

...though scanning the index
means that there will be no rowid column in the result, which might
actually be a compatibility issue when using rowid tables, so maybe
SQLite3 is doing exactly the right thing?

I don't think that pessimization is too consequential as users can
improve the situation by adding ORDER BY clauses or using WITHOUT ROWID.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-07 Thread R Smith


On 2017/09/07 3:43 AM, Keith Medcalf wrote:

Try the same test using 147 columns in each table.


Exactly the plan for this weekend :)


1 column is rather trivial.  Even a kindergarten kid could do it in no time 
using crayons and the wall.


So? That is non-sequitur, I am sure given enough crayons, wall-space and 
time, a kindergarten kid can do it with 147 columns too. That says 
exactly nothing about the possible efficiencies of different methods. If 
however the 1-columness of the test gets somehow advantaged by being the 
PK (as Nico pointed out) or real world data such as TEXT entries sort 
slower than INTs, then it might affect it, so the 147 column tests will 
tell.




In other words except in very trivial cases (like having only one column that is not 
nullable) it will be very difficult to write a "correct" JOIN or correlated 
subquery that emulates an INTERSECT.


Well I agree, but it is those trivial cases that are of interest here, 
and if there is a general JOIN optimization to be had. The INTERSECT 
test merely served as the catalyst to put us on the trail of the 
possible JOIN optimization, if there is even an optimization to be had 
(it might yet be a wild goose chase, which you seem to have your money 
on, so watch this space, I'll graciously accept your "told ya!" later 
after testing).



Cheers,
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-06 Thread Keith Medcalf

Try the same test using 147 columns in each table.

1 column is rather trivial.  Even a kindergarten kid could do it in no time 
using crayons and the wall.

And of course the output of INTERSECT is ordered.  It uses a sorter to perform 
the intersection.  And of course the output is distinct, it uses a sorter to 
perform the intersection.

In other words,

select ... a bunch of columns ...
from table1
intersect
select ... an eual number of bunch of columns ...
from table2

is equivalent to

select ... the bunch of columns ...
  from table1
 where exists (select * from table2
where (for each column position in table 2 equals that column 
position from table1, plus of course all the added stuff needed to handle 
nullable columns))
group by ... the bunch of columns ...;

In other words except in very trivial cases (like having only one column that 
is not nullable) it will be very difficult to write a "correct" JOIN or 
correlated subquery that emulates an INTERSECT.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of R Smith
>Sent: Wednesday, 6 September, 2017 14:58
>To: sqlite-users@mailinglists.sqlite.org
>Subject: Re: [sqlite] JOIN vs. INTERSECT speed and efficiency
>
>On 2017/09/06 8:26 PM, Nico Williams wrote:
>> On Wed, Sep 06, 2017 at 08:05:57PM +0200, R Smith wrote:
>>> -- Another interesting thing to note: The INTERSECT test produces
>ORDERED
>>> -- output, which suggests that an ORDER-BY addition to the query
>would
>>> -- favour the INTERSECT method.
>> Nothing about INTERSECT requires it to produce ordered output.
>
>No, and it was not suggested, it was just noted that it does,
>suggesting
>that it could be the more performant choice when adding an ORDER BY
>clause, which turned out to not only be true in terms of being the
>better choice, but also that it itself sped up by simply adding the
>ORDER BY clause as was demonstrated in Test 6.
>
>> Nothing about the JOIN case makes it not possible to produce
>ordered
>> output by accident.
>
>Yet it doesn't seem to by accident, which would suggest that an ORDER
>BY
>clause when added to the JOIN statements would incur an additional
>time
>penalty for having to actually order the results - Yet, as again
>demonstrated in Test 6, the ORDER BY actually sped up the JOIN query
>too
>(perhaps via forcing the Index earlier or used in a different way) -
>which was most interesting, and, as you noted, there is nothing about
>the JOIN that precludes it from having ordered output, so this
>optimization might be worthwhile.
>
>> You'll want to re-measure with an ORDER BY added.
>
>I did. It was done in Test 6. It showed significantly interesting
>results. Was my explanation lacking in clarity or did it fall down
>the
>TLDR; rabbit hole? :)
>
>
>> In any case, this is quite interesting.  Many uses of JOIN are not
>> merely to filter results, but to construct joined result rows --
>such
>> uses of JOIN cannot be optimized by using INTERSECT.  But for
>> filter-uses of JOIN... this might be a useful optimization for the
>> engine to learn.
>
>I agree, and not only the INTERSECT optimization but the tests
>suggest
>adding a silent ORDER BY would also be an optimization, though not
>sure
>if the effort-to-pleasure ratio is low enough yet. Perhaps if re-
>doing
>the tests with tables using several more non-Integer columns to see
>if
>the optimization could be generalized across all kinds of data in
>some
>way. I might pursue this later when I have some time.
>
>
>Cheers,
>Ryan
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-06 Thread Nico Williams
On Wed, Sep 06, 2017 at 11:54:35PM +0200, R Smith wrote:
> It's still remarkable that in both tests 5 and 6 I've used the very same PK
> setup, yet Test 6 was significantly faster with the added ORDER BY clause.
> In tests 1 through 4 I did not use a PK at all, just plain INT data field,
> but then I did not test the ORDER BY in those tests.

It's an optimizer issue.  It didn't occur to the optimizer that scanning
a covering index was better than scanning the table because the covering
index has (in this case, and always, for rowid tables anyways) strictly
less contents to read and decode.

Scanning the covering index has the happy side-effect (if you wanted if) of
producing ordered results and making an equivalent ORDER BY free.

(Whereas scanning the table will produce results in rowid order, which is
almost certainly not useful unless you explicitly wanted an INTEGER
PRIMARY KEY.)

Note that INTERSECT could have used a hash table, thus producing
unordered results (most likely).  But SQLite3 only knows b-trees.

All of this explains the accidental ordering / non-ordering.  And also
why you shouldn't count on it: it's all implementaton details!

But just because you know to add an ORDER BY doesn't mean you shouldn't
think to make it match some suitable index...  The optimizer is nice,
but you still have to think a little bit like an optimizer yourself :(

> It might turn out to be a wild goose chase, but that will be easily evident
> when testing without the PK and with more realistic real-world data. I'll do
> that this weekend.

You can't have a PK-less table -- SQLite3 always want some PK, even if
it's a hidden rowid column.  WITHOUT ROWID tables make this clearer:

  sqlite> create table t(a text) without rowid;
  Error: PRIMARY KEY missing on table t

Adding an explicit PK implicitly added the covering index that sped up
the JOIN (once you forced the optimizer to use it).  But you should just
always have had an explicit PK and WITHOUT ROWID.

You still found something interesting about using JOINs to filter result
sets (as opposed to adding columns to the the result set).  Something to
keep in mind...  I do a lot of queries where some of the JOINed tables
are used only for filtering.  It's not always possible to convert such
queries to INTERSECT, but it might be possible for SQLite3 to learn how
to perform the equivalent optimization internally, and when to do it.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-06 Thread R Smith


On 2017/09/06 11:17 PM, Nico Williams wrote:


If you'll redo this I'd urge you to use WITHOUT ROWIDS.  First, that's
almost always the right thing to do anyways.  Second, it won't perform
worse but likely will perform better.  Third, write performance
definitely should improve with WITHOUT ROWIDS.  Fourth, I think users
are starting to use WITHOUT ROWIDS more, so testing that seems more
useful.

Nico


All good points which I will definitely heed. I will of course test both 
types of tables but ensure that when using row_id tables I do not 
inadvertently have possible PK short-circuiting of query plans.


It's still remarkable that in both tests 5 and 6 I've used the very same 
PK setup, yet Test 6 was significantly faster with the added ORDER BY 
clause. In tests 1 through 4 I did not use a PK at all, just plain INT 
data field, but then I did not test the ORDER BY in those tests.


It might turn out to be a wild goose chase, but that will be easily 
evident when testing without the PK and with more realistic real-world 
data. I'll do that this weekend.


Cheers,
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-06 Thread Nico Williams
On Wed, Sep 06, 2017 at 10:57:41PM +0200, R Smith wrote:
> On 2017/09/06 8:26 PM, Nico Williams wrote:
> >On Wed, Sep 06, 2017 at 08:05:57PM +0200, R Smith wrote:
> >>-- Another interesting thing to note: The INTERSECT test produces ORDERED
> >>-- output, which suggests that an ORDER-BY addition to the query would
> >>-- favour the INTERSECT method.
> >Nothing about INTERSECT requires it to produce ordered output.
> 
> No, and it was not suggested, it was just noted that it does, suggesting
> that it could be the more performant choice when adding an ORDER BY clause,
> which turned out to not only be true in terms of being the better choice,
> but also that it itself sped up by simply adding the ORDER BY clause as was
> demonstrated in Test 6.

I point this out only because users should know not to assume result set
order without an ORDER BY.

> >Nothing about the JOIN case makes it not possible to produce ordered
> >output by accident.
> 
> Yet it doesn't seem to by accident, which would suggest that an ORDER BY
> clause when added to the JOIN statements would incur an additional time

See my explanation below.

> penalty for having to actually order the results - Yet, as again
> demonstrated in Test 6, the ORDER BY actually sped up the JOIN query too
> (perhaps via forcing the Index earlier or used in a different way) - which
> was most interesting, and, as you noted, there is nothing about the JOIN
> that precludes it from having ordered output, so this optimization might be
> worthwhile.
> 
> >You'll want to re-measure with an ORDER BY added.
> 
> I did. It was done in Test 6. It showed significantly interesting results.
> Was my explanation lacking in clarity or did it fall down the TLDR; rabbit
> hole? :)

I saw it.

I think the ORDER BY helped the JOIN because it caused SQLite3 to scan a
covering index (the primary) key instead of scanning the table.  That
without it SQLite3 didn't use that index is rather inefficient, though
it may not be a win in real-world use-cases to fix that.

Of course, IF you had used WITHOUT ROWIDs you would have found (I'm
sure) that the JOIN also produced ordered results by default and was as
fast as in your 6th test.

In fact, INTERSECT does an implicit ordering step by building a b-tree
that the JOIN with the index scan optimization does not have to build at
all, so JOIN has a leg up on INTERSECT in that sense.

> I agree, and not only the INTERSECT optimization but the tests suggest
> adding a silent ORDER BY would also be an optimization, though not sure if

I think here the ORDER BY merely forced SQLite3 to pick the more
efficient query plan, and that it's probably a (rather minor) optimizer
bug that it didn't do so to begin with without the ORDER BY.

> the effort-to-pleasure ratio is low enough yet. Perhaps if re-doing the
> tests with tables using several more non-Integer columns to see if the
> optimization could be generalized across all kinds of data in some way. I
> might pursue this later when I have some time.

If you'll redo this I'd urge you to use WITHOUT ROWIDS.  First, that's
almost always the right thing to do anyways.  Second, it won't perform
worse but likely will perform better.  Third, write performance
definitely should improve with WITHOUT ROWIDS.  Fourth, I think users
are starting to use WITHOUT ROWIDS more, so testing that seems more
useful.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-06 Thread R Smith

On 2017/09/06 8:26 PM, Nico Williams wrote:

On Wed, Sep 06, 2017 at 08:05:57PM +0200, R Smith wrote:

-- Another interesting thing to note: The INTERSECT test produces ORDERED
-- output, which suggests that an ORDER-BY addition to the query would
-- favour the INTERSECT method.

Nothing about INTERSECT requires it to produce ordered output.


No, and it was not suggested, it was just noted that it does, suggesting 
that it could be the more performant choice when adding an ORDER BY 
clause, which turned out to not only be true in terms of being the 
better choice, but also that it itself sped up by simply adding the 
ORDER BY clause as was demonstrated in Test 6.



Nothing about the JOIN case makes it not possible to produce ordered
output by accident.


Yet it doesn't seem to by accident, which would suggest that an ORDER BY 
clause when added to the JOIN statements would incur an additional time 
penalty for having to actually order the results - Yet, as again 
demonstrated in Test 6, the ORDER BY actually sped up the JOIN query too 
(perhaps via forcing the Index earlier or used in a different way) - 
which was most interesting, and, as you noted, there is nothing about 
the JOIN that precludes it from having ordered output, so this 
optimization might be worthwhile.



You'll want to re-measure with an ORDER BY added.


I did. It was done in Test 6. It showed significantly interesting 
results. Was my explanation lacking in clarity or did it fall down the 
TLDR; rabbit hole? :)




In any case, this is quite interesting.  Many uses of JOIN are not
merely to filter results, but to construct joined result rows -- such
uses of JOIN cannot be optimized by using INTERSECT.  But for
filter-uses of JOIN... this might be a useful optimization for the
engine to learn.


I agree, and not only the INTERSECT optimization but the tests suggest 
adding a silent ORDER BY would also be an optimization, though not sure 
if the effort-to-pleasure ratio is low enough yet. Perhaps if re-doing 
the tests with tables using several more non-Integer columns to see if 
the optimization could be generalized across all kinds of data in some 
way. I might pursue this later when I have some time.



Cheers,
Ryan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-06 Thread Nico Williams
On Wed, Sep 06, 2017 at 08:05:57PM +0200, R Smith wrote:
> -- Another interesting thing to note: The INTERSECT test produces ORDERED
> -- output, which suggests that an ORDER-BY addition to the query would
> -- favour the INTERSECT method.

Nothing about INTERSECT requires it to produce ordered output.

Nothing about the JOIN case makes it not possible to produce ordered
output by accident.

You'll want to re-measure with an ORDER BY added.

In any case, this is quite interesting.  Many uses of JOIN are not
merely to filter results, but to construct joined result rows -- such
uses of JOIN cannot be optimized by using INTERSECT.  But for
filter-uses of JOIN... this might be a useful optimization for the
engine to learn.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN vs. INTERSECT speed and efficiency

2017-09-06 Thread R Smith

Hi all,

For those interested, after a recent thread from a poster called Joe 
asking about the most efficient way to find values that coincide from 
two separate tables, a response from Clemens Ladisch and a further 
elaboration from myself suggested the following:


SELECT v FROM t1 INTERSECT SELECT v FROM t2

I was going to mention that that INTERSECT is simply equivalent to a 
JOIN on v (if there are no duplicate entries) of the form:


SELECT v FROM t1 JOIN t2 ON t2.v = t1.v

But then the question arose: Which is more efficient? - So at the time 
of posting I refrained from mentioning it, but it kept in my mind, until 
I decided to try and answer the question.


Next step was to set up an experiment to mimic the above and test it.


*Experiment* - If the reader is interested, or would like to compare and 
contrast with possibly different DB engines, versions, platforms or 
settings, read on, else just skip ahead to the Conclusion bit.


Setup:
- Using SQLite.
- Ran on my slowest (to magnify efficiency difference) notebook (Win10 
64bit)
- using sqlitespeed (http://www.sqlc.rifin.co.za/) but you can use any 
system since the objective measurement here is qualitative difference 
ratio, not quantitative direct units.
- with the standard pre-compiled 32-bit SQLite DLL available from the 
SQLite download page (https://sqlite.org/download.html)

- test DLL version: 3.17.0


Tests:

I will post the first test script in full with added annotation to 
explain the rationale and show the full working, but the subsequent 
scripts I will only post the money-shots and avoid the repetitive parts.
Note that the test script is run in a transaction which is not shown 
explicitly and at the end rolled back to avoid differences in tests on 
account of the file content changing during the tests.


Test 1:

  -- SQLite version 3.17.0  [ Release: 2017-02-13 ]  on SQLitespeed 
version 2.0.2.4.
  -- 



-- This first table holds Parameters for testing.
-- idxCount specifies how many entries to make in the test tables. (i.e. 
Number of added rows).
-- rndDiv is the modality chosen for the random values to control the 
magnitude of random values
-- so that the ratio of expected matching values to value-count can be 
controlled.


CREATE TABLE p (
  idxCount INT,
  rndDiv INT
);

-- Here we add the Test parameter values. This is the only section that 
changes between

-- the first 4 tests.
-- This test has a row-count of 10 (One-Hundred-Thousand) and a 
modality of 1000 times the
-- idxCount so that in all the rows, we expect roughly only about 0.1% 
of matches.
-- Note that if every value in table 1 has a 1-in-1000 chance of having 
a matching valuein
-- table 2, then for 100,000 rows there should be ~100 matches which is 
~0.1% of the row-count.


-- This test was picked as the full example since it produces the least 
amount of rows.


INSERT INTO p(idxCount, rndDiv) VALUES (10, 1000 * 10);

-- Confirm the parameters.
SELECT * FROM p;

  --   idxCount   |    rndDiv
  --  | 
  --    10    |   1


-- Following are the two test tables t1 and t2 each containing one 
column v to hold the random values.

CREATE TABLE t1 (
  v INT
);

CREATE TABLE t2 (
  v INT
);

-- Here we fill the two test tables according to our parameters above.
WITH R(idx,Val) AS (
    SELECT 0, 1000
    UNION ALL
    SELECT idx+1, abs(random() % p.rndDiv) FROM R,p WHERE idx < p.idxCount
)
INSERT INTO t1(v) SELECT Val FROM R
;

WITH R(idx,Val) AS (
    SELECT 0, 1000
    UNION ALL
    SELECT idx+1, abs(random() % p.rndDiv) FROM R,p WHERE idx < p.idxCount
)
INSERT INTO t2(v) SELECT Val FROM R
;

-- Test 1 - using JOIN.
-- Note that this is the second iteration of tests. In the first 
iteration I had the INTERSECT
-- query before the JOIN query, but I swapped it around for the second 
iteration because
-- the JOIN was generally faster for low match counts so I wanted to 
ensure neither test
-- gained an advantage from being second due to some or other caching by 
the Query planner.


SELECT t1.v FROM t1 JOIN t2 ON t2.v = t1.v;

  -- v
  -- 
  -- 1000
  --  2312703
  -- 87720925
  -- 29736409
  -- 62527166
  -- 25171143
  -- 24168552
  -- 86449735
  -- 83544235
  -- 45671286
  -- 69343788
  -- 42394827
  -- 92142603
  -- 87106564
  --  4593574
  --  6914348
  -- 16358938
  -- 12568863
  -- 20105830
  -- 91354724
  -- 87992157
  -- 17605134
  -- 28584588
  -- 78633251
  -- 98955905
  -- 19979768
  -- 20956231
  -- 30819730
  -- 93942875
  -- 45346494
  -- 96346064
  -- 32224203
  -- 89622511
  -- 39267531
  --  3116133
  -- 31172079
  -- 87828771
  -- 82931503
  -- 89108957
  -- 80067973
  -- 89366000
  -- 68319117
  -- 37802556
  -- 64391927
  -- 84515054
  -- 11071461
  -- 40682706
  -- 78441313
  -- 17977211
  --   659811
  -- 14504321
  -- 57479870
  -- 44134958
  -- 94642155
  -- 37520503
  -- 

Re: [sqlite] Join, Union, Subquery or what?

2017-09-06 Thread Joe

Am 06.09.2017 um 14:32 schrieb R Smith:
These suggestions from Clemens will work exactly as you want, but I 
need to add that it assumes the records all perfectly match between 
the tables, even flags, ID column etc.


This means, if it doesn't work as you expect, you can still use the 
exact same methods but you can of course include the relevant columns 
(that should or shouldn't match between tables) in stead of the * 
wildcard.


As an example based on Clemens' first example, the query:

SELECT DDatum FROM Katalog EXCEPT SELECT DDatum FROM ZKatalog;

will produce a list of all dates in Katalog that doesn't have a 
corresponding date in ZKatalog.


Cheers,
Ryan


On 2017/09/06 1:09 PM, Clemens Ladisch wrote:

Joe wrote:
my SQLite database has two tables Katalog and ZKatalog with the same 
structure. One of the columns  is called DDatum. What's the most 
efficient way to


(1) Select records, which are only in Katalog, but not in ZKatalog?

SELECT * FROM Katalog EXCEPT SELECT * FROM ZKatalog;


(2) Select records, which are in Katalog and in ZKatalog?

SELECT * FROM Katalog INTERSECT SELECT * FROM ZKatalog;
SELECT * FROM Katalog UNION SELECT * FROM ZKatalog;
(depending on the exact meaning of the word "and" in your question)

(3) Select records, which are with same column values except DDatum 
in Katalog and in ZKatalog?

SELECT *
FROM Katalog
WHERE the_primary_key_column IN (
   SELECT the_primary_key_column FROM (
 SELECT all,columns,except,ddatum FROM Katalog
 INTERSECT
 SELECT all,columns,except,ddatum FROM ZKatalog
   )
);

(If you do not need the DDatum values, use only the inner subquery.)


(4) Select records from Katalog and ZKatalog with same DDatum content?

SELECT * FROM Katalog WHERE DDatum IN (SELECT DDatum FROM ZKatalog);


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

.
Clemens and Ryan, this helps a lot.
Thank you --  Joe
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join, Union, Subquery or what?

2017-09-06 Thread Clemens Ladisch
Joe wrote:
> my SQLite database has two tables Katalog and ZKatalog with the same 
> structure. One of the columns  is called DDatum. What's the most efficient 
> way to
>
> (1) Select records, which are only in Katalog, but not in ZKatalog?

SELECT * FROM Katalog EXCEPT SELECT * FROM ZKatalog;

> (2) Select records, which are in Katalog and in ZKatalog?

SELECT * FROM Katalog INTERSECT SELECT * FROM ZKatalog;
SELECT * FROM Katalog UNION SELECT * FROM ZKatalog;
(depending on the exact meaning of the word "and" in your question)

> (3) Select records, which are with same column values except DDatum in 
> Katalog and in ZKatalog?

SELECT *
FROM Katalog
WHERE the_primary_key_column IN (
  SELECT the_primary_key_column FROM (
SELECT all,columns,except,ddatum FROM Katalog
INTERSECT
SELECT all,columns,except,ddatum FROM ZKatalog
  )
);

(If you do not need the DDatum values, use only the inner subquery.)

> (4) Select records from Katalog and ZKatalog with same DDatum content?

SELECT * FROM Katalog WHERE DDatum IN (SELECT DDatum FROM ZKatalog);


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join, Union, Subquery or what?

2017-09-06 Thread Joe

Hi, all,

my SQLite database has two tables Katalog and ZKatalog with the same 
structure. One of the columns  is called DDatum. What's the most 
efficient way to


(1) Select records, which are only in Katalog, but not in ZKatalog?
(2) Select records, which are in Katalog and in ZKatalog?
(3) Select records, which are with same column values except DDatum in 
Katalog and in ZKatalog?

(4) Select records from Katalog and ZKatalog with same DDatum content?

Thanx --  Joe
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join of two virtual tables returns incorrect result set in 3.8.4.1

2014-04-03 Thread Hinrichsen, John
That was a fast turn-around.  Thank you for addressing this issue so
quickly!

-- 

This message contains confidential information and is intended only for the 
individual named. If you are not the named addressee, you should not 
disseminate, distribute, alter or copy this e-mail. Please notify the 
sender immediately by e-mail if you have received this e-mail by mistake 
and delete this e-mail from your system. E-mail transmissions cannot be 
guaranteed to be secure or without error as information could be 
intercepted, corrupted, lost, destroyed, or arrive late or incomplete. The 
sender, therefore, does not accept liability for any errors or omissions in 
the contents of this message which arise during or as a result of e-mail 
transmission. If verification is required, please request a hard-copy 
version. This message is provided for information purposes and should not 
be construed as a solicitation or offer to buy or sell any securities or 
related financial instruments in any jurisdiction.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join of two virtual tables returns incorrect result set in 3.8.4.1

2014-04-03 Thread Richard Hipp
Trouble ticket for this problem here:
http://www.sqlite.org/src/info/98825a79ce145686392d8074032ae54863aa21a3


On Thu, Apr 3, 2014 at 9:24 AM, Hinrichsen, John wrote:

> I am posting my C repro for the virtual table join issue inline, below:
>
> /*
>  * This repro is intended to demonstrate a possible bug when joining
>  * two virtual table instances together: only a subset of the expected
>  * rows is returned by sqlite.  As far as we can tell, it only happens
>  * when the virtual tables are themselves backed by nontrivial sqlite
>  * queries.
>  *
>  * This problem occurs with sqlite-3.8.4.1.
>  * It does not occur with sqlite-3.8.3.1 or earlier versions.
>  *
>  * The output of the program first shows the result of a join using
>  * two virtual table instances; the second result shows the output
>  * of the equivalent join on two non-virtual tables created from
>  * 'SELECT * FROM' the two respective virtual table instances.
>  */
>
> #include 
> #include 
> #include 
> #include 
>
> static sqlite3 * s_db = NULL;
>
> static char * remove_optional_quotes(const char * const input)
> {
> char * ret;
> const int len = strlen(input);
>
> if (input[0] == input[len-1] && (input[0] == '"' || input[0] == '\''))
> {
> ret = sqlite3_malloc(len - 1);
> strncpy(ret, input+1, len-2);
> ret[len-2] = '\0';
> }
> else
> {
> ret = sqlite3_malloc(len + 1);
> strncpy(ret, input, len);
> ret[len] = '\0';
> }
> return ret;
> }
>
> struct test_vtab
> {
> sqlite3_vtab base;
> char * query;
> };
>
> static int repro_create_connect(sqlite3 * db, void * pAux, int argc, const
> char * const argv[], sqlite3_vtab ** ppVTab, char ** pzErr)
> {
> int err;
> char * unquoted_schema;
>
> * ppVTab = sqlite3_malloc(sizeof(struct test_vtab));
> if (!* ppVTab)
> return SQLITE_NOMEM;
> memset(* ppVTab, 0, sizeof(** ppVTab));
>
> struct test_vtab * vtab = (struct test_vtab *) * ppVTab;
> vtab->query = remove_optional_quotes(argv[4]);
>
> unquoted_schema = remove_optional_quotes(argv[3]);
> err = sqlite3_declare_vtab(db, unquoted_schema);
> sqlite3_free(unquoted_schema);
>
> if (err != SQLITE_OK) {
> sqlite3_free(vtab->query);
> sqlite3_free(vtab);
> * ppVTab = NULL;
> return err;
> }
>
> return SQLITE_OK;
> }
>
> static int bitcount(unsigned int n)
> {
> int count = 0;
> while (n)
> {
> ++count;
> n &= (n - 1);
> }
> return count;
> }
>
> static int test_bit(const int value, const int bit)
> {
> const int mask = 1 << bit;
>
> return ((value & mask) == mask);
> }
>
> static int repro_best_index(sqlite3_vtab * pVTab, sqlite3_index_info *
> index_info)
> {
> const int column_count = 2;
>
> int index_number = 0, constraint_index, column_index;
>
> int constraint_by_column[2];
>
> for (constraint_index = 0; constraint_index < index_info->nConstraint;
> ++constraint_index)
> {
> const struct sqlite3_index_constraint * constraint =
> _info->aConstraint[constraint_index];
>
> const int is_usable = (constraint->usable != 0);
>
> if (is_usable)
> {
> const int is_indexed_column = (constraint->iColumn >= 0) &&
> (constraint->iColumn < column_count);
> const int is_equality   = (constraint->op ==
> SQLITE_INDEX_CONSTRAINT_EQ);
>
> if (is_indexed_column && is_equality)
> {
> constraint_by_column[constraint->iColumn] =
> constraint_index;
>
> index_number |= (1 << constraint->iColumn);
> }
> }
> }
>
> int argument_index = 0;
>
> for (column_index = 0; column_index < column_count; ++column_index)
> {
> const int using_column = test_bit(index_number, column_index);
>
> if (using_column)
> {
> const int constraint_index =
> constraint_by_column[column_index];
>
> assert(constraint_index < index_info->nConstraint);
>
> struct sqlite3_index_constraint_usage * usage =
> _info->aConstraintUsage[constraint_index];
>
> usage->argvIndex = argument_index + 1;
> usage->omit  = 0;
>
> ++argument_index;
> }
> }
>
> index_info->estimatedCost = 1.0;
>
> for (column_index = 0; column_index < column_count; ++column_index)
> {
> const int using_column = test_bit(index_number, column_index);
>
> if (! using_column)
> {
> const double penalty = 10.0 + (column_count - column_index);
>
> index_info->estimatedCost *= penalty;
> }
> }
>
> index_info->idxNum = index_number;
> index_info->idxStr = "";
> index_info->needToFreeIdxStr = 0;
> index_info->orderByConsumed = 0;
>
> return SQLITE_OK;
> }
>
> static int repro_disconnect_destroy(sqlite3_vtab * pVTab)
> {
> 

[sqlite] Join of two virtual tables returns incorrect result set in 3.8.4.1

2014-04-03 Thread Hinrichsen, John
I am posting my C repro for the virtual table join issue inline, below:

/*
 * This repro is intended to demonstrate a possible bug when joining
 * two virtual table instances together: only a subset of the expected
 * rows is returned by sqlite.  As far as we can tell, it only happens
 * when the virtual tables are themselves backed by nontrivial sqlite
 * queries.
 *
 * This problem occurs with sqlite-3.8.4.1.
 * It does not occur with sqlite-3.8.3.1 or earlier versions.
 *
 * The output of the program first shows the result of a join using
 * two virtual table instances; the second result shows the output
 * of the equivalent join on two non-virtual tables created from
 * 'SELECT * FROM' the two respective virtual table instances.
 */

#include 
#include 
#include 
#include 

static sqlite3 * s_db = NULL;

static char * remove_optional_quotes(const char * const input)
{
char * ret;
const int len = strlen(input);

if (input[0] == input[len-1] && (input[0] == '"' || input[0] == '\''))
{
ret = sqlite3_malloc(len - 1);
strncpy(ret, input+1, len-2);
ret[len-2] = '\0';
}
else
{
ret = sqlite3_malloc(len + 1);
strncpy(ret, input, len);
ret[len] = '\0';
}
return ret;
}

struct test_vtab
{
sqlite3_vtab base;
char * query;
};

static int repro_create_connect(sqlite3 * db, void * pAux, int argc, const
char * const argv[], sqlite3_vtab ** ppVTab, char ** pzErr)
{
int err;
char * unquoted_schema;

* ppVTab = sqlite3_malloc(sizeof(struct test_vtab));
if (!* ppVTab)
return SQLITE_NOMEM;
memset(* ppVTab, 0, sizeof(** ppVTab));

struct test_vtab * vtab = (struct test_vtab *) * ppVTab;
vtab->query = remove_optional_quotes(argv[4]);

unquoted_schema = remove_optional_quotes(argv[3]);
err = sqlite3_declare_vtab(db, unquoted_schema);
sqlite3_free(unquoted_schema);

if (err != SQLITE_OK) {
sqlite3_free(vtab->query);
sqlite3_free(vtab);
* ppVTab = NULL;
return err;
}

return SQLITE_OK;
}

static int bitcount(unsigned int n)
{
int count = 0;
while (n)
{
++count;
n &= (n - 1);
}
return count;
}

static int test_bit(const int value, const int bit)
{
const int mask = 1 << bit;

return ((value & mask) == mask);
}

static int repro_best_index(sqlite3_vtab * pVTab, sqlite3_index_info *
index_info)
{
const int column_count = 2;

int index_number = 0, constraint_index, column_index;

int constraint_by_column[2];

for (constraint_index = 0; constraint_index < index_info->nConstraint;
++constraint_index)
{
const struct sqlite3_index_constraint * constraint =
_info->aConstraint[constraint_index];

const int is_usable = (constraint->usable != 0);

if (is_usable)
{
const int is_indexed_column = (constraint->iColumn >= 0) &&
(constraint->iColumn < column_count);
const int is_equality   = (constraint->op ==
SQLITE_INDEX_CONSTRAINT_EQ);

if (is_indexed_column && is_equality)
{
constraint_by_column[constraint->iColumn] =
constraint_index;

index_number |= (1 << constraint->iColumn);
}
}
}

int argument_index = 0;

for (column_index = 0; column_index < column_count; ++column_index)
{
const int using_column = test_bit(index_number, column_index);

if (using_column)
{
const int constraint_index = constraint_by_column[column_index];

assert(constraint_index < index_info->nConstraint);

struct sqlite3_index_constraint_usage * usage =
_info->aConstraintUsage[constraint_index];

usage->argvIndex = argument_index + 1;
usage->omit  = 0;

++argument_index;
}
}

index_info->estimatedCost = 1.0;

for (column_index = 0; column_index < column_count; ++column_index)
{
const int using_column = test_bit(index_number, column_index);

if (! using_column)
{
const double penalty = 10.0 + (column_count - column_index);

index_info->estimatedCost *= penalty;
}
}

index_info->idxNum = index_number;
index_info->idxStr = "";
index_info->needToFreeIdxStr = 0;
index_info->orderByConsumed = 0;

return SQLITE_OK;
}

static int repro_disconnect_destroy(sqlite3_vtab * pVTab)
{
struct test_vtab * vtab = (struct test_vtab *) pVTab;

sqlite3_free(vtab->query);
sqlite3_free(vtab);
return SQLITE_OK;
}

struct test_cursor
{
sqlite3_vtab_cursor base;
sqlite3_stmt * stmt;
int row;
};

static int repro_open(sqlite3_vtab * pVTab, sqlite3_vtab_cursor ** ppCursor)
{
struct test_cursor * cursor;

cursor = sqlite3_malloc(sizeof(struct test_cursor));
if (!cursor)
return SQLITE_NOMEM;
memset(cursor, 0, sizeof(* cursor));

* ppCursor = >base;

 

Re: [sqlite] Join of two virtual tables returns incorrect result set in 3.8.4.1

2014-04-02 Thread Donald Griggs
Attachments can't appear on this list.   You can use a shared file service
and post a link, or for smallish amounts of text use something like
pastbin.com.


On Wed, Apr 2, 2014 at 6:42 PM, Andy Goth  wrote:

> On 4/2/2014 4:52 PM, Hinrichsen, John wrote:
>
>> sqlite 3.8.4.1 can return an incorrect result when joining two virtual
>> tables that are themselves based on underlying sqlite tables.
>>
>> This problem does not happen with sqlite 3.8.3.1 or earlier.
>>
>> Please see the attached repro.
>>
>
> Attachment appears to be missing.
>
> --
> Andy Goth | 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join of two virtual tables returns incorrect result set in 3.8.4.1

2014-04-02 Thread Andy Goth

On 4/2/2014 4:52 PM, Hinrichsen, John wrote:

sqlite 3.8.4.1 can return an incorrect result when joining two virtual
tables that are themselves based on underlying sqlite tables.

This problem does not happen with sqlite 3.8.3.1 or earlier.

Please see the attached repro.


Attachment appears to be missing.

--
Andy Goth | 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join of two virtual tables returns incorrect result set in 3.8.4.1

2014-04-02 Thread Hinrichsen, John
sqlite 3.8.4.1 can return an incorrect result when joining two virtual
tables that are themselves based on underlying sqlite tables.

This problem does not happen with sqlite 3.8.3.1 or earlier.

Please see the attached repro.

-- 

This message contains confidential information and is intended only for the 
individual named. If you are not the named addressee, you should not 
disseminate, distribute, alter or copy this e-mail. Please notify the 
sender immediately by e-mail if you have received this e-mail by mistake 
and delete this e-mail from your system. E-mail transmissions cannot be 
guaranteed to be secure or without error as information could be 
intercepted, corrupted, lost, destroyed, or arrive late or incomplete. The 
sender, therefore, does not accept liability for any errors or omissions in 
the contents of this message which arise during or as a result of e-mail 
transmission. If verification is required, please request a hard-copy 
version. This message is provided for information purposes and should not 
be construed as a solicitation or offer to buy or sell any securities or 
related financial instruments in any jurisdiction.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN failure in sqlite 3.8.1

2013-12-05 Thread Mark Brand
Hi,

I noticed that some of my views were not working with sqlite 3.8.1. I managed 
to isolate the problem in the simple test case below.

Then I discovered that the 3.8.2 pre-release seems to do it right. I suspect 
it was fixed along with  https://www.sqlite.org/src/tktview?name=c620261b5b
but I don't know for sure if this boils down to the same problem.

In this case, the problem seems to be triggered by the combination of a LEFT 
JOIN to an empty set together with a JOIN on the primary key column of a 
table, and can be strangely avoided with a CAST.

CREATE TABLE T (i INTEGER PRIMARY KEY);

INSERT INTO T (i) VALUES (1968);

--This join fails to return any rows in sqlite 3.8.1.
--Succeeds in 3.7.17.
--Succeeds in 3.8.2 2013-11-27 pre-release
SELECT A.i, typeof(T.i) t_type, typeof(A.i) a_type
FROM (SELECT 1968 i) A
LEFT JOIN (SELECT 1968 i WHERE 0<>0) B
ON B.i=A.i
JOIN  T
ON T.i = A.i
;

--But a CAST() makes the join succeed, even though both types are already 
integer.
SELECT A.i, typeof(T.i) t_type, typeof(A.i) a_type
FROM (SELECT 1968 i) A
LEFT JOIN (SELECT 1968 i WHERE 0<>0) B
ON B.i=A.i
JOIN  T
ON CAST(T.i AS INTEGER)=A.i
;

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-22 Thread TAUZIN Mathieu
Hi,

You're wrong.
I think I've found the bug. 
It is a parser issue.

According to their definition 
(http://sqlite.org/syntaxdiagrams.html#single-source) , Join sources (named 
single-source) are either : 
* a table or view with an optional alias and/or with an optional index
* a sub query with an optional alias
* a sub join (with no alias)

In SQLite parser.y source code we can find on line 496 the grammar rule 
handling those three cases (in the same order)


...
seltablist(A) ::= stl_prefix(X) nm(Y) dbnm(D) as(Z) indexed_opt(I) on_opt(N) 
using_opt(U). {
  A = sqlite3SrcListAppendFromTerm(pParse,X0,N,U);
  sqlite3SrcListIndexedBy(pParse, A, );
}

seltablist(A) ::= stl_prefix(X) LP select(S) RP
as(Z) on_opt(N) using_opt(U). {
A = sqlite3SrcListAppendFromTerm(pParse,X,0,0,,S,N,U);
  }

seltablist(A) ::= stl_prefix(X) LP seltablist(F) RP
as(Z) on_opt(N) using_opt(U). {
if( X==0 && Z.n==0 && N==0 && U==0 ){
  A = F;
}else{
  Select *pSubquery;
  sqlite3SrcListShiftJoinType(F);
  pSubquery = sqlite3SelectNew(pParse,0,F,0,0,0,0,0,0,0);
  A = sqlite3SrcListAppendFromTerm(pParse,X,0,0,,pSubquery,N,U);
}
  }
...


Case 1 and 2 are handled properly but as you can see the third definition (wich 
should deal with sub joins) contains mistakes :
#1 : It allows an as clause after the parenthesis
#2 : on the right of a join operator (else { ... }) it generates a 
subquery instead of merging F (which is a seltabList, not a sub query) with X 
into A.

Do you still think there is no issue here ?

I wish I could propose a fix but I have no skills in C/yacc.

Hope this will help anyway.

Thanks

-Message d'origine-
De : sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
De la part de Nico Williams
Envoyé : lundi 19 mars 2012 16:10
À : General Discussion of SQLite Database
Objet : Re: [sqlite] Join-source issue when using sub '(' join-source ')'

On Mon, Mar 19, 2012 at 10:02 AM, TAUZIN Mathieu <mtau...@cegid.fr> wrote:
> Thanks,
>
> This syntax works but it is not documented... it looks like a short hand for 
> a subquery, interesting !.

Join sources are like sub-queries.  Look at the syntax.

A sub-select specified in the join-source following the FROM clause in a simple 
SELECT statement is handled as if it was a table containing the data returned 
by executing the sub-select statement.  The docs could perhaps be clearer about 
this.  This sentence:

"A sub-select specified in the join-source following the FROM clause in a 
simple SELECT statement is handled as if it was a table containing the data 
returned by executing the sub-select statement. "

does hint at this, but maybe that's just because I think of  as 
a sort of sub-select.

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread Ryan Johnson

On 19/03/2012 12:07 PM, TAUZIN Mathieu wrote:

Thanks for your support !



SQL Ansi  (and every major DB 
SqlServer, 
Oracle) 
supports this syntax as described in SQLite documentation.



The parenthesis are here to enforce the priority of the joins.
FYI, sqlite3 does not respect parentheses as a way of enforcing join 
orders. If you want to bypass the optimizer, the accepted way (as with 
postgres, I believe) is to use a CROSS JOIN rather than the default 
[INNER] JOIN. You may want to test whether the above-mentioned engines 
actually respect parentheses for join ordering.


Whether the parenthesis as used should or should not introduce a scope, 
I'm not prepared to make any claims about...


Ryan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread TAUZIN Mathieu
Thanks for your support !



SQL Ansi<http://savage.net.au/SQL/sql-99.bnf.html#qualified%20join> (and every 
major DB 
SqlServer<http://msdn.microsoft.com/en-US/library/ms177634(v=sql.90).aspx>, 
Oracle<http://docs.oracle.com/cd/B13789_01/server.101/b10759/statements_10002.htm>)
 supports this syntax as described in SQLite documentation.



The parenthesis are here to enforce the priority of the joins.



If the subjoined objects are no longer accessible outside the parenthesis what 
would be the interest of such a syntax ?

Therefore this query works fine :


SELECT  Orders.OrderID
FROM (Customers
  INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID )
  INNER JOIN InternationalOrders
ON Orders.OrderID = InternationalOrders.OrderID
WHERE 'ALFKI' = Customers.CustomerID



Whereas that one doesn't :


SELECT  Orders.OrderID
FROM Customers
  INNER JOIN
(Orders
  INNER JOIN InternationalOrders
ON Orders.OrderID = InternationalOrders.OrderID)
  ON Customers.CustomerID = Orders.CustomerID
WHERE 'ALFKI' = Customers.CustomerID



So, sub-joins on the left of the join-type are OK

but sub-joins on the right of the joint-type are NOT OK.



It seems like a bug to me.



Mathieu



-Message d'origine-
De : sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
De la part de Jay A. Kreibich
Envoyé : lundi 19 mars 2012 16:26
À : General Discussion of SQLite Database
Objet : Re: [sqlite] Join-source issue when using sub '(' join-source ')'



On Mon, Mar 19, 2012 at 12:03:44PM +, TAUZIN Mathieu scratched on the wall:

> Hi,

>

> According to the documentation on SELECT statements

> http://www.sqlite.org/lang_select.html

> It seems possible to write join chains as A join (B join C).

> (using a '(' join-source ')' single-source )

>

> But on the well known NorthwindEF database this query ...

>

> SELECT Orders.OrderID

> FROM Customers

> INNER JOIN

> (Orders

> LEFT OUTER JOIN InternationalOrders

> ON Orders.OrderID = InternationalOrders.OrderID

> )

> ON Customers.CustomerID = Orders.CustomerID WHERE 'ALFKI' =

> Customers.CustomerID

>

> ... raises an error :

> no such column: Orders.OrderID



  This does appear to be a bug.



  You can get around this using an AS clause to name the

  sub-expression:



SELECT OrdInt.OrderID

  FROM Customers

  INNER JOIN (

Orders

   LEFT OUTER JOIN

   InternationalOrders

   ON Orders.OrderID = InternationalOrders.OrderID

  ) AS OrdInt

  ON Customers.CustomerID = OrdInt.CustomerID

WHERE 'ALFKI' = OrdInt.CustomerID



  The thing is, you're not supposed to need to name a sub-expression.

  In fact, according the the "single-source" syntax diagram, naming a

  sub-expression (via AS) isn't even allowed.



  To be clear, a sub-*select* that is used as a source can be (and, in

  fact, must be) named to access it outside of the sub-select, but a

  sub-*expression*-- where the parenthesis only serve to enforce order

  of operations-- should expose the contained tables, just as if the

  parenthesis were not there.



   -j



--

Jay A. Kreibich < J A Y  @  K R E I B I.C H >



"Intelligence is like underwear: it is important that you have it,  but showing 
it to the wrong people has the tendency to make them  feel uncomfortable." -- 
Angela Johnson ___

sqlite-users mailing list

sqlite-users@sqlite.org<mailto:sqlite-users@sqlite.org>

http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread Jay A. Kreibich
On Mon, Mar 19, 2012 at 12:03:44PM +, TAUZIN Mathieu scratched on the wall:
> Hi,
> 
> According to the documentation on SELECT statements
> http://www.sqlite.org/lang_select.html
> It seems possible to write join chains as A join (B join C).
> (using a '(' join-source ')' single-source )
> 
> But on the well known NorthwindEF database this query ...
> 
> SELECT Orders.OrderID
> FROM Customers
> INNER JOIN
> (Orders
> LEFT OUTER JOIN InternationalOrders
> ON Orders.OrderID = InternationalOrders.OrderID
> )
> ON Customers.CustomerID = Orders.CustomerID
> WHERE 'ALFKI' = Customers.CustomerID
> 
> ... raises an error :
> no such column: Orders.OrderID

  This does appear to be a bug.

  You can get around this using an AS clause to name the
  sub-expression:

SELECT OrdInt.OrderID
  FROM Customers
  INNER JOIN (
Orders
LEFT OUTER JOIN 
InternationalOrders
ON Orders.OrderID = InternationalOrders.OrderID
  ) AS OrdInt
  ON Customers.CustomerID = OrdInt.CustomerID
 WHERE 'ALFKI' = OrdInt.CustomerID

  The thing is, you're not supposed to need to name a sub-expression.
  In fact, according the the "single-source" syntax diagram, naming a
  sub-expression (via AS) isn't even allowed.

  To be clear, a sub-*select* that is used as a source can be (and, in
  fact, must be) named to access it outside of the sub-select, but a
  sub-*expression*-- where the parenthesis only serve to enforce order
  of operations-- should expose the contained tables, just as if the
  parenthesis were not there.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread Nico Williams
On Mon, Mar 19, 2012 at 10:02 AM, TAUZIN Mathieu  wrote:
> Thanks,
>
> This syntax works but it is not documented... it looks like a short hand for 
> a subquery, interesting !.

Join sources are like sub-queries.  Look at the syntax.

A sub-select specified in the join-source following the FROM clause in
a simple SELECT statement is handled as if it was a table containing
the data returned by executing the sub-select statement.  The docs
could perhaps be clearer about this.  This sentence:

"A sub-select specified in the join-source following the FROM clause
in a simple SELECT statement is handled as if it was a table
containing the data returned by executing the sub-select statement. "

does hint at this, but maybe that's just because I think of
 as a sort of sub-select.

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread TAUZIN Mathieu
Thanks,

This syntax works but it is not documented... it looks like a short hand for a 
subquery, interesting !.

Mathieu

-Message d'origine-
De : sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
De la part de Simon Davies
Envoyé : lundi 19 mars 2012 14:24
À : General Discussion of SQLite Database
Objet : Re: [sqlite] Join-source issue when using sub '(' join-source ')'

On 19 March 2012 13:05, Pavel Ivanov <paiva...@gmail.com> wrote:
>> According to the documentation on SELECT statements 
>> http://www.sqlite.org/lang_select.html
>> It seems possible to write join chains as A join (B join C). (using a 
>> '(' join-source ')' single-source )
> ...
>> It seems that parsing is ok (no syntax error) but sources in the sub join 
>> can't be used outside the parenthesis.
>>
>> Could you confirm this is a bug ? or did I miss something ?
>
> So there's no problem in documentation and you can indeed write join 
> chains using parenthesis. Also there's no bug in SQLite because 
> (according to documentation) join-source with parenthesis is 
> considered a single-source and as a consequence you can't link to some 
> details of this single-source outside of parenthesis. Maybe MS SQL 
> Server and Oracle provide an extension to do that, but SQLite doesn't.
>
> So we can't confirm a bug and you didn't miss anything. You can file a 
> bug with the application generating your original query.
>
>
> Pavel

Maybe what the application should be doing:

SELECT subjoin.OrderID
FROM Customers
INNER JOIN
(Orders
LEFT OUTER JOIN InternationalOrders
ON Orders.OrderID = InternationalOrders.OrderID
) AS subjoin
ON Customers.CustomerID = subjoin.CustomerID WHERE 'ALFKI' = 
Customers.CustomerID

Regards,
Simon
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread Simon Davies
On 19 March 2012 13:05, Pavel Ivanov  wrote:
>> According to the documentation on SELECT statements 
>> http://www.sqlite.org/lang_select.html
>> It seems possible to write join chains as A join (B join C). (using a '(' 
>> join-source ')' single-source )
> ...
>> It seems that parsing is ok (no syntax error) but sources in the sub join 
>> can't be used outside the parenthesis.
>>
>> Could you confirm this is a bug ? or did I miss something ?
>
> So there's no problem in documentation and you can indeed write join
> chains using parenthesis. Also there's no bug in SQLite because
> (according to documentation) join-source with parenthesis is
> considered a single-source and as a consequence you can't link to some
> details of this single-source outside of parenthesis. Maybe MS SQL
> Server and Oracle provide an extension to do that, but SQLite doesn't.
>
> So we can't confirm a bug and you didn't miss anything. You can file a
> bug with the application generating your original query.
>
>
> Pavel

Maybe what the application should be doing:

SELECT subjoin.OrderID
FROM Customers
INNER JOIN
(Orders
LEFT OUTER JOIN InternationalOrders
ON Orders.OrderID = InternationalOrders.OrderID
) AS subjoin
ON Customers.CustomerID = subjoin.CustomerID
WHERE 'ALFKI' = Customers.CustomerID

Regards,
Simon
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread Simon Slavin

On 19 Mar 2012, at 12:51pm, TAUZIN Mathieu  wrote:

> Thanks for your response but my intent was to give rise to either an bug on 
> SQLite engine or an error (or maybe lack of precision) in the documentation.

SQLite is fine.  The documentation is accurate about what SQLite does.

> The sample I gave is a simplified version of the real query which is built by 
> a tool so I have not the choice on the form.

If your tool is intended for SQLite, and builds this form for SQLite, then 
there is a bug in your tool.  You can submit this bug with the authors of that 
tool.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread Pavel Ivanov
> According to the documentation on SELECT statements 
> http://www.sqlite.org/lang_select.html
> It seems possible to write join chains as A join (B join C). (using a '(' 
> join-source ')' single-source )
...
> It seems that parsing is ok (no syntax error) but sources in the sub join 
> can't be used outside the parenthesis.
>
> Could you confirm this is a bug ? or did I miss something ?

So there's no problem in documentation and you can indeed write join
chains using parenthesis. Also there's no bug in SQLite because
(according to documentation) join-source with parenthesis is
considered a single-source and as a consequence you can't link to some
details of this single-source outside of parenthesis. Maybe MS SQL
Server and Oracle provide an extension to do that, but SQLite doesn't.

So we can't confirm a bug and you didn't miss anything. You can file a
bug with the application generating your original query.


Pavel


On Mon, Mar 19, 2012 at 8:03 AM, TAUZIN Mathieu  wrote:
> Hi,
>
> According to the documentation on SELECT statements 
> http://www.sqlite.org/lang_select.html
> It seems possible to write join chains as A join (B join C). (using a '(' 
> join-source ')' single-source )
>
> But on the well known NorthwindEF database this query ...
>
> SELECT Orders.OrderID
> FROM Customers
> INNER JOIN
> (Orders
> LEFT OUTER JOIN InternationalOrders
> ON Orders.OrderID = InternationalOrders.OrderID
> )
> ON Customers.CustomerID = Orders.CustomerID
> WHERE 'ALFKI' = Customers.CustomerID
>
> ... raises an error :
> no such column: Orders.OrderID
>
> It seems that parsing is ok (no syntax error) but sources in the sub join 
> can't be used outside the parenthesis.
>
> Could you confirm this is a bug ? or did I miss something ?
>
> I tried this query on other DB engines (SqlServer and Oracle) and it works 
> fine (producing the same execution plan than the equivalent queries below).
>
> I know I could rewrite my example with a sub query  ...
> SELECT      Useless.OrderID
> FROM Customers
>      INNER JOIN (
>            SELECT
>                  Orders.OrderID as OrderID,
>                  Orders.CustomerID as CustomerID
>            FROM Orders LEFT OUTER JOIN InternationalOrders ON Orders.OrderID 
> = InternationalOrders.OrderID
>      ) AS Useless
>      ON Customers.CustomerID = Useless.CustomerID
> WHERE 'ALFKI' = Customers.CustomerID
>
> Or without subjoin...
> SELECT      Orders.OrderID
> FROM Customers
>      INNER JOIN Orders
>            ON Customers.CustomerID = Orders.CustomerID
>      LEFT OUTER JOIN InternationalOrders
>            ON Orders.OrderID = InternationalOrders.OrderID
> WHERE 'ALFKI' = Customers.CustomerID
>
> But it illustrates the issue.
>
> Thanks,
>
> Mathieu TAUZIN
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread TAUZIN Mathieu
Thanks for your response but my intent was to give rise to either an bug on 
SQLite engine or an error (or maybe lack of precision) in the documentation.

The sample I gave is a simplified version of the real query which is built by a 
tool so I have not the choice on the form.

Mathieu.

-Message d'origine-
De : sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
De la part de Simon Slavin
Envoyé : lundi 19 mars 2012 13:20
À : General Discussion of SQLite Database
Objet : Re: [sqlite] Join-source issue when using sub '(' join-source ')'


On 19 Mar 2012, at 12:03pm, TAUZIN Mathieu <mtau...@cegid.fr> wrote:

> Or without subjoin...
> SELECT  Orders.OrderID
> FROM Customers
>  INNER JOIN Orders
>ON Customers.CustomerID = Orders.CustomerID
>  LEFT OUTER JOIN InternationalOrders
>ON Orders.OrderID = InternationalOrders.OrderID WHERE 
> 'ALFKI' = Customers.CustomerID

This form should work fine in SQLite and is the expected way to do JOINs to 
many files.  It is just as efficient as subjoins in SQLite: the engine ends up 
doing no more work.  And it has the advantage that it works in many different 
versions of SQL since it's part of the SQL definition.  Unless you have a 
particular reason for wanting to avoid this form, I'd say go ahead and use it.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

>>Original message :
>>
>>Hi,
>>
>>According to the documentation on SELECT statements 
>>http://www.sqlite.org/lang_select.html
>>It seems possible to write join chains as A join (B join C). (using a '(' 
>>join-source ')' single-source )
>>
>>But on the well known NorthwindEF database this query ...
>>
>>SELECT Orders.OrderID
>>FROM Customers
>>INNER JOIN
>>(Orders
>>LEFT OUTER JOIN InternationalOrders
>>ON Orders.OrderID = InternationalOrders.OrderID
>>)
>>ON Customers.CustomerID = Orders.CustomerID WHERE 'ALFKI' = 
>>Customers.CustomerID
>>
>>... raises an error :
>>no such column: Orders.OrderID
>>
>>It seems that parsing is ok (no syntax error) but sources in the sub join 
>>can't be used outside the parenthesis.
>>
>>Could you confirm this is a bug ? or did I miss something ?
>>
>>I tried this query on other DB engines (SqlServer and Oracle) and it works 
>>fine (producing the same execution plan than the equivalent queries below).
>>
>>I know I could rewrite my example with a sub query  ...
>>SELECT  Useless.OrderID
>>FROM Customers
>>  INNER JOIN (
>>SELECT
>>  Orders.OrderID as OrderID,
>>  Orders.CustomerID as CustomerID
>>FROM Orders LEFT OUTER JOIN InternationalOrders ON Orders.OrderID 
>> = InternationalOrders.OrderID
>>  ) AS Useless
>>  ON Customers.CustomerID = Useless.CustomerID WHERE 'ALFKI' = 
>> Customers.CustomerID
>>
>>Or without subjoin...
>>SELECT  Orders.OrderID
>>FROM Customers
>>  INNER JOIN Orders
>>ON Customers.CustomerID = Orders.CustomerID
>>  LEFT OUTER JOIN InternationalOrders
>>ON Orders.OrderID = InternationalOrders.OrderID WHERE 'ALFKI' = 
>> Customers.CustomerID
>>
>>But it illustrates the issue.
>>
>>Thanks,
>>
>>Mathieu TAUZIN
>>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread Simon Slavin

On 19 Mar 2012, at 12:03pm, TAUZIN Mathieu  wrote:

> Or without subjoin...
> SELECT  Orders.OrderID
> FROM Customers
>  INNER JOIN Orders
>ON Customers.CustomerID = Orders.CustomerID
>  LEFT OUTER JOIN InternationalOrders
>ON Orders.OrderID = InternationalOrders.OrderID
> WHERE 'ALFKI' = Customers.CustomerID

This form should work fine in SQLite and is the expected way to do JOINs to 
many files.  It is just as efficient as subjoins in SQLite: the engine ends up 
doing no more work.  And it has the advantage that it works in many different 
versions of SQL since it's part of the SQL definition.  Unless you have a 
particular reason for wanting to avoid this form, I'd say go ahead and use it.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join-source issue when using sub '(' join-source ')'

2012-03-19 Thread TAUZIN Mathieu
Hi,

According to the documentation on SELECT statements 
http://www.sqlite.org/lang_select.html
It seems possible to write join chains as A join (B join C). (using a '(' 
join-source ')' single-source )

But on the well known NorthwindEF database this query ...

SELECT Orders.OrderID
FROM Customers
INNER JOIN
(Orders
LEFT OUTER JOIN InternationalOrders
ON Orders.OrderID = InternationalOrders.OrderID
)
ON Customers.CustomerID = Orders.CustomerID
WHERE 'ALFKI' = Customers.CustomerID

... raises an error :
no such column: Orders.OrderID

It seems that parsing is ok (no syntax error) but sources in the sub join can't 
be used outside the parenthesis.

Could you confirm this is a bug ? or did I miss something ?

I tried this query on other DB engines (SqlServer and Oracle) and it works fine 
(producing the same execution plan than the equivalent queries below).

I know I could rewrite my example with a sub query  ...
SELECT  Useless.OrderID
FROM Customers
  INNER JOIN (
SELECT
  Orders.OrderID as OrderID,
  Orders.CustomerID as CustomerID
FROM Orders LEFT OUTER JOIN InternationalOrders ON Orders.OrderID = 
InternationalOrders.OrderID
  ) AS Useless
  ON Customers.CustomerID = Useless.CustomerID
WHERE 'ALFKI' = Customers.CustomerID

Or without subjoin...
SELECT  Orders.OrderID
FROM Customers
  INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID
  LEFT OUTER JOIN InternationalOrders
ON Orders.OrderID = InternationalOrders.OrderID
WHERE 'ALFKI' = Customers.CustomerID

But it illustrates the issue.

Thanks,

Mathieu TAUZIN

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN on between

2012-01-11 Thread Igor Tandetnik

On 1/11/2012 3:49 PM, Pawl wrote:

select errorapi.*,login.* from errorapi JOIN login on (errorapi.start
between login.start and login.ende)

This command show only one end record. It is possible to use JOINS?


I don't see anything wrong with the query. If it only reports one 
record, that must be because there's only one pair of records (one from 
errorapi and the other from login) that satisfies the condition. Check 
your data; in particular, verify that login.ende is populated correctly.

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN on between

2012-01-11 Thread Pawl

Hi,

I need to join two table according to range of date. 

Fist table is error log table second is users{operators} login. 
I need to show complete table with all error with actual user id.

CREATE TABLE [login] (
[id] iNTEGER  NOT NULL,
[start] TIMESTAMP DEFAULT (datetime('now','localtime')) NOT NULL,
[ende] TIMESTAMP  NULL,
FOREIGN KEY(id) REFERENCES operators(id)
)

CREATE TABLE [errorapi] (
[start] TIMESTAMP DEFAULT (datetime('now','localtime')) NOT NULL,
[konec] TIMESTAMP,
[describe] TEXT  NOT NULL,
[track] TEXT,
[account] TEXT,
[priority] INTEGER
)

select errorapi.*,login.* from errorapi JOIN login on (errorapi.start
between login.start and login.ende)

This command show only one end record. It is possible to use JOINS? 

I don't want to add operator_id as columb, because I have more tables when I
want to show this.

Lot of thanks, 

Pavel Samek

-- 
View this message in context: 
http://old.nabble.com/JOIN-on-between-tp33124032p33124032.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs IN

2011-11-11 Thread Pavel Ivanov
On Fri, Nov 11, 2011 at 8:06 PM, Petite Abeille
 wrote:
>> No, "exists" in this case will change query plan significantly and
>> performance can degrade drastically as a result.
>
> Why would that be?

How would you rewrite the query using exists? The only thing I have in mind is

SELECT * FROM table2
WHERE exists (
SELECT 1 FROM table1
WHERE amount > 500
AND table1.rowid = table2.rowid)

And this query will force SQLite to use the second query plan I talked
about: scan full table2 and for each row search in table1 for rows
with the same rowid and check if it has necessary amount. And this
plan will very likely be slower.


Pavel
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs IN

2011-11-11 Thread Darren Duncan

Fabian wrote:

Suppose I have two tables, and I want to have look for a value in the first
table, but display the columns from the second table. The most obvious way
would be joining them on rowid. But I don't need to SELECT any columns from
the first table, and it's a FTS4 table (which always joins a bit slower
than real tables), so I currently do:

SELECT * FROM table2 WHERE rowid IN (SELECT rowid FROM table1 WHERE amount

500)


It returns the same results, but it doesn't seem much faster. Is there any
performance difference to be expected from using IN instead of JOIN, or
does SQLite internally rewrite JOIN queries to something similar as IN,
which would explain they perform nearly the same?


The logical operation you are doing is a semijoin, filtering table2 by matching 
rows in table1 (if you used NOT IN instead you would be doing an antijoin).


A semijoin is most clearly expressed in SQL using the WHERE clause as you did, 
because the only purpose of table1 is to filter and not to return values from, 
as putting it in FROM would imply.


Now because SQL is bag oriented rather than set oriented, using IN also helps 
because you avoid generating extra duplicates, whereas if you used the join 
method instead, then if any row in one table matched multiple rows in the other 
(because you weren't joining on a (unique) key of both tables), the result could 
have duplicate table2 rows, which probably isn't what you want.


As to your performance question, any good DBMS should make both of your methods 
perform about the same, but that if they aren't the same, the IN version should 
always perform faster than the FROM version because with IN you only ever have 
to look at each row in table2 once; as soon as it finds any match you move on, 
rather than repeating for all possible matches.


Note that semijoins and antijoins are what you have both when you have another 
select after the NOT/IN and when you have a literal list, such as "IN (1,2,3)".


Note that any WHERE clause that consists just of ANDed equality tests, such as 
the common "WHERE foo = 3" is also a trivial case of a semijoin where the table 
you are filtering on has exactly 1 row whose field value is 3, and ostensibly 
such WHERE clauses should also be optimizable.


-- Darren Duncan
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs IN

2011-11-11 Thread Petite Abeille

On Nov 12, 2011, at 1:58 AM, Pavel Ivanov wrote:

> No, "exists" in this case will change query plan significantly and
> performance can degrade drastically as a result.

Why would that be?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs IN

2011-11-11 Thread Pavel Ivanov
On Fri, Nov 11, 2011 at 11:58 AM, Petite Abeille
 wrote:
>> It returns the same results, but it doesn't seem much faster. Is there any
>> performance difference to be expected from using IN instead of JOIN, or
>> does SQLite internally rewrite JOIN queries to something similar as IN,
>> which would explain they perform nearly the same?
>
> They should be equivalent in terms of cost. That said, you might want to use 
> the 'exists'  clause instead for the sake of clarity.

No, "exists" in this case will change query plan significantly and
performance can degrade drastically as a result.

For the original question: it's not that SQLite rewrites JOIN queries
to be as IN. It's just in your particular case both queries can be
executed in the same way: find all rows in table1 with the necessary
amount, for each row look into table2 and find rows with the same
rowid. Query with JOIN however could be executed differently - for
each row in table2 find all rows with the same rowid in table1 and
then check amount in them. SQLite decided that this query plan will be
less effecient.


Pavel
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN vs IN

2011-11-11 Thread Fabian
Suppose I have two tables, and I want to have look for a value in the first
table, but display the columns from the second table. The most obvious way
would be joining them on rowid. But I don't need to SELECT any columns from
the first table, and it's a FTS4 table (which always joins a bit slower
than real tables), so I currently do:

SELECT * FROM table2 WHERE rowid IN (SELECT rowid FROM table1 WHERE amount
> 500)

It returns the same results, but it doesn't seem much faster. Is there any
performance difference to be expected from using IN instead of JOIN, or
does SQLite internally rewrite JOIN queries to something similar as IN,
which would explain they perform nearly the same?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join execution doubt

2011-05-07 Thread Lucas Cotta
Hi!

I'm studying the join execution...

suppose a join with two tables without indexes..

is there any case at all where join will do a complete inner loop for each
line in the outer loop?
Because it seems it always build a temp table to autoindex the inner
table..

thanks!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-18 Thread Nicolas Williams
On Tue, Jan 18, 2011 at 10:13:10PM +0100, Florian Weimer wrote:
> * Richard Hipp:
> 
> > I don't think it makes sense in SQL (not just SQLite but SQL in
> > general) for an aggregate query to return columns that are not in
> > the GROUP BY clause.
> 
> Isn't this just what PostgreSQL implements as DISTINCT ON?  Then it
> *is* useful.

DISTINCT and DISTINCT ON effectively map to GROUP BY, no?  DISTINCT ON
explicitly results in unpredictable results unless an ORDER BY clause is
also used.

Incidentally, if there is a UNIQUE constraint for the {grouping columns}
_and_ the non-aggregated/non-grouped columns, then such a query also
makes sense.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-18 Thread Florian Weimer
* Richard Hipp:

> I don't think it makes sense in SQL (not just SQLite but SQL in
> general) for an aggregate query to return columns that are not in
> the GROUP BY clause.

Isn't this just what PostgreSQL implements as DISTINCT ON?  Then it
*is* useful.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread David Burström
Dan, Richard, Igor,

thanks for your input, and yes, it seems as if the gamble is no longer 
safe. Hopefully I'm the only one that has run into this side effect ;)

:-David

On 01/17/2011 04:57 PM, Igor Tandetnik wrote:
> David Burström  wrote:
>> SELECT starttime, endtime from entry LEFT JOIN interval ON
>> interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
>> MAX(starttime);
> The behavior of this statement is unspecified. In standard SQL, it is 
> syntactically invalid - in a GROUP BY statement, all columns that appear in 
> SELECT or HAVING clauses must be either mentioned in GROUP BY, or appear only 
> in expressions that are arguments of aggregate functions.
>
> As an extension, SQLite allows ungrouped columns in these contexts. The value 
> of such an expression is the column value taken from some random, unspecified 
> row belonging to the group.
>
> Thus, your condition of "HAVING starttime = MAX(starttime)" is a gamble. You 
> are saying, pick some random row from the group, and if that row just happens 
> to contain the largest value across the group, then include the group in the 
> resultset, otherwise throw it away. With 3.6.22, you got lucky and won the 
> gamble. With 3.7.2, the implementation details have changed so that a 
> different row is now being picked, and you are losing the gamble (and blaming 
> the roulette wheel).

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread Igor Tandetnik
David Burström  wrote:
> SELECT starttime, endtime from entry LEFT JOIN interval ON
> interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
> MAX(starttime);

The behavior of this statement is unspecified. In standard SQL, it is 
syntactically invalid - in a GROUP BY statement, all columns that appear in 
SELECT or HAVING clauses must be either mentioned in GROUP BY, or appear only 
in expressions that are arguments of aggregate functions.

As an extension, SQLite allows ungrouped columns in these contexts. The value 
of such an expression is the column value taken from some random, unspecified 
row belonging to the group.

Thus, your condition of "HAVING starttime = MAX(starttime)" is a gamble. You 
are saying, pick some random row from the group, and if that row just happens 
to contain the largest value across the group, then include the group in the 
resultset, otherwise throw it away. With 3.6.22, you got lucky and won the 
gamble. With 3.7.2, the implementation details have changed so that a different 
row is now being picked, and you are losing the gamble (and blaming the 
roulette wheel).
-- 
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread Black, Michael (IS)
Hmmm...my initial testing on 3.7.4 worked...so here's how to reproduce.  
Analyze fixes it.
Why does the first entry show 1,000,000 rows???
 
It appears the automatic index isn't working in this case
 
Before analyze
0|0|0|SCAN TABLE entry USING INTEGER PRIMARY KEY (~100 rows)
0|1|1|SEARCH TABLE interval USING AUTOMATIC COVERING INDEX (entryid=?) (~7 rows)
After analyze
3|1
0|0|0|SCAN TABLE entry USING INTEGER PRIMARY KEY (~1 rows)
0|1|1|SCAN TABLE interval (~2 rows)
 
CREATE TABLE interval (endtime INTEGER, entryid INTEGER, starttime INTEGER);
INSERT INTO "interval" VALUES(2,42,1);
INSERT INTO "interval" VALUES(1,42,3);
CREATE TABLE entry (id INTEGER PRIMARY KEY AUTOINCREMENT);
INSERT INTO "entry" VALUES(42);
DELETE FROM sqlite_sequence;
INSERT INTO "sqlite_sequence" VALUES('entry',42);
select 'Before analyze';
SELECT starttime, endtime from entry LEFT JOIN interval ON
interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
MAX(starttime);
explain query plan SELECT starttime, endtime from entry LEFT JOIN interval ON
interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
MAX(starttime);
select 'After analyze';
analyze;
SELECT starttime, endtime from entry LEFT JOIN interval ON
interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
MAX(starttime);
explain query plan SELECT starttime, endtime from entry LEFT JOIN interval ON
interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
MAX(starttime);

 
Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems
 



From: sqlite-users-boun...@sqlite.org on behalf of David Burström
Sent: Mon 1/17/2011 9:37 AM
To: General Discussion of SQLite Database
Subject: EXTERNAL:Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22



I just tried the 3.7.4 binary on Linux, and the bug is still around.

:-David

On 01/17/2011 04:25 PM, Black, Michael (IS) wrote:
> This may be the patch that fixes your problem...
> http://www.sqlite.org/src/info/ece641eb89
>
> Was released in 3.7.3
>
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Northrop Grumman Information Systems
>
>
> 
>
> From: sqlite-users-boun...@sqlite.org on behalf of David Burström
> Sent: Mon 1/17/2011 6:47 AM
> To: sqlite-users@sqlite.org
> Subject: EXTERNAL:[sqlite] JOIN bug in 3.7.2, not in 3.6.22
>
>
>
> Hello all!
>
> I stumbled across this strange bug during Android development on 2.2.1
> late last night. Please run the following snippet in SQLite 3.7.2 and
> 3.6.22 to compare the differences. The comments shows what alterations
> you can make to make the query return the expected result.
>
>
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread Richard Hipp
On Mon, Jan 17, 2011 at 7:47 AM, David Burström wrote:

> Hello all!
>
> I stumbled across this strange bug during Android development on 2.2.1
> late last night. Please run the following snippet in SQLite 3.7.2 and
> 3.6.22 to compare the differences. The comments shows what alterations
> you can make to make the query return the expected result.
>


> SELECT starttime, endtime from entry LEFT JOIN interval ON
> interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
> MAX(starttime);
>

I don't think it makes sense in SQL (not just SQLite but SQL in general) for
an aggregate query to return columns that are not in the GROUP BY clause.
Certainly the values returned for those columns are undefined.  So I'm not
sure the query above makes any sense.  If it worked for you before, your
were  just lucky.

Perhaps something like this will work better for you:

SELECT
   entry.id,
   max(starttime),
   (SELECT endtime FROM interval  WHERE interval.entryid=entry.id)
FROM entry LEFT JOIN interval ON interval.entryid=entry.id
GROUP BY entry.id;




>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread Dan Kennedy


> -- if endtime is in a different position in the table, the query works
> CREATE TABLE interval (endtime INTEGER, entryid INTEGER, starttime INTEGER);
> CREATE TABLE entry (id INTEGER PRIMARY KEY AUTOINCREMENT);
>
> INSERT INTO entry (id) VALUES ( 42);
>
> INSERT INTO interval (endtime, entryid, starttime) VALUES (2, 42, 1);
> -- if the endtime below is greater than or equal to the endtime above,
> the query works
> INSERT INTO interval (endtime, entryid, starttime) VALUES (1, 42, 3);
> -- if endtime is removed from the projection, the query works
> -- if the LEFT JOIN is changed to INNER JOIN, the query works

> SELECT starttime, endtime from entry LEFT JOIN interval ON
> interval.entryid = entry.id GROUP BY entry.id HAVING starttime =
> MAX(starttime);

Strictly speaking, the result of the HAVING expression is undefined
here. As "starttime" is not an aggregate or a part of the GROUP BY
clause.

Dan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread David Burström
I just tried the 3.7.4 binary on Linux, and the bug is still around.

:-David

On 01/17/2011 04:25 PM, Black, Michael (IS) wrote:
> This may be the patch that fixes your problem...
> http://www.sqlite.org/src/info/ece641eb89
>
> Was released in 3.7.3
>
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Northrop Grumman Information Systems
>
>
> 
>
> From: sqlite-users-boun...@sqlite.org on behalf of David Burström
> Sent: Mon 1/17/2011 6:47 AM
> To: sqlite-users@sqlite.org
> Subject: EXTERNAL:[sqlite] JOIN bug in 3.7.2, not in 3.6.22
>
>
>
> Hello all!
>
> I stumbled across this strange bug during Android development on 2.2.1
> late last night. Please run the following snippet in SQLite 3.7.2 and
> 3.6.22 to compare the differences. The comments shows what alterations
> you can make to make the query return the expected result.
>
>
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread Black, Michael (IS)
This may be the patch that fixes your problem...
http://www.sqlite.org/src/info/ece641eb89
 
Was released in 3.7.3
 
Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems
 



From: sqlite-users-boun...@sqlite.org on behalf of David Burström
Sent: Mon 1/17/2011 6:47 AM
To: sqlite-users@sqlite.org
Subject: EXTERNAL:[sqlite] JOIN bug in 3.7.2, not in 3.6.22



Hello all!

I stumbled across this strange bug during Android development on 2.2.1
late last night. Please run the following snippet in SQLite 3.7.2 and
3.6.22 to compare the differences. The comments shows what alterations
you can make to make the query return the expected result.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN bug in 3.7.2, not in 3.6.22

2011-01-17 Thread David Burström
Hello all!

I stumbled across this strange bug during Android development on 2.2.1 
late last night. Please run the following snippet in SQLite 3.7.2 and 
3.6.22 to compare the differences. The comments shows what alterations 
you can make to make the query return the expected result.

:-David Burström



.header on
.mode column
.echo on

--
-- The query below is supposed to return one row, starttime = 3 and 
endtime = 1. In SQLite 3.622 it works. In SQLite 3.7.2 it returns 0 rows.
--

-- if endtime is in a different position in the table, the query works
CREATE TABLE interval (endtime INTEGER, entryid INTEGER, starttime INTEGER);
CREATE TABLE entry (id INTEGER PRIMARY KEY AUTOINCREMENT);

INSERT INTO entry (id) VALUES ( 42);

INSERT INTO interval (endtime, entryid, starttime) VALUES (2, 42, 1);
-- if the endtime below is greater than or equal to the endtime above, 
the query works
INSERT INTO interval (endtime, entryid, starttime) VALUES (1, 42, 3);
-- if endtime is removed from the projection, the query works
-- if the LEFT JOIN is changed to INNER JOIN, the query works
SELECT starttime, endtime from entry LEFT JOIN interval ON 
interval.entryid = entry.id GROUP BY entry.id HAVING starttime = 
MAX(starttime);

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] join performance query

2010-05-11 Thread Tim Romano
Let's try that again : expose the [number] column to the outer selection (**
are for emphasis**):


( select  id_song, **number** from


 (
 select  id_song, **number**
 from PLAYLIST_SONG
 where id_playlist=2
{and|or }  number > 258
) as MYPLAYLISTSONGS

Regards
Tim Romano

On Tue, May 11, 2010 at 12:46 PM, Tim Romano wrote:

> And you would put move your title-condition to the outer query:
>
> .
> .
> .
>  ) as SONGIDLIST
>  on SONG.id_song = SONGIDLIST.id_song
>
> where
> your title-condition and|or your  title-number condition
>
>
> Regards
> Tim Romano
>
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] join performance query

2010-05-11 Thread Tim Romano
And you would put move your title-condition to the outer query:

.
.
.
 ) as SONGIDLIST
 on SONG.id_song = SONGIDLIST.id_song

where
your title-condition and|or your  title-number condition


Regards
Tim Romano
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] join performance query

2010-05-11 Thread Tim Romano
Arrrgh, Google Chrome ate the top half of my reply.

You must also expose the number column in the inner query against
PLAYLIST_SONG; include your number-condition there and also specify the
number column in the select-list:

( select  id_song, number from

 (
 select  id_song from PLAYLIST_SONG
 where id_playlist=2
{AND|OR }  number > 258
) as MYPLAYLISTSONGS

Regards
Tim Romano



On Tue, May 11, 2010 at 12:39 PM, Tim Romano wrote:

> You could remove the title condition from the inner SONGS select, limiting
> your conditions to artist and genre; an index on column [artist] would make
> this subquery run quickly:
>
>
>  (
>  select id_song from
>  SONG
>  where genre_id = 0 AND artist = 'Las ketchup'
> //  AND title >= 'Asereje(karaoke version)'// --> moved to outer select
> > ) as MYSONGS
>
>
>
> The goal is to produce small inner subsets using indexes, and then to join
> these with each other, and to let the inner subsets expose the necessary
> columns to the outer query.
>
> Regards
> Tim Romano
>
>
>
> On Tue, May 11, 2010 at 11:13 AM, Andrea Galeazzi wrote:
>
>> Sorry but in your solution, how can I solve the condition
>> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
>> >> version)' OR number > 258)
>> ?
>> title is on song and number is song_number on Playlist_Song AS PS.
>> Furthermore I also need title and number in place of your  select * from
>> SONG
>> Could you write it again please?
>> Thanks
>>
>> Citando Tim Romano :
>>
>> > 1. Try discrete single-column indexes rather than multi-column composite
>> > indexes.
>> > 2. Try  breaking the query down into subsets expressed as parenthetical
>> > queries; you can treat these parenthetical queries as if they were
>> tables by
>> > assigning them an alias, and then you can join against the aliases.   I
>> have
>> > sped queries up in SQLite using this approach and, with a little
>> tinkering,
>> > the time can drop from over a minute to sub-second.   Performance will
>> > depend on the indexes and criteria used, of course. But this approach
>> lets
>> > you see how SQLite is optimizing the creation of the component sets from
>> > which you can build up your ultimate query.
>> > .
>> > select * from SONG
>> > JOIN
>> >
>> > ( select  id_song from
>> >
>> > (
>> > select  id_song from PLAYLIST_SONG
>> > where id_playlist=2
>> > ) as MYPLAYLISTSONGS
>> >
>> > JOIN
>> >
>> > (
>> > select id_song from
>> > SONG
>> > where genre_id = 0 AND artist = 'Las ketchup'
>> > AND title >= 'Asereje(karaoke version)'
>> > ) as MYSONGS
>> >
>> > on MYSONGS.id_song = MYPLAYLISTSONGS.id_song
>> >
>> >
>> > ) as SONGIDLIST
>> >
>> > on SONG.id_song = SONGIDLIST.id_song
>> >
>> >
>> > Regards
>> > Tim Romano
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Tue, May 11, 2010 at 6:07 AM, Andrea Galeazzi 
>> wrote:
>> >
>> >> Hi guys,
>> >> I'm in a bind for a huge time consuming query!
>> >> I made the following database schema:
>> >>
>> >> CREATE TABLE Song (
>> >>idINTEGER NOT NULL UNIQUE,
>> >>titleVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
>> >>artistVARCHAR(40) NOT NULL DEFAULT '' COLLATE
>> NOCASE,
>> >>genre_idINT NOT NULL DEFAULT 0,
>> >> PRIMARY KEY (id),
>> >>
>> >> CONSTRAINT fk_Genre FOREIGN KEY (genre_id)
>> >>REFERENCES Genre (id)
>> >>ON DELETE SET DEFAULT
>> >>ON UPDATE CASCADE);
>> >>
>> >> CREATE INDEX Song_Browse_View_idx ON Song(genre_id,artist,title);
>> >>
>> >> CREATE TABLE PlayList (
>> >>id INTEGER NOT NULL UNIQUE,
>> >>name   VARCHAR(15) NOT NULL COLLATE NOCASE, --KORGCOLLATE,
>> >>length INT NOT NULL DEFAULT 0,
>> >>created_date   TEXT,
>> >> PRIMARY KEY (id));
>> >>
>> >> CREATE TABLE PlayList_Song (
>> >>id_song INT NOT NULL,
>> >>id_playlist INT NOT NULL,
>> >>song_number INTEGER NOT NULL,
>> >> PRIMARY KEY (id_playlist, song_number),
>> >> CONSTRAINT fk_PlayList_Song1 FOREIGN KEY (id_song)
>> >>REFERENCES Song (id)
>> >>ON DELETE CASCADE
>> >>ON UPDATE CASCADE,
>> >> CONSTRAINT fk_PlayList_Song2 FOREIGN KEY (id_playlist)
>> >>REFERENCES PlayList (id)
>> >>ON DELETE CASCADE
>> >>ON UPDATE CASCADE);
>> >>
>> >> CREATE INDEX PlayList_Song_song_number_idx ON
>> PlayList_Song(song_number);
>> >>
>> >> Now I need to scroll title filtered by genre_id and artist both in Song
>> >> table and Playlist.
>> >> The query for the first case is very fast:
>> >> SELECT id AS number,title  FROM Song WHERE genre_id = 0 AND artist =
>> >> 'Las ketchup'
>> >> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
>> >> version)' OR number > 258)
>> >> ORDER BY title ASC , number ASC LIMIT 4;
>> >>
>> >> The second case is about 35 times slower... so the scrolling is quite
>> >> impossible (or useless)!
>> >> SELECT song_number AS number,title  FROM Song 

Re: [sqlite] join performance query

2010-05-11 Thread Tim Romano
You could remove the title condition from the inner SONGS select, limiting
your conditions to artist and genre; an index on column [artist] would make
this subquery run quickly:


 (
 select id_song from
 SONG
 where genre_id = 0 AND artist = 'Las ketchup'
//  AND title >= 'Asereje(karaoke version)'// --> moved to outer select
> ) as MYSONGS



The goal is to produce small inner subsets using indexes, and then to join
these with each other, and to let the inner subsets expose the necessary
columns to the outer query.

Regards
Tim Romano


On Tue, May 11, 2010 at 11:13 AM, Andrea Galeazzi  wrote:

> Sorry but in your solution, how can I solve the condition
> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
> >> version)' OR number > 258)
> ?
> title is on song and number is song_number on Playlist_Song AS PS.
> Furthermore I also need title and number in place of your  select * from
> SONG
> Could you write it again please?
> Thanks
>
> Citando Tim Romano :
>
> > 1. Try discrete single-column indexes rather than multi-column composite
> > indexes.
> > 2. Try  breaking the query down into subsets expressed as parenthetical
> > queries; you can treat these parenthetical queries as if they were tables
> by
> > assigning them an alias, and then you can join against the aliases.   I
> have
> > sped queries up in SQLite using this approach and, with a little
> tinkering,
> > the time can drop from over a minute to sub-second.   Performance will
> > depend on the indexes and criteria used, of course. But this approach
> lets
> > you see how SQLite is optimizing the creation of the component sets from
> > which you can build up your ultimate query.
> > .
> > select * from SONG
> > JOIN
> >
> > ( select  id_song from
> >
> > (
> > select  id_song from PLAYLIST_SONG
> > where id_playlist=2
> > ) as MYPLAYLISTSONGS
> >
> > JOIN
> >
> > (
> > select id_song from
> > SONG
> > where genre_id = 0 AND artist = 'Las ketchup'
> > AND title >= 'Asereje(karaoke version)'
> > ) as MYSONGS
> >
> > on MYSONGS.id_song = MYPLAYLISTSONGS.id_song
> >
> >
> > ) as SONGIDLIST
> >
> > on SONG.id_song = SONGIDLIST.id_song
> >
> >
> > Regards
> > Tim Romano
> >
> >
> >
> >
> >
> >
> > On Tue, May 11, 2010 at 6:07 AM, Andrea Galeazzi 
> wrote:
> >
> >> Hi guys,
> >> I'm in a bind for a huge time consuming query!
> >> I made the following database schema:
> >>
> >> CREATE TABLE Song (
> >>idINTEGER NOT NULL UNIQUE,
> >>titleVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
> >>artistVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
> >>genre_idINT NOT NULL DEFAULT 0,
> >> PRIMARY KEY (id),
> >>
> >> CONSTRAINT fk_Genre FOREIGN KEY (genre_id)
> >>REFERENCES Genre (id)
> >>ON DELETE SET DEFAULT
> >>ON UPDATE CASCADE);
> >>
> >> CREATE INDEX Song_Browse_View_idx ON Song(genre_id,artist,title);
> >>
> >> CREATE TABLE PlayList (
> >>id INTEGER NOT NULL UNIQUE,
> >>name   VARCHAR(15) NOT NULL COLLATE NOCASE, --KORGCOLLATE,
> >>length INT NOT NULL DEFAULT 0,
> >>created_date   TEXT,
> >> PRIMARY KEY (id));
> >>
> >> CREATE TABLE PlayList_Song (
> >>id_song INT NOT NULL,
> >>id_playlist INT NOT NULL,
> >>song_number INTEGER NOT NULL,
> >> PRIMARY KEY (id_playlist, song_number),
> >> CONSTRAINT fk_PlayList_Song1 FOREIGN KEY (id_song)
> >>REFERENCES Song (id)
> >>ON DELETE CASCADE
> >>ON UPDATE CASCADE,
> >> CONSTRAINT fk_PlayList_Song2 FOREIGN KEY (id_playlist)
> >>REFERENCES PlayList (id)
> >>ON DELETE CASCADE
> >>ON UPDATE CASCADE);
> >>
> >> CREATE INDEX PlayList_Song_song_number_idx ON
> PlayList_Song(song_number);
> >>
> >> Now I need to scroll title filtered by genre_id and artist both in Song
> >> table and Playlist.
> >> The query for the first case is very fast:
> >> SELECT id AS number,title  FROM Song WHERE genre_id = 0 AND artist =
> >> 'Las ketchup'
> >> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
> >> version)' OR number > 258)
> >> ORDER BY title ASC , number ASC LIMIT 4;
> >>
> >> The second case is about 35 times slower... so the scrolling is quite
> >> impossible (or useless)!
> >> SELECT song_number AS number,title  FROM Song AS S, Playlist_Song AS PS
> >> WHERE S.id = PS.id_song AND
> >> PS.id_playlist = 2 AND genre_id = 0 AND artist = 'Las ketchup'
> >> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
> >> version)' OR number > 959)
> >> ORDER BY title ASC , number ASC LIMIT 4;
> >>
> >> I also execute the EXPLAIN QUERY PLAN:
> >> 1st query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
> >>
> >> 2nd query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
> >>  1 1  TABLE Playlist_Song AS PS
> >> So it seems that the second plan (1,1) requires very long time!
> >> How can I 

Re: [sqlite] join performance query

2010-05-11 Thread Andrea Galeazzi
Sorry but in your solution, how can I solve the condition
AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
>> version)' OR number > 258)
?
title is on song and number is song_number on Playlist_Song AS PS.
Furthermore I also need title and number in place of your  select * from SONG
Could you write it again please?
Thanks

Citando Tim Romano :

> 1. Try discrete single-column indexes rather than multi-column composite
> indexes.
> 2. Try  breaking the query down into subsets expressed as parenthetical
> queries; you can treat these parenthetical queries as if they were tables by
> assigning them an alias, and then you can join against the aliases.   I have
> sped queries up in SQLite using this approach and, with a little tinkering,
> the time can drop from over a minute to sub-second.   Performance will
> depend on the indexes and criteria used, of course. But this approach lets
> you see how SQLite is optimizing the creation of the component sets from
> which you can build up your ultimate query.
> .
> select * from SONG
> JOIN
>
> ( select  id_song from
>
> (
> select  id_song from PLAYLIST_SONG
> where id_playlist=2
> ) as MYPLAYLISTSONGS
>
> JOIN
>
> (
> select id_song from
> SONG
> where genre_id = 0 AND artist = 'Las ketchup'
> AND title >= 'Asereje(karaoke version)'
> ) as MYSONGS
>
> on MYSONGS.id_song = MYPLAYLISTSONGS.id_song
>
>
> ) as SONGIDLIST
>
> on SONG.id_song = SONGIDLIST.id_song
>
>
> Regards
> Tim Romano
>
>
>
>
>
>
> On Tue, May 11, 2010 at 6:07 AM, Andrea Galeazzi  wrote:
>
>> Hi guys,
>> I'm in a bind for a huge time consuming query!
>> I made the following database schema:
>>
>> CREATE TABLE Song (
>>idINTEGER NOT NULL UNIQUE,
>>titleVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
>>artistVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
>>genre_idINT NOT NULL DEFAULT 0,
>> PRIMARY KEY (id),
>>
>> CONSTRAINT fk_Genre FOREIGN KEY (genre_id)
>>REFERENCES Genre (id)
>>ON DELETE SET DEFAULT
>>ON UPDATE CASCADE);
>>
>> CREATE INDEX Song_Browse_View_idx ON Song(genre_id,artist,title);
>>
>> CREATE TABLE PlayList (
>>id INTEGER NOT NULL UNIQUE,
>>name   VARCHAR(15) NOT NULL COLLATE NOCASE, --KORGCOLLATE,
>>length INT NOT NULL DEFAULT 0,
>>created_date   TEXT,
>> PRIMARY KEY (id));
>>
>> CREATE TABLE PlayList_Song (
>>id_song INT NOT NULL,
>>id_playlist INT NOT NULL,
>>song_number INTEGER NOT NULL,
>> PRIMARY KEY (id_playlist, song_number),
>> CONSTRAINT fk_PlayList_Song1 FOREIGN KEY (id_song)
>>REFERENCES Song (id)
>>ON DELETE CASCADE
>>ON UPDATE CASCADE,
>> CONSTRAINT fk_PlayList_Song2 FOREIGN KEY (id_playlist)
>>REFERENCES PlayList (id)
>>ON DELETE CASCADE
>>ON UPDATE CASCADE);
>>
>> CREATE INDEX PlayList_Song_song_number_idx ON PlayList_Song(song_number);
>>
>> Now I need to scroll title filtered by genre_id and artist both in Song
>> table and Playlist.
>> The query for the first case is very fast:
>> SELECT id AS number,title  FROM Song WHERE genre_id = 0 AND artist =
>> 'Las ketchup'
>> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
>> version)' OR number > 258)
>> ORDER BY title ASC , number ASC LIMIT 4;
>>
>> The second case is about 35 times slower... so the scrolling is quite
>> impossible (or useless)!
>> SELECT song_number AS number,title  FROM Song AS S, Playlist_Song AS PS
>> WHERE S.id = PS.id_song AND
>> PS.id_playlist = 2 AND genre_id = 0 AND artist = 'Las ketchup'
>> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
>> version)' OR number > 959)
>> ORDER BY title ASC , number ASC LIMIT 4;
>>
>> I also execute the EXPLAIN QUERY PLAN:
>> 1st query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
>>
>> 2nd query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
>>  1 1  TABLE Playlist_Song AS PS
>> So it seems that the second plan (1,1) requires very long time!
>> How can I optimized a such kind of query?
>> Cheers
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] join performance query

2010-05-11 Thread Tim Romano
1. Try discrete single-column indexes rather than multi-column composite
indexes.
2. Try  breaking the query down into subsets expressed as parenthetical
queries; you can treat these parenthetical queries as if they were tables by
assigning them an alias, and then you can join against the aliases.   I have
sped queries up in SQLite using this approach and, with a little tinkering,
the time can drop from over a minute to sub-second.   Performance will
depend on the indexes and criteria used, of course. But this approach lets
you see how SQLite is optimizing the creation of the component sets from
which you can build up your ultimate query.
.
select * from SONG
JOIN

( select  id_song from

(
select  id_song from PLAYLIST_SONG
where id_playlist=2
) as MYPLAYLISTSONGS

JOIN

(
select id_song from
SONG
where genre_id = 0 AND artist = 'Las ketchup'
AND title >= 'Asereje(karaoke version)'
) as MYSONGS

on MYSONGS.id_song = MYPLAYLISTSONGS.id_song


) as SONGIDLIST

on SONG.id_song = SONGIDLIST.id_song


Regards
Tim Romano






On Tue, May 11, 2010 at 6:07 AM, Andrea Galeazzi  wrote:

> Hi guys,
> I'm in a bind for a huge time consuming query!
> I made the following database schema:
>
> CREATE TABLE Song (
>idINTEGER NOT NULL UNIQUE,
>titleVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
>artistVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
>genre_idINT NOT NULL DEFAULT 0,
> PRIMARY KEY (id),
>
> CONSTRAINT fk_Genre FOREIGN KEY (genre_id)
>REFERENCES Genre (id)
>ON DELETE SET DEFAULT
>ON UPDATE CASCADE);
>
> CREATE INDEX Song_Browse_View_idx ON Song(genre_id,artist,title);
>
> CREATE TABLE PlayList (
>id INTEGER NOT NULL UNIQUE,
>name   VARCHAR(15) NOT NULL COLLATE NOCASE, --KORGCOLLATE,
>length INT NOT NULL DEFAULT 0,
>created_date   TEXT,
> PRIMARY KEY (id));
>
> CREATE TABLE PlayList_Song (
>id_song INT NOT NULL,
>id_playlist INT NOT NULL,
>song_number INTEGER NOT NULL,
> PRIMARY KEY (id_playlist, song_number),
> CONSTRAINT fk_PlayList_Song1 FOREIGN KEY (id_song)
>REFERENCES Song (id)
>ON DELETE CASCADE
>ON UPDATE CASCADE,
> CONSTRAINT fk_PlayList_Song2 FOREIGN KEY (id_playlist)
>REFERENCES PlayList (id)
>ON DELETE CASCADE
>ON UPDATE CASCADE);
>
> CREATE INDEX PlayList_Song_song_number_idx ON PlayList_Song(song_number);
>
> Now I need to scroll title filtered by genre_id and artist both in Song
> table and Playlist.
> The query for the first case is very fast:
> SELECT id AS number,title  FROM Song WHERE genre_id = 0 AND artist =
> 'Las ketchup'
> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
> version)' OR number > 258)
> ORDER BY title ASC , number ASC LIMIT 4;
>
> The second case is about 35 times slower... so the scrolling is quite
> impossible (or useless)!
> SELECT song_number AS number,title  FROM Song AS S, Playlist_Song AS PS
> WHERE S.id = PS.id_song AND
> PS.id_playlist = 2 AND genre_id = 0 AND artist = 'Las ketchup'
> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
> version)' OR number > 959)
> ORDER BY title ASC , number ASC LIMIT 4;
>
> I also execute the EXPLAIN QUERY PLAN:
> 1st query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
>
> 2nd query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
>  1 1  TABLE Playlist_Song AS PS
> So it seems that the second plan (1,1) requires very long time!
> How can I optimized a such kind of query?
> Cheers
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] join performance query

2010-05-11 Thread Simon Davies
On 11 May 2010 11:07, Andrea Galeazzi  wrote:
> Hi guys,
> I'm in a bind for a huge time consuming query!
.
.
.
> The second case is about 35 times slower... so the scrolling is quite
> impossible (or useless)!
> SELECT song_number AS number,title  FROM Song AS S, Playlist_Song AS PS
> WHERE S.id = PS.id_song AND
> PS.id_playlist = 2 AND genre_id = 0 AND artist = 'Las ketchup'
> AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke
> version)' OR number > 959)
> ORDER BY title ASC , number ASC LIMIT 4;
>
> I also execute the EXPLAIN QUERY PLAN:
> 1st query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
>
> 2nd query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
>                  1 1  TABLE Playlist_Song AS PS
> So it seems that the second plan (1,1) requires very long time!
> How can I optimized a such kind of query?

You can see that there is no index being used for looking up data on
table Playlist_Song. A good first step to improve performance is to
add an index that will be used for this query:
  create index playlistSong_id_song on Playlist_Song( id_song );
or
  create index playlistSong_id_playlist on Playlist_Song( id_playlist );

> Cheers
>

Regards,
Simon
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] join performance query

2010-05-11 Thread Andrea Galeazzi
Hi guys,
I'm in a bind for a huge time consuming query!
I made the following database schema:

CREATE TABLE Song (
idINTEGER NOT NULL UNIQUE,
titleVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
artistVARCHAR(40) NOT NULL DEFAULT '' COLLATE NOCASE,
genre_idINT NOT NULL DEFAULT 0,
PRIMARY KEY (id),

CONSTRAINT fk_Genre FOREIGN KEY (genre_id)
REFERENCES Genre (id)
ON DELETE SET DEFAULT
ON UPDATE CASCADE);

CREATE INDEX Song_Browse_View_idx ON Song(genre_id,artist,title);

CREATE TABLE PlayList (
id INTEGER NOT NULL UNIQUE,
name   VARCHAR(15) NOT NULL COLLATE NOCASE, --KORGCOLLATE,
length INT NOT NULL DEFAULT 0,
created_date   TEXT,
PRIMARY KEY (id));

CREATE TABLE PlayList_Song (
id_song INT NOT NULL,
id_playlist INT NOT NULL,
song_number INTEGER NOT NULL,
PRIMARY KEY (id_playlist, song_number),
CONSTRAINT fk_PlayList_Song1 FOREIGN KEY (id_song)
REFERENCES Song (id)
ON DELETE CASCADE
ON UPDATE CASCADE,
CONSTRAINT fk_PlayList_Song2 FOREIGN KEY (id_playlist)
REFERENCES PlayList (id)
ON DELETE CASCADE
ON UPDATE CASCADE);

CREATE INDEX PlayList_Song_song_number_idx ON PlayList_Song(song_number);

Now I need to scroll title filtered by genre_id and artist both in Song 
table and Playlist.
The query for the first case is very fast:
SELECT id AS number,title  FROM Song WHERE genre_id = 0 AND artist = 
'Las ketchup'
AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke 
version)' OR number > 258)
ORDER BY title ASC , number ASC LIMIT 4;

The second case is about 35 times slower... so the scrolling is quite 
impossible (or useless)!
SELECT song_number AS number,title  FROM Song AS S, Playlist_Song AS PS 
WHERE S.id = PS.id_song AND
PS.id_playlist = 2 AND genre_id = 0 AND artist = 'Las ketchup'
AND title >= 'Asereje(karaoke version)' AND (title > 'Asereje(karaoke 
version)' OR number > 959)
ORDER BY title ASC , number ASC LIMIT 4;

I also execute the EXPLAIN QUERY PLAN:
1st query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY

2nd query:  0 0 TABLE Song WITH INDEX Song_Browse_View_idx ORDER BY
  1 1  TABLE Playlist_Song AS PS
So it seems that the second plan (1,1) requires very long time!
How can I optimized a such kind of query?
Cheers

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-16 Thread Igor Tandetnik
Gert Cuykens 
wrote:
> select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0) onhand_qty
> from PRODUCTS t left join ORDERS o on t.pid = o.pid group by t.pid
>
> Works also thanks.
> So you do not believe the following has a performance penalty ?
>
> SELECT pid, txt, price, qty-coalesce((SELECT sum(qty) FROM orders
> WHERE orders.pid = products.pid),0) FROM products

I don't see why it should, but when in doubt, test and measure.

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-16 Thread Gert Cuykens
On Wed, Sep 16, 2009 at 9:50 PM, Igor Tandetnik  wrote:
> Gert Cuykens 
> wrote:
>> On Wed, Sep 16, 2009 at 8:26 PM, Igor Tandetnik
>>  wrote:
>>> Gert Cuykens 
>>> wrote:
 On Wed, Sep 16, 2009 at 1:35 AM, Igor Tandetnik
  wrote:
>>>
> Perhaps your query could be a bit clearer when written this way:
>
> select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0)
> onhand_qty from PRODUCTS t left join ORDERS o on t.pid = o.pid

 This does not show me the new products that are not ordered yet
>>>
>>> Are you sure? I don't see why it wouldn't.
>>
>> select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0) onhand_qty
>> from PRODUCTS t left join ORDERS o on t.pid = o.pid
>>
>> shows 1 product because I only have 1 order with that product
>
> This query is different from the one I posted. Mine had "where t.pid =
> ?;", yours doesn't. If you want a report for all products, add the
> following clause:
>
> group by t.pid;
>
>> SELECT t.pid,
>>         t.txt,
>>         t.price,
>>         t.qty - IFNULL(qs.qty_sold, 0) "onhand_qty"
>>    FROM PRODUCTS t
>> LEFT JOIN (SELECT o.pid,
>>                 SUM(o.qty) "qty_sold"
>>            FROM ORDERS o) qs ON qs."o.pid" = t.pid
>>
>> shows all products
>
> Compare this statement with the one you mentioned in your original post.
> Lacking mind-reading abilities, I was trying to help you with the
> statement you actually asked about, not the one you were thinking about.

select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0) onhand_qty
from PRODUCTS t left join ORDERS o on t.pid = o.pid group by t.pid

Works also thanks.
So you do not believe the following has a performance penalty ?

SELECT pid, txt, price, qty-coalesce((SELECT sum(qty) FROM orders
WHERE orders.pid = products.pid),0) FROM products
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-16 Thread Igor Tandetnik
Gert Cuykens 
wrote:
> On Wed, Sep 16, 2009 at 8:26 PM, Igor Tandetnik
>  wrote:
>> Gert Cuykens 
>> wrote:
>>> On Wed, Sep 16, 2009 at 1:35 AM, Igor Tandetnik
>>>  wrote:
>>
 Perhaps your query could be a bit clearer when written this way:

 select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0)
 onhand_qty from PRODUCTS t left join ORDERS o on t.pid = o.pid
>>>
>>> This does not show me the new products that are not ordered yet
>>
>> Are you sure? I don't see why it wouldn't.
>
> select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0) onhand_qty
> from PRODUCTS t left join ORDERS o on t.pid = o.pid
>
> shows 1 product because I only have 1 order with that product

This query is different from the one I posted. Mine had "where t.pid = 
?;", yours doesn't. If you want a report for all products, add the 
following clause:

group by t.pid;

> SELECT t.pid,
> t.txt,
> t.price,
> t.qty - IFNULL(qs.qty_sold, 0) "onhand_qty"
>FROM PRODUCTS t
> LEFT JOIN (SELECT o.pid,
> SUM(o.qty) "qty_sold"
>FROM ORDERS o) qs ON qs."o.pid" = t.pid
>
> shows all products

Compare this statement with the one you mentioned in your original post. 
Lacking mind-reading abilities, I was trying to help you with the 
statement you actually asked about, not the one you were thinking about.

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-16 Thread Gert Cuykens
On Wed, Sep 16, 2009 at 8:26 PM, Igor Tandetnik  wrote:
> Gert Cuykens 
> wrote:
>> On Wed, Sep 16, 2009 at 1:35 AM, Igor Tandetnik 
>> wrote:
>
>>> Perhaps your query could be a bit clearer when written this way:
>>>
>>> select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0)
>>> onhand_qty from PRODUCTS t left join ORDERS o on t.pid = o.pid
>>
>> This does not show me the new products that are not ordered yet
>
> Are you sure? I don't see why it wouldn't.

select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0) onhand_qty
from PRODUCTS t left join ORDERS o on t.pid = o.pid

shows 1 product because I only have 1 order with that product

SELECT t.pid,
 t.txt,
 t.price,
 t.qty - IFNULL(qs.qty_sold, 0) "onhand_qty"
FROM PRODUCTS t
LEFT JOIN (SELECT o.pid,
 SUM(o.qty) "qty_sold"
FROM ORDERS o) qs ON qs."o.pid" = t.pid

shows all products

>>> or this way
>>>
>>> select t.pid, t.txt, t.price, t.qty - IFNULL(
>>> (select sum(o.qty) from ORDERS o where t.pid = o.pid), 0) onhand_qty
>>> from PRODUCTS t where t.pid = ?;
>>>
>>
>> I learned that this would be a performance issue doing it like that.
>>
>> http://stackoverflow.com/questions/1417889/sqlite3-get-product-onhand-quantity
>
> You "learned"? Do you mean, you measured and discovered this to be the
> case? The answer in that thread you are basing this claim on is largely
> nonsense, in my humble opinion. For one thing, your query only returns
> one row, so running a subselect "for every row returned" means running
> it once. For another, how does the poster believe joins are calculated -
> black magic?
>

The where t.pid=? should have been removed from my original question,
so it show the complete list of products.
I did not measured it, it sounded logic.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-16 Thread Igor Tandetnik
Gert Cuykens 
wrote:
> On Wed, Sep 16, 2009 at 1:35 AM, Igor Tandetnik 
> wrote:

>> Perhaps your query could be a bit clearer when written this way:
>>
>> select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0)
>> onhand_qty from PRODUCTS t left join ORDERS o on t.pid = o.pid
>
> This does not show me the new products that are not ordered yet

Are you sure? I don't see why it wouldn't.

>> or this way
>>
>> select t.pid, t.txt, t.price, t.qty - IFNULL(
>> (select sum(o.qty) from ORDERS o where t.pid = o.pid), 0) onhand_qty
>> from PRODUCTS t where t.pid = ?;
>>
>
> I learned that this would be a performance issue doing it like that.
>
> http://stackoverflow.com/questions/1417889/sqlite3-get-product-onhand-quantity

You "learned"? Do you mean, you measured and discovered this to be the 
case? The answer in that thread you are basing this claim on is largely 
nonsense, in my humble opinion. For one thing, your query only returns 
one row, so running a subselect "for every row returned" means running 
it once. For another, how does the poster believe joins are calculated - 
black magic?

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-16 Thread Gert Cuykens
On Wed, Sep 16, 2009 at 1:35 AM, Igor Tandetnik  wrote:
> Gert Cuykens 
> wrote:
>>   SELECT t.pid,
>>          t.txt,
>>          t.price,
>>          t.qty - IFNULL(qs.qty_sold, 0) 'onhand_qty'
>>     FROM PRODUCTS t
>> LEFT JOIN (SELECT o.pid,
>>                  SUM(o.qty) 'qty_sold'
>>             FROM ORDERS o) qs ON qs.pid = t.pid
>>    WHERE t.pid = ?
>>
>> i have trouble running this statement on sqlite3
>> It tels me qs.pid does not exist
>
> And indeed it doesn't. But there is a column named qs."o.pid". You may
> want to assign an alias to this column, just as you did with qs.qty_sold
>
>> when i rename it to pid it subtracts
>> the wrong values
>
> Because now it refers to t.pid, and the condition (t.pid = t.pid) is
> always true.

qs."o.pid" = t.pid works thanks

> Perhaps your query could be a bit clearer when written this way:
>
> select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0) onhand_qty
> from PRODUCTS t left join ORDERS o on t.pid = o.pid

This does not show me the new products that are not ordered yet

> or this way
>
> select t.pid, t.txt, t.price, t.qty - IFNULL(
>    (select sum(o.qty) from ORDERS o where t.pid = o.pid), 0) onhand_qty
> from PRODUCTS t where t.pid = ?;
>

I learned that this would be a performance issue doing it like that.

http://stackoverflow.com/questions/1417889/sqlite3-get-product-onhand-quantity
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-15 Thread Igor Tandetnik
Gert Cuykens 
wrote:
>   SELECT t.pid,
>  t.txt,
>  t.price,
>  t.qty - IFNULL(qs.qty_sold, 0) 'onhand_qty'
> FROM PRODUCTS t
> LEFT JOIN (SELECT o.pid,
>  SUM(o.qty) 'qty_sold'
> FROM ORDERS o) qs ON qs.pid = t.pid
>WHERE t.pid = ?
>
> i have trouble running this statement on sqlite3
> It tels me qs.pid does not exist

And indeed it doesn't. But there is a column named qs."o.pid". You may 
want to assign an alias to this column, just as you did with qs.qty_sold

> when i rename it to pid it subtracts
> the wrong values

Because now it refers to t.pid, and the condition (t.pid = t.pid) is 
always true.

Perhaps your query could be a bit clearer when written this way:

select t.pid, t.txt, t.price, t.qty - IFNULL(sum(o.qty), 0) onhand_qty
from PRODUCTS t left join ORDERS o on t.pid = o.pid
where t.pid = ?;

or this way

select t.pid, t.txt, t.price, t.qty - IFNULL(
(select sum(o.qty) from ORDERS o where t.pid = o.pid), 0) onhand_qty
from PRODUCTS t where t.pid = ?;

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-15 Thread Simon Davies
009/9/15 Gert Cuykens :
>   SELECT t.pid,
>          t.txt,
>          t.price,
>          t.qty - IFNULL(qs.qty_sold, 0) 'onhand_qty'
>     FROM PRODUCTS t
> LEFT JOIN (SELECT o.pid,
>                  SUM(o.qty) 'qty_sold'
>             FROM ORDERS o) qs ON qs.pid = t.pid
>    WHERE t.pid = ?
>
> i have trouble running this statement on sqlite3
> It tels me qs.pid does not exist, when i rename it to pid it subtracts
> the wrong values when ORDERS contains one record and PRODUCTS more the
> one?

It's late, so I am not putting in the time to generate test data in
your schema, but you should probably ensure that your result column in
your result table is aliased properly:

  SELECT t.pid,
 t.txt,
 t.price,
 t.qty - IFNULL(qs.qty_sold, 0) 'onhand_qty'
FROM PRODUCTS t
LEFT JOIN (SELECT o.pid as pid, -- alias this column
 SUM(o.qty) 'qty_sold'
FROM ORDERS o) qs ON qs.pid = t.pid
   WHERE t.pid = ?;

it certainly resolves error messages about "qs.pid does not exist".

>
>
> CREATE TABLE PRODUCTS (
>    pid     INTEGER PRIMARY KEY,
>    txt     VARCHAR(64),
>    price   BIGINT UNSIGNED,
>    qty     BIGINT
> );
>
> CREATE TABLE ORDERS (
>    oid     INTEGER,
>    pid     BIGINT UNSIGNED,
>    qty     BIGINT UNSIGNED,
>    time   DATETIME,
>    PRIMARY KEY(oid,pid)
> );

You should also be aware that single quotes are delimiters for
literals, not for identifiers. For identifiers use double quotes.
(SQLite is tolerant of the usage in the case above).

Regards,
Simon
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN (SELECT o.pid, SUM(o.qty) 'qty_sold' FROM ORDERS o) qs

2009-09-15 Thread Gert Cuykens
   SELECT t.pid,
  t.txt,
  t.price,
  t.qty - IFNULL(qs.qty_sold, 0) 'onhand_qty'
 FROM PRODUCTS t
LEFT JOIN (SELECT o.pid,
  SUM(o.qty) 'qty_sold'
 FROM ORDERS o) qs ON qs.pid = t.pid
WHERE t.pid = ?

i have trouble running this statement on sqlite3
It tels me qs.pid does not exist, when i rename it to pid it subtracts
the wrong values when ORDERS contains one record and PRODUCTS more the
one?


CREATE TABLE PRODUCTS (
pid INTEGER PRIMARY KEY,
txt VARCHAR(64),
price   BIGINT UNSIGNED,
qty BIGINT
);

CREATE TABLE ORDERS (
oid INTEGER,
pid BIGINT UNSIGNED,
qty BIGINT UNSIGNED,
time   DATETIME,
PRIMARY KEY(oid,pid)
);
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-06-01 Thread BardzoTajneKonto

> Do other SQL database engines not have this same limitation?" Are MySQL 
> and PostgreSQL and Firebird and MS-SQL and Oracle creating phantom 
> indices on-the-fly to help them do joins faster, for example?" Or do 
> their optimizers do a better job of finding ways to use indices in a 
> join?" Can somebody supply me with specific examples of joins that other 
> database engines do efficiently but that SQLite does slowly?"
 
Acoording to SQLite wiki other databases do better job without indices:
"
Test 6: INNER JOIN without an index
SELECT t1.a FROM t1 INNER JOIN t2 ON t1.b=t2.b; 
SQLite 3.3.3 (sync):14.473 
SQLite 3.3.3 (nosync):  14.445 
SQLite 2.8.17 (sync):   47.776 
SQLite 2.8.17 (nosync): 47.750 
PostgreSQL 8.1.2:   0.176 
MySQL 5.0.18 (sync):3.421 
MySQL 5.0.18 (nosync):  3.443 
FirebirdSQL 1.5.2:  0.141 
"

> Is join 
> efficiency really a frustration to many SQLite users?

Generally not, however the behaviour could be more user friendly. The way is I 
use SQLite is probably not common becase I don't write queries - apllication's 
users write them. I also deal with quite large data. The biggest problem with 
the way joins work is with subqueries. If flattening cannot be done the query 
runs slow. For example I was told by an user that joins on views are really 
slow (on large data it means that doesn't work at all).
The are other minor problems:
1. Creating indices on every (possibly very large) table makes database file 
much bigger that it would be if SQLite used temporary indices created before 
query is run.
2. Database users need to know how exaclty how SQLite work. That is not problem 
if programmer write queries, but can be a problem if database is used by a 
mathematician who doesn't really care about it and simply wants to do some 
calculations.

--
Chcesz miec nawigacje GPS ?
Zamow lub przedluz umowe na neostrade, a nawigacja bedzie Twoja.
Kliknij na link po szczegoly! http://link.interia.pl/f219a


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-31 Thread Thomas Briggs
   As others have already mentioned, hash joins can help in a
situation where there are no appropriate indexes.  They can make
things worse if the inputs aren't large enough though, so there's
still some gray area.

   The biggest thing that other databases have going for them - MSSQL
and Oracle at least - is parallelism.  If you've got 8 or 16 or 32
threads available to you, and plenty of RAM to boot, it's often faster
to ignore the indexes and either hash join or nested loop join subsets
of the affected tables.  Thus situations where there are no indexes
seem better too, and SQLite can look bad in comparison.  'tis the
price paid for being a zero-config embedded database vs. a full-blown
client/server database system, that's all.

   -T

On Sat, May 30, 2009 at 11:11 AM, D. Richard Hipp  wrote:
> There has been a recent flurry of comments about SQLite at
>
>     http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
>     http://news.ycombinator.com/item?id=633151
>
> One of the criticisms of SQLite is that it is slow to do joins.  That
> is true if SQLite is unable to figure out how to use an index to speed
> the join.  I was under the impression that SQLite actually did a
> fairly reasonable job of making use of indices, if they exist.  But
> without indices, an k-way join takes time proportional to N^k.
>
> Do other SQL database engines not have this same limitation?  Are
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating
> phantom indices on-the-fly to help them do joins faster, for example?
> Or do their optimizers do a better job of finding ways to use indices
> in a join?  Can somebody supply me with specific examples of joins
> that other database engines do efficiently but that SQLite does
> slowly?  Is join efficiency really a frustration to many SQLite users?
>
> Curiously, in some of our own internal tests, SQLite is much, much
> faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is
> large - greater than 20 or 30.  (SQLite can handle up to a 64-way
> join.)  This is because SQLite uses a O(k*k) greedy algorithm for
> selecting the ordering of tables in the join whereas the other guys
> all do a much more extensive search.  So the performance loss in the
> other engines is due to the excessive time spent in the query planner,
> not the time actually running the query.  SQLite can plan a 64-way
> join in the blink of an eye, whereas PostgreSQL requires several
> minutes.
>
> But for the tests described in the previous paragraph, there were
> always good indices so that the time to actually run the join was
> approximately linear.  What about situations where you have a 4- or 5-
> way join on tables that are not indexed?  Do other database engines
> handle those more efficiently than SQLite somehow?  Is this something
> we need to look into?
>
> D. Richard Hipp
> d...@hwaci.com
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-31 Thread Florian Weimer
* D. Richard Hipp:

> One of the criticisms of SQLite is that it is slow to do joins.  That  
> is true if SQLite is unable to figure out how to use an index to speed  
> the join.  I was under the impression that SQLite actually did a  
> fairly reasonable job of making use of indices, if they exist.  But  
> without indices, an k-way join takes time proportional to N^k.
>
> Do other SQL database engines not have this same limitation?  Are  
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating  
> phantom indices on-the-fly to help them do joins faster, for example?   

PostgreSQL roughly does one of the following (when dealing with a
two-way join):

  * If one side of the join is estimated to be a small set, PostgreSQL
performs a sequential scan on it, hashes it, and joins the other
table in a hash join.

  * If both sides are large, each side is sorted, and a merge join is
performed.

Things go horribly wrong if the estimates are off and the wrong plan
is picked.

There's also a nested loop join (which would be what SQLite does), but
I haven't seen it in recent version.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Nicolas Williams
On Sat, May 30, 2009 at 07:01:31PM +0100, Simon Slavin wrote:
> I'm interested in how sqlite works differently to the SQL systems  
> which keep a daemon running as a background task.  One of the  
> advantages of having a daemon which persists between runs of an  
> application is that the daemon can keep its own list of ORDERs, and  
> JOINs which are asked for frequently, and decide to maintain them even  
> when no SQL-using application is running.  [...]

You don't need a daemon to do that.  One could use a special table in
the database itself (much like the master table) to keep statistics
about all sorts of things.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Jim Wilcoxson
SQLite has surprised me with its quick performance, not the other way
around.  In fact, I've implemented all kinds of lookup queries that I
knew could be optimized by caching results so I didn't have to keep
repeating the SQL query, but the performance was so good even
repeating the queries that I never bothered with the caching.

I'm sure there are queries that SQLite doesn't run as fast as database
product X, and I'm sure it goes the other way too.  It's a balancing
act, and as the primary developer, you have to choose for us what's
important to optimize and what isn't.

So far, I'm very happy with the choices and trade-offs that have been
made in SQLite. :-)

Jim

On 5/30/09, Simon Slavin  wrote:
> I'm interested in how sqlite works differently to the SQL systems
> which keep a daemon running as a background task.  One of the
> advantages of having a daemon which persists between runs of an
> application is that the daemon can keep its own list of ORDERs, and
> JOINs which are asked for frequently, and decide to maintain them even
> when no SQL-using application is running.  This can give the
> impression that something is being done very quickly, when in fact the
> majority of the time was taken during a previous run of the
> application.  It can be particularly hard to figure out what a
> performance test means under these circumstances.
>
> But the problem is that I like the way sqlite works.  I like the tiny
> library, I like the way that the SQL library is entirely inside my
> application, and any CPU load is mine.  I like knowing that when my
> app quits, nothing is going on.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Software first.  Software lasts!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Simon Slavin
I'm interested in how sqlite works differently to the SQL systems  
which keep a daemon running as a background task.  One of the  
advantages of having a daemon which persists between runs of an  
application is that the daemon can keep its own list of ORDERs, and  
JOINs which are asked for frequently, and decide to maintain them even  
when no SQL-using application is running.  This can give the  
impression that something is being done very quickly, when in fact the  
majority of the time was taken during a previous run of the  
application.  It can be particularly hard to figure out what a  
performance test means under these circumstances.

But the problem is that I like the way sqlite works.  I like the tiny  
library, I like the way that the SQL library is entirely inside my  
application, and any CPU load is mine.  I like knowing that when my  
app quits, nothing is going on.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread John Elrick
D. Richard Hipp wrote:
> There has been a recent flurry of comments about SQLite at
>
>  http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
>  http://news.ycombinator.com/item?id=633151
>
> One of the criticisms of SQLite is that it is slow to do joins.  That  
> is true if SQLite is unable to figure out how to use an index to speed  
> the join.  I was under the impression that SQLite actually did a  
> fairly reasonable job of making use of indices, if they exist.  But  
> without indices, an k-way join takes time proportional to N^k.
>   

We're finishing a system which produces auto generated queries where k 
can potentially range from 1 to 16.  The only times I have seen Sqlite 
slowdown were when the indexes needed tweaking.

I noticed one of the comments made was that a three inner join query 
took 10 minutes and that the joins should have filtered the table, 
however, I did not notice any example, contrived or obfuscated, which 
would demonstrate the issue.  This absence means we must rely upon the 
author's interpretation of what the query did being accurate.  The 
author makes the assertion that each progressive inner join should make 
the search table smaller, however, just because that was the intent does 
not make it the reality.  The fact that temp tables with no apparent 
indexing ran faster makes me question whether or not the initial 
assumption was true.

I've been wrong far too often about the actual vs theoretical of my code 
in operation to accept anything such as this at face value; however, 
that caveat would apply from both directions.  The author said he 
reported it as a bug, which implies he presented a repeatable test 
case.  If that is so, that specific test case might merit further 
examination.

FWIW


John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Mark Hamburg
Assuming memory is sufficiently inexpensive, I would think that it  
would almost always be useful to build an index for any field in a  
join rather than doing a full scan. (Or better yet, build a hash table  
if memory is sufficient.) Indices maintained in the database then  
become optimizations to avoid starting the query with an index build.

Mark

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Pavel Ivanov
> Do other SQL database engines not have this same limitation?  Are
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating
> phantom indices on-the-fly to help them do joins faster, for example?

Sort of. There's 2 types of join methods in Oracle for this - Hash
joins and Sort merge joins - when server creates in memory (or in
temporary storage in general) sorted data for one or both merging data
sets and then merges these sets. You can read about it here:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i51523.
Not sure though if it's worth to implement such technique in SQLite.

Pavel

On Sat, May 30, 2009 at 11:11 AM, D. Richard Hipp  wrote:
> There has been a recent flurry of comments about SQLite at
>
>     http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
>     http://news.ycombinator.com/item?id=633151
>
> One of the criticisms of SQLite is that it is slow to do joins.  That
> is true if SQLite is unable to figure out how to use an index to speed
> the join.  I was under the impression that SQLite actually did a
> fairly reasonable job of making use of indices, if they exist.  But
> without indices, an k-way join takes time proportional to N^k.
>
> Do other SQL database engines not have this same limitation?  Are
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating
> phantom indices on-the-fly to help them do joins faster, for example?
> Or do their optimizers do a better job of finding ways to use indices
> in a join?  Can somebody supply me with specific examples of joins
> that other database engines do efficiently but that SQLite does
> slowly?  Is join efficiency really a frustration to many SQLite users?
>
> Curiously, in some of our own internal tests, SQLite is much, much
> faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is
> large - greater than 20 or 30.  (SQLite can handle up to a 64-way
> join.)  This is because SQLite uses a O(k*k) greedy algorithm for
> selecting the ordering of tables in the join whereas the other guys
> all do a much more extensive search.  So the performance loss in the
> other engines is due to the excessive time spent in the query planner,
> not the time actually running the query.  SQLite can plan a 64-way
> join in the blink of an eye, whereas PostgreSQL requires several
> minutes.
>
> But for the tests described in the previous paragraph, there were
> always good indices so that the time to actually run the join was
> approximately linear.  What about situations where you have a 4- or 5-
> way join on tables that are not indexed?  Do other database engines
> handle those more efficiently than SQLite somehow?  Is this something
> we need to look into?
>
> D. Richard Hipp
> d...@hwaci.com
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join performance in SQLite

2009-05-30 Thread D. Richard Hipp
There has been a recent flurry of comments about SQLite at

 http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
 http://news.ycombinator.com/item?id=633151

One of the criticisms of SQLite is that it is slow to do joins.  That  
is true if SQLite is unable to figure out how to use an index to speed  
the join.  I was under the impression that SQLite actually did a  
fairly reasonable job of making use of indices, if they exist.  But  
without indices, an k-way join takes time proportional to N^k.

Do other SQL database engines not have this same limitation?  Are  
MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating  
phantom indices on-the-fly to help them do joins faster, for example?   
Or do their optimizers do a better job of finding ways to use indices  
in a join?  Can somebody supply me with specific examples of joins  
that other database engines do efficiently but that SQLite does  
slowly?  Is join efficiency really a frustration to many SQLite users?

Curiously, in some of our own internal tests, SQLite is much, much  
faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is  
large - greater than 20 or 30.  (SQLite can handle up to a 64-way  
join.)  This is because SQLite uses a O(k*k) greedy algorithm for  
selecting the ordering of tables in the join whereas the other guys  
all do a much more extensive search.  So the performance loss in the  
other engines is due to the excessive time spent in the query planner,  
not the time actually running the query.  SQLite can plan a 64-way  
join in the blink of an eye, whereas PostgreSQL requires several  
minutes.

But for the tests described in the previous paragraph, there were  
always good indices so that the time to actually run the join was  
approximately linear.  What about situations where you have a 4- or 5- 
way join on tables that are not indexed?  Do other database engines  
handle those more efficiently than SQLite somehow?  Is this something  
we need to look into?

D. Richard Hipp
d...@hwaci.com



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join Me at Multiply

2008-12-17 Thread Alexander Yap (via Multiply)

Check out my Multiply site

I set up a Multiply site with my pictures, videos and blog and I want 
to add you as my friend so you can see it. First, you need to join 
Multiply! Once you join, you can also create your own site and share 
anything you want, with anyone you want.

Here's the link:
http://multiply.com/si/RkSMR++16o5puCMnUaUn,g

Thanks,
Alexander







Stop e-mails, view our privacy policy, or report abuse: 
http://multiply.com/bl/RkSMR++16o5puCMnUaUn,g
We haven't added your email address to any lists, nor will we share it
with anyone at any time.
Copyright 2008 Multiply Inc.
6001 Park of Commerce, Boca Raton, FL 33487, USA

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN problem (works in postgres)

2008-11-04 Thread Igor Tandetnik
Ludvig Strigeus <[EMAIL PROTECTED]>
wrote:
> The following query does not work in Sqlite:
>
> SELECT i.user, ia.key, ia.value
>   FROM invite AS i
>   JOIN (invite AS j JOIN users AS u ON j.user = u.id AND
> u.canonical_username='ludde') ON i.parent = j.id
>   LEFT JOIN inviteattr as ia ON ia.invite = i.id;
>
> It complains about "no such column: j.id". Are the 'as' clauses not
> accessible from outside the () in sqlite?

No.

But why do you need those parentheses? Why not just

SELECT i.user, ia.key, ia.value
   FROM invite AS i JOIN invite AS j ON i.parent = j.id
   JOIN users AS u ON (j.user = u.id AND u.canonical_username='ludde')
   LEFT JOIN inviteattr as ia ON ia.invite = i.id;

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN problem (works in postgres)

2008-11-04 Thread Ludvig Strigeus
Hello,

The following query does not work in Sqlite:

SELECT i.user, ia.key, ia.value
   FROM invite AS i
   JOIN (invite AS j JOIN users AS u ON j.user = u.id AND
u.canonical_username='ludde') ON i.parent = j.id
   LEFT JOIN inviteattr as ia ON ia.invite = i.id;

It complains about "no such column: j.id". Are the 'as' clauses not
accessible from outside the () in sqlite?

Is this a bug in sqlite or in my query?, it works fine in postgres.

/Ludde
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN works very strange [3.6.2]

2008-09-20 Thread Jay A. Kreibich
On Sat, Sep 20, 2008 at 08:45:16AM +0400, Alexander Batyrshin scratched on the 
wall:
>  Hello everyone,
> 
> I gets strange result from this query on SQLite-3.6.2
> 
> SELECT
> town.id, town_log.new_player_id, player.name
> FROM
> town_log
> LEFT JOIN town
> LEFT JOIN player
> ON
> town.id = town_log.town_id AND town_log.new_player_id = player.id
> WHERE
> town_log.id = 5195

  "ON" is part of a JOIN operation.  You have two JOINs but only one ON,
  and it is only getting applied to the second JOIN.

  I think you want something closer to this:
  
...
FROM
town_log 
LEFT JOIN town ON town.id = town_log.town_id
LEFT JOIN player ON town_log.new_player_id = player.id
WHERE
town_log.id = 5195



  -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN works very strange [3.6.2]

2008-09-19 Thread Alexander Batyrshin
I just want to add that this SQL query works great at 3.5.4.
And this is explain:

addr  opcode p1p2p3p4 p5  comment
  -        -  --  -
0 Trace  0 0 0 EXPLAIN SELECT
town.id, town_log.new_player_id, player.name
FROM
town_log
LEFT JOIN town
LEFT JOIN player
ON
town.id = town_log.town_id AND town_log.new_player_id = player.id
WHERE
town_log.id = 5195;  00
1 Integer5195  1 000
2 Goto   0 46000
3 SetNumColumns  0 5 000
4 OpenRead   0 14000
5 SetNumColumns  0 1 000
6 OpenRead   1 6 000
7 SetNumColumns  0 2 000
8 OpenRead   2 2 000
9 SetNumColumns  0 2 000
10OpenRead   3 3 0 keyinfo(1,BINARY)  00
11MustBeInt  1 41000
12NotExists  0 41100
13Integer0 3 000
14Rewind 1 38000
15Integer1 3 000
16Integer0 4 000
17Column 0 4 600
18IsNull 6 33000
19Affinity   6 1 0 db 00
20MoveGe 3 336 1  00
21IdxGE  3 336 1  01
22IdxRowid   3 10000
23MoveGe 2 0 10   00
24Column 1 0 10   00
25Column 0 1 11   00
26Ne 113210collseq(BINARY)  6b
27Integer1 4 000
28Column 1 0 12   00
29Column 0 4 13   00
30Column 2 1 14   00
31ResultRow  123 000
32Next   3 21000
33IfPos  4 37000
34NullRow2 0 000
35NullRow3 0 000
36Goto   0 27000
37Next   1 15000
38IfPos  3 41000
39NullRow1 0 000
40Goto   0 15000
41Close  0 0 000
42Close  1 0 000
43Close  2 0 000
44Close  3 0 000
45Halt   0 0 000
46Transaction0 0 000
47VerifyCookie   0 92000
48TableLock  0 140 town_log   00
49TableLock  0 6 0 town   00
50TableLock  0 2 0 player 00
51Goto   0 3 000


PS:
And the last one. I want to say that generally SQLite-3.6.3 is slowly
than 3.5.4 on my OS X 10.5.5.
My result is that 3.6.3 is slowly near 5-7%.

--
Alexander Batyrshin aka bash
bash = Biomechanica Artificial Sabotage Humanoid



On Sat, Sep 20, 2008 at 8:45 AM, Alexander Batyrshin <[EMAIL PROTECTED]> wrote:
>  Hello everyone,
>
> I gets strange result from this query on SQLite-3.6.2
>
> SELECT
>town.id, town_log.new_player_id, player.name
> FROM
>town_log
>LEFT JOIN town
>LEFT JOIN player
> ON
>town.id = town_log.town_id AND town_log.new_player_id = player.id
> WHERE
>town_log.id = 5195
>
> As you can see this query should return only 1 row, because
> town_log.id is unique key.
> But I gets rows with different town.id (1 col), with constant
> new_player.id (2 col) and JOIN do not work for player.name, couse 3
> col is empty.
>
> result:
> []
> 45512   9266
> 44544   9266
> 45229   9266
> 46376   9266
> 45927   9266
> 46645   9266
> []
>
>
> Any ideas what's going on?
>
> --
> Alexander Batyrshin aka bash
> bash = Biomechanica Artificial Sabotage Humanoid
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JOIN works very strange [3.6.2]

2008-09-19 Thread Alexander Batyrshin
 Hello everyone,

I gets strange result from this query on SQLite-3.6.2

SELECT
town.id, town_log.new_player_id, player.name
FROM
town_log
LEFT JOIN town
LEFT JOIN player
ON
town.id = town_log.town_id AND town_log.new_player_id = player.id
WHERE
town_log.id = 5195

As you can see this query should return only 1 row, because
town_log.id is unique key.
But I gets rows with different town.id (1 col), with constant
new_player.id (2 col) and JOIN do not work for player.name, couse 3
col is empty.

result:
[]
45512   9266
44544   9266
45229   9266
46376   9266
45927   9266
46645   9266
[]


Any ideas what's going on?

--
Alexander Batyrshin aka bash
bash = Biomechanica Artificial Sabotage Humanoid
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join criteria referencing case select result in 3.6.2 produces different results from 3.4.2

2008-09-17 Thread Simon Davies
2008/9/17 Dan <[EMAIL PROTECTED]>:
>
>
>>
>> On 3.4.2 we get:
>> 3.0|3660.5|3
>> 6.0|1360.3|6
>>
>> On 3.6.2 we get:
>> |5020.8|3
>>
>
> 3.6.2 has a bug involving DISTINCT or GROUP BY queries that use
> expression aliases (AS clauses) in the select-list. Problem is fixed
> in cvs:
>
>   http://www.sqlite.org/cvstrac/chngview?cn=5712
>
> Dan.
>

Thank you
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join criteria referencing case select result in 3.6.2 produces different results from 3.4.2

2008-09-17 Thread Dan


> Hi All,
> We have been using SQLite 3.4.2 for some time. On investigating
> upgrading to 3.6.2, we found that different results were produced for
> one query.
> The following illustrates:
>

...

>
> On 3.4.2 we get:
> 3.0|3660.5|3
> 6.0|1360.3|6
>
> On 3.6.2 we get:
> |5020.8|3
>

3.6.2 has a bug involving DISTINCT or GROUP BY queries that use
expression aliases (AS clauses) in the select-list. Problem is fixed
in cvs:

   http://www.sqlite.org/cvstrac/chngview?cn=5712

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join criteria referencing case select result in 3.6.2 produces different results from 3.4.2

2008-09-17 Thread Simon Davies
Hi All,
We have been using SQLite 3.4.2 for some time. On investigating
upgrading to 3.6.2, we found that different results were produced for
one query.
The following illustrates:

CREATE TABLE tst1( tst1Id INTEGER, width REAL, thickness REAL );
CREATE TABLE tst2( tst2Id INTEGER, tst3Id INTEGER, setType INTEGER,
length REAL, number INTEGER );
CREATE TABLE tst3( tst3Id INTEGER, setIdA INTEGER, setIdB INTEGER,
setIdC INTEGER, setIdD INTEGER );

INSERT INTO tst1 VALUES( 1, 1.0, 1.1 );
INSERT INTO tst1 VALUES( 2, 2.0, 2.1 );
INSERT INTO tst1 VALUES( 3, 3.0, 3.1 );
INSERT INTO tst1 VALUES( 4, 4.0, 4.1 );
INSERT INTO tst1 VALUES( 5, 5.0, 5.1 );
INSERT INTO tst1 VALUES( 6, 6.0, 6.1 );
INSERT INTO tst1 VALUES( 9, 7.0, 7.1 );

INSERT INTO tst2 VALUES( 1, 1,  1, 101.1, 1 );
INSERT INTO tst2 VALUES( 2, 1, 2, 102.1, 2 );
INSERT INTO tst2 VALUES( 4, 3, 4, 104.1, 2 );
INSERT INTO tst2 VALUES( 5, 4, 3, 105.1, 1 );
INSERT INTO tst2 VALUES( 6, 4, 1, 106.1, 6 );

INSERT INTO tst3 VALUES( 1, 1, 1, 2, 1 );
INSERT INTO tst3 VALUES( 2, 6, 2, 9, 4 );
INSERT INTO tst3 VALUES( 3, 2, 1, 3, 6 );
INSERT INTO tst3 VALUES( 4, 3, 5, 9, 9 );

SELECT
tst1.width,
SUM( tst2.length * tst2.number ) AS totLength,
CASE tst2.setType
WHEN 1 THEN tst3.setIdA
WHEN 2 THEN tst3.setIdB
WHEN 3 THEN tst3.setIdC
ELSE tst3.setIdA
END AS theSetId
FROM
tst2 LEFT OUTER JOIN
tst3 ON tst2.tst3Id=tst2.tst3Id LEFT OUTER JOIN
tst1 ON tst1.tst1Id=theSetId
GROUP BY
tst1.width>4;

On 3.4.2 we get:
3.0|3660.5|3
6.0|1360.3|6

On 3.6.2 we get:
|5020.8|3

We have determined that the query above can be rewritten as:
SELECT
tst1.width,
SUM( tst2.length * tst2.number ) AS totLength
FROM
tst2 LEFT OUTER JOIN
tst3 ON tst2.tst3Id=tst2.tst3Id LEFT OUTER JOIN
tst1 ON tst1.tst1Id=
CASE tst2.setType
WHEN 1 THEN tst3.setIdA
WHEN 2 THEN tst3.setIdB
WHEN 3 THEN tst3.setIdC
ELSE tst3.setIdA
END
GROUP BY
tst1.width>4;

and both versions then give the same result.
However, the SQL from which this is adapted joins several more tables
on the 'theSetId' value, and we wanted only one place to perform
updates.

Anyway, is the behaviour in 3.6.2 correct, or was the behaviour in
3.4.2 incorrect?

Thanking you in advance,
Simon
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join Syntax Questions

2008-02-29 Thread Dennis Cote
Mitchell Vincent wrote:
> I could swear I've done this type of thing before and am sure I'm
> overlooking something simple.
> 
> Is this correct syntax?
> 
> SELECT im.invoice_date as invoice_date,im.pay_by as
> due_date,im.invoice_id as invoice_id, im.invoice_number as
> invoice_number,im.invoice_date as created,im.status as status,
> im.next_invoice as next_invoice, im.tax as tax,im.tax2 as
> tax2,im.subtotal as subtotal,im.total as total,im.balance_due as
> balance_due, im.customer_number as customer_number,
> im.customer_name as customer_name FROM invoice_master as im LEFT JOIN
> ( SELECT coalesce(sum(payment_applied), 0.00) as total_paid,invoice_id
> as theiid FROM payments WHERE void='f' AND
> created <= 1204243199) the_payments on im.invoice_id =
> the_payments.theiid WHERE im.invoice_date between 1201478400 And
> 1204243199 AND im.status != 'Forwarded'
> GROUP BY im.invoice_id ORDER BY im.balance_due
> DESC,im.invoice_date,im.total DESC,im.customer_name
> 
> With or without the join I get the exact same result set. I don't even
> see null results for the columns that are supposed to be pulled in
> from the join. I have a habit of mixing SQLite and PostgreSQL syntax,
> have I done it again?
> 

The above statement is basically unreadable.

After it is formatted for human consumption it becomes:

 SELECT
 im.invoice_date as invoice_date,
 im.pay_by as due_date,
 im.invoice_id as invoice_id,
 im.invoice_number as invoice_number,
 im.invoice_date as created,
 im.status as status,
 im.next_invoice as next_invoice,
 im.tax as tax,
 im.tax2 as tax2,
 im.subtotal as subtotal,
 im.total as total,
 im.balance_due as balance_due,
 im.customer_number as customer_number,
 im.customer_name as customer_name
 FROM invoice_master as im
 LEFT JOIN ( SELECT
 coalesce(sum(payment_applied), 0.00) as total_paid,
 invoice_id as theiid
 FROM payments
 WHERE void='f'
 AND created <= 1204243199) the_payments
 on im.invoice_id = the_payments.theiid
 WHERE im.invoice_date between 1201478400 And 1204243199
 AND im.status != 'Forwarded'
 GROUP BY im.invoice_id
 ORDER BY
 im.balance_due DESC,
 im.invoice_date,
 im.total DESC,
 im.customer_name

Where you can easily see that the only columns you are selecting are the 
from the im table (invoice_master). The columns from the the_payments 
sub-query are not selected, so they are not displayed.

HTH
Dennis Cote
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


  1   2   >