Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-11 Thread J. King
Doh. I meant the linked document on the error log. Silly me. 



On September 11, 2017 9:41:39 PM EDT, "J. King"  wrote:
>There's an extra word in the first paragraph of Section 4 of that
>document, by the way:
>
>" The error logger callback has also proven useful in catching errors
>occasional errors that the application misses..."
>
>On September 11, 2017 11:22:50 AM EDT, Dan Kennedy
> wrote:
>>On 09/10/2017 08:30 PM, R Smith wrote:
>>> Well yes, but the documentation suggests that one could expect a slight
>>> degradation. The words "works best with" do not seem to imbue an
>>> idea of "give WITHOUT ROWID tables a wide berth when your tables are
>>> more than a few columns wide", and I don't think the Devs intended that
>>> either.
>>>
>>> I can still roll with the idea that the WITHOUT ROWID tables with
>151
>>
>>> columns and lots of data actually performed FASTER than the same
>>ROWID 
>>> tables for the other queries but slower with the JOIN... but that
>>much 
>>> slower?
>>
>>
>>
>>The difference seems to be that, currently, SQLite never creates 
>>automatic indexes on WITHOUT ROWID tables.
>>
>>   http://sqlite.org/optoverview.html#autoindex
>>
>>I don't think there is a good reason that it cannot do so. It just 
>>doesn't yet.
>>
>>Dan.
>>
>>
>>
>>
>>
>>>
>>> I mean we are not talking a "little" slower, we are talking 50ms vs.
>
>>> ~70,000ms, that's a difference of a whopping 3 orders of magnitude
>>and 
>>> change. And we are not talking huge tables, another query that
>simply
>>
>>> walks the tables can do so several million times in the same time
>the
>>
>>> JOIN query does.
>>>
>>> Put another way, I can create 100 new tables and populate them each 
>>> with the rows from one test table, then delete the rows from each
>new
>>
>>> table that doesn't satisfy an EXISTS() check in the other test table
>
>>> (essentially mimicking the JOIN query) and then output each full new
>
>>> table, 100 of them in turn, and then DROP them all. SQLite can do
>ALL
>>
>>> of that in a fraction of the time that the normal JOIN query
>(between
>>
>>> those same two test tables) takes to complete.
>>>
>>> We are talking a formula 1 car suddenly going at max speed of 1 
>>> mile-per-weekend, barely keeping up with a semi-athletic snail, and 
>>> people suggest checking the fuel octane rating. I'm saying there is 
>>> something wrong under the hood.
>>>
>>> Cheers,
>>> Ryan
>>>
>>>
>>> On 2017/09/10 11:28 AM, Clemens Ladisch wrote:
 R Smith wrote:
> I am using 151 columns for both tests. The only thing that changes
> between the two scripts are the words "WITHOUT ROWID" being added
  says:
 | WITHOUT ROWID tables will work correctly ... for tables with a
>>single
 | INTEGER PRIMARY KEY. However, ordinary rowid tables will run
>>faster in
 | that case.
 and:
 | WITHOUT ROWID tables work best when individual rows are not too
>>large.

 So this is pretty much the documented worst case for WITHOUT ROWID 
 tables.

 If the query is executed by making a copy of all the table data
>into
 a temporary B-tree (index), which is what INTERSECT and probably
>>IN(...)
 are doing, then the WITHOUT-ROWID-ness of the table does not
>matter.


 Regards,
 Clemens
>>>
>>
>>
>
>-- 
>Sent from my Android device with K-9 Mail. Please excuse my brevity.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-11 Thread J. King
There's an extra word in the first paragraph of Section 4 of that document, by 
the way:

" The error logger callback has also proven useful in catching errors 
occasional errors that the application misses..."

On September 11, 2017 11:22:50 AM EDT, Dan Kennedy  
wrote:
>On 09/10/2017 08:30 PM, R Smith wrote:
>> Well yes, but the documentation suggests that one could expect a slight
>> degradation. The words "works best with" do not seem to imbue an
>> idea of "give WITHOUT ROWID tables a wide berth when your tables are
>> more than a few columns wide", and I don't think the Devs intended that
>> either.
>>
>> I can still roll with the idea that the WITHOUT ROWID tables with 151
>
>> columns and lots of data actually performed FASTER than the same
>ROWID 
>> tables for the other queries but slower with the JOIN... but that
>much 
>> slower?
>
>
>
>The difference seems to be that, currently, SQLite never creates 
>automatic indexes on WITHOUT ROWID tables.
>
>   http://sqlite.org/optoverview.html#autoindex
>
>I don't think there is a good reason that it cannot do so. It just 
>doesn't yet.
>
>Dan.
>
>
>
>
>
>>
>> I mean we are not talking a "little" slower, we are talking 50ms vs. 
>> ~70,000ms, that's a difference of a whopping 3 orders of magnitude
>and 
>> change. And we are not talking huge tables, another query that simply
>
>> walks the tables can do so several million times in the same time the
>
>> JOIN query does.
>>
>> Put another way, I can create 100 new tables and populate them each 
>> with the rows from one test table, then delete the rows from each new
>
>> table that doesn't satisfy an EXISTS() check in the other test table 
>> (essentially mimicking the JOIN query) and then output each full new 
>> table, 100 of them in turn, and then DROP them all. SQLite can do ALL
>
>> of that in a fraction of the time that the normal JOIN query (between
>
>> those same two test tables) takes to complete.
>>
>> We are talking a formula 1 car suddenly going at max speed of 1 
>> mile-per-weekend, barely keeping up with a semi-athletic snail, and 
>> people suggest checking the fuel octane rating. I'm saying there is 
>> something wrong under the hood.
>>
>> Cheers,
>> Ryan
>>
>>
>> On 2017/09/10 11:28 AM, Clemens Ladisch wrote:
>>> R Smith wrote:
 I am using 151 columns for both tests. The only thing that changes
 between the two scripts are the words "WITHOUT ROWID" being added
>>>  says:
>>> | WITHOUT ROWID tables will work correctly ... for tables with a
>single
>>> | INTEGER PRIMARY KEY. However, ordinary rowid tables will run
>faster in
>>> | that case.
>>> and:
>>> | WITHOUT ROWID tables work best when individual rows are not too
>large.
>>>
>>> So this is pretty much the documented worst case for WITHOUT ROWID 
>>> tables.
>>>
>>> If the query is executed by making a copy of all the table data into
>>> a temporary B-tree (index), which is what INTERSECT and probably
>IN(...)
>>> are doing, then the WITHOUT-ROWID-ness of the table does not matter.
>>>
>>>
>>> Regards,
>>> Clemens
>>
>
>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table-valued function crashing in SQLite 3.20

2017-09-11 Thread Richard Hipp
On 9/11/17, Jens Alfke  wrote:
> We have a virtual table serving as a table-valued function, which has worked
> fine up through SQLite 3.19, but now in 3.20.1 is crashing whenever it’s
> used.

Does it work on trunk?


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table-valued function crashing in SQLite 3.20

2017-09-11 Thread Jens Alfke
For what it’s worth, the instructions generated in 3.19.2 (which don't crash) 
are slightly different: the OpenRead and Rewind instructions are at the top, 
not down at location 15 —

0   Init            0   24  0                   00  Start at 24
1   OpenRead        0   4   0   5               00  root=4 iDb=0; kv_default
2   Rewind          0   23  0                   00
3   Integer         0   2   0                   00  r[2]=0; Init EXISTS result
4   Integer         1   3   0                   00  r[3]=1; LIMIT counter
5   VOpen           1   0   0   vtab:10128F730  00
6   Column          0   4   6                   00  r[6]=kv_default.body
7   String8         0   7   0   likes           00  r[7]='likes'
8   Integer         2   4   0                   00  r[4]=2
9   Integer         2   5   0                   00  r[5]=2
10  VFilter         1   16  4                   00  iplan=r[4] zplan=''
11  VColumn         1   1   8                   00  r[8]=vcolumn(1); fl_each.value
12  Ne              9   15  8   (BINARY)        51  if r[8]!=r[9] goto 15
13  Integer         1   2   0                   00  r[2]=1
14  DecrJumpZero    3   16  0                   00  if (--r[3])==0 goto 16
15  VNext           1   11  0                   00
16  IfNot           2   22  1                   00
17  Column          0   1   11                  00  r[11]=kv_default.sequence
18  Column          0   0   12                  00  r[12]=kv_default.key
19  Column          0   3   13                  00  r[13]=kv_default.version
20  Column          0   2   14  0               00  r[14]=kv_default.flags
21  ResultRow       11  4   0                   00  output=r[11..14]
22  Next            0   3   0                   01
23  Halt            0   0   0                   00
24  Transaction     0   0   4   0               01  usesStmtJournal=0
25  String8         0   9   0   climbing        00  r[9]='climbing'
26  Goto            0   1   0                   00
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Table-valued function crashing in SQLite 3.20

2017-09-11 Thread Jens Alfke
We have a virtual table serving as a table-valued function, which has worked 
fine up through SQLite 3.19, but now in 3.20.1 is crashing whenever it’s used. 
I know there were changes in 3.20 relating to native functions taking pointers, 
but this doesn’t seem to be related to that. Were there any other 
possibly-breaking changes to the way virtual tables work? I didn’t see any in 
the release notes.

As virtual tables go it’s pretty simple. It was shamelessly copied from the 
json_each implementation in json1.c, then adapted from there, and it’s still 
quite similar (it’s just that the data collection it’s iterating over is not in 
JSON format.)
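For readers without the custom fl_each code, the same query shape can be reproduced against the built-in json_each table-valued function (assuming the JSON1 functions are available in your build; the kv table and 'likes' data below are invented to mirror the real query):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE kv (key TEXT, body TEXT)")
con.execute("""INSERT INTO kv VALUES
    ('u1', '{"likes": ["climbing", "cooking"]}'),
    ('u2', '{"likes": ["chess"]}')""")

# Same shape as the crashing query, with json_each standing in for the
# custom fl_each virtual table:
rows = con.execute("""
    SELECT key FROM kv
    WHERE EXISTS (SELECT 1 FROM json_each(kv.body, '$.likes') AS _like
                  WHERE _like.value = 'climbing')
""").fetchall()
print(rows)  # [('u1',)]
```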

What I’m seeing is that our fl_each's  xConnect and xBestIndex are called at 
compile/prepare time, then xOpen is called during the first call to 
sqlite3_step, and immediately after that comes a crash in 
sqlite3VdbeCursorMoveto() because the passed-in VdbeCursor is NULL. (Actually 
pp is a valid pointer, but *pp is NULL.) This is right at the start of handling 
the next opcode, Column.

Top of the backtrace is:
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: sqlite3VdbeCursorMoveto(pp=0x7ffeefbfb028, piCol=0x7ffeefbfb034) at sqlite3.c:75132
  * frame #1: sqlite3VdbeExec(p=0x000101171680) at sqlite3.c:81369
    frame #2: sqlite3Step(p=0x000101171680) at sqlite3.c:77352
    frame #3: sqlite3_step(pStmt=0x000101171680) at sqlite3.c:77416

The SQL query looks like:
SELECT sequence, key, version, flags
FROM kv_default
WHERE (EXISTS (SELECT 1 FROM fl_each(body, 'likes') AS _like
               WHERE _like.value = 'climbing'))

EXPLAIN returns:
0   Init            0   24  0                   00  Start at 24
1   Integer         0   2   0                   00  r[2]=0; Init EXISTS result
2   Integer         1   3   0                   00  r[3]=1; LIMIT counter
3   VOpen           1   0   0   vtab:10116FC00  00
4   Column          0   4   6                   00  r[6]=kv_default.body  <== CRASH AT THIS OPCODE
5   String8         0   7   0   likes           00  r[7]='likes'
6   Integer         2   4   0                   00  r[4]=2
7   Integer         2   5   0                   00  r[5]=2
8   VFilter         1   14  4                   00  iplan=r[4] zplan=''
9   VColumn         1   1   8                   00  r[8]=vcolumn(1); fl_each.value
10  Ne              9   13  8   (BINARY)        51  if r[8]!=r[9] goto 13
11  Integer         1   2   0                   00  r[2]=1
12  DecrJumpZero    3   14  0                   00  if (--r[3])==0 goto 14
13  VNext           1   9   0                   00
14  IfNot           2   23  1                   00
15  OpenRead        0   4   0   5               00  root=4 iDb=0; kv_default
16  Rewind          0   23  0                   00
17  Column          0   1   11                  00  r[11]=kv_default.sequence
18  Column          0   0   12                  00  r[12]=kv_default.key
19  Column          0   3   13                  00  r[13]=kv_default.version
20  Column          0   2   14  0               00  r[14]=kv_default.flags
21  ResultRow       11  4   0                   00  output=r[11..14]
22  Next            0   17  0                   01
23  Halt            0   0   0                   00
24  Transaction     0   0   4   0               01  usesStmtJournal=0
25  String8         0   9   0   climbing        00  r[9]='climbing'
26  Goto            0   1   0                   00

(This is on Mac OS; I’m on the current beta of 10.13, though that shouldn’t 
make a difference.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] PRAGMA journal_size_limit prevents -wal growth or only size outside a transaction?

2017-09-11 Thread Howard Kapustein
Looking through the source a bit I see sqlite3_busy_timeout is a no-fail
operation (aside from misuse), so sqlite3_busy_timeout(0);
SQLITE_CHECKPOINT_TRUNCATE; sqlite3_busy_timeout(n); does what I need.
Thanks all.

- Howard

P.S. sqlite3_close intentionally doesn't respect journal_size_limit. From 
sqlite3WalClose

}else if( pWal->mxWalSize>=0 ){
  /* Try to truncate the WAL file to zero bytes if the checkpoint
  ** completed and fsynced (rc==SQLITE_OK) and we are in persistent
  ** WAL mode (bPersist) and if the PRAGMA journal_size_limit is a
  ** non-negative value (pWal->mxWalSize>=0).  Note that we truncate
  ** to zero bytes as truncating to the journal_size_limit might
  ** leave a corrupt WAL file on disk. */
  walLimitSize(pWal, 0);
} 

That last comment "Note that we truncate..." is interesting. When would this 
corrupt WAL rear its head?
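As a side note, the TRUNCATE checkpoint behaviour is easy to observe from a script. A minimal sketch: PRAGMA wal_checkpoint(TRUNCATE) is the SQL-level counterpart of sqlite3_wal_checkpoint_v2() with SQLITE_CHECKPOINT_TRUNCATE (file names here are arbitrary):

```python
import os
import sqlite3
import tempfile

dbpath = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(dbpath)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE t (x)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
con.commit()

wal = dbpath + "-wal"
before = os.path.getsize(wal)
# Checkpoint the whole WAL, then truncate the -wal file to zero bytes.
# The pragma returns (busy, wal_frames, checkpointed_frames).
busy, wal_frames, ckpt_frames = con.execute(
    "PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
after = os.path.getsize(wal)
print(before > 0, after == 0, busy == 0)  # True True True
con.close()
```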
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Proof that a line has been modified

2017-09-11 Thread Paxdo


I just got all the answers! I must have an email problem.
I'm going to look at all of this, thank you very much everyone!


Dominique Devienne 
8 septembre 2017 à 11:41
On Fri, Sep 8, 2017 at 12:29 AM, Nico Williams 

See also https://sqlite.org/dbhash.html#overview --DD
Nico Williams 
8 septembre 2017 à 00:29

You have these choices:

- hash the whole SQLite3 file and record or sign such hash values for
approved DB files (this will let you detect all changes)

- something like what you described (hash every row of every table and
bind them all somehow, then save or sign this; this too will let you
detect all changes for all tables that you apply this to)

(this is DB-agnostic)

- switch to a DB that uses a Merkle hash tree (see below)

Generally speaking, the best approach for this sort of thing is to use
something called a Merkle Hash Tree, such that for each database/
filesystem/datastore you always have available a single, small (e.g.,
256 bits) cryptographic hash value for the entire thing.

In order to make such hash values usable for this purpose you'll need
the system to be "content-addressed" storage (CAS) if at all possible
(more on that below).

A good example of a system that comes close to this is ZFS. ZFS is a
filesystem that actually is a Merkle Hash Tree on-disk, but it's not
content-addressed, which means that if the locations of data on-disk
changes, then the root hash also changes, even if none of the actual
data changed.

What this actually means in practice is that any time you have a
"pointer" from a database/filesystem/datastore page to another, what
must actually be stored is not just the address of the target page, but
the hash of its contents. If you apply this rule rigorously, and if you
have a root page (filesystems generally do, and databases can too, at
least per-table, and often for the entire DB), you necessarily end up
with a root page whose cryptographic hash *is* the cryptographic hash of
the entire DB/FS.

To get CAS you also need to not include block/page addresses in the
cryptographic hash computations (but still must include the hashes of
pointed-to pages/blocks). Then you can say that the hash of a page's
content *is* its address (it's not really).

One reason that Merkle hash trees are best is that you don't have to
read every page of a DB/FS to verify the root hash. You need only hash
the root page and you're done -- if, anyways, any errors verifying other
page hashes can be handled at run-time. Another is that they let you
compute hashes for sub-trees. Another is that they're easy to build.

SQLite3 is NOT a Merkle hash tree, however. You *can* build a Merkle
hash tree with SQLite3 though. Fossil is a version control system that
does exactly that, but that's not a technique you're likely to apply to
your use case (I'm guessing).

Given a Merkle hash tree, you can digitally sign (or save in a remote,
secure system) root hash values of approved DB/FS states. This is
interesting, for example, for secure-boot/TPM applications.

Given that you can't easily use a Merkle hash tree with SQLite3
without building a DB on top of a DB (like Fossil basically does) or
switching to one that uses a Merkle hash tree (and exposes the root hash
value to you), you could hash every row, XOR the hash values (since
there's no defined order for the rows, or else you can hash the
concatenation of the hashes in some order you define), and sign that.
You could apply this for every table and XOR all the table hashes, or
just those tables that are of interest to you. You'll want to do this
for all interesting rows in sqlite_master as well.

Lastly, as others have pointed out, the best you can do with a DB hash
is cryptographically prove that the FS/DB has approved content, for some
value of "approved content".

You cannot prove that the DB/FS hasn't been reset to an earlier approved
state without adding a revocation system.

Nor can you prove that the DB/FS has no malicious content in it -- only
that an approved entity signed it as "approved".

Nico
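[The hash-every-row-and-XOR idea above can be sketched in a few lines. The serialization below (repr of the row tuple) is a deliberate simplification; a real implementation would need an unambiguous row encoding to avoid collisions between distinct rows:]

```python
import hashlib
import sqlite3

def table_digest(con, table):
    # XOR of per-row SHA-256 hashes: order-independent, as described above.
    acc = bytes(32)
    for row in con.execute(f"SELECT * FROM {table}"):
        h = hashlib.sha256(repr(row).encode()).digest()
        acc = bytes(a ^ b for a, b in zip(acc, h))
    return acc.hex()

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT);
    INSERT INTO t VALUES (1, 'a'), (2, 'b');
""")
d1 = table_digest(con, "t")
con.execute("UPDATE t SET v = 'c' WHERE id = 2")
d2 = table_digest(con, "t")
print(d1 != d2)  # True: modifying any row changes the table digest
```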
Paxdo 
7 septembre 2017 à 10:16

Hi all!

For security reasons, a customer wants to be sure that a database line 
cannot be modified after its initial insertion (or unmodified without 
being visible, with proof that the line has been modified). Including 
by technicians who can open the database (SQLITE of course).


Is there a solution to that?

I thought of a hash calculated and recorded only when the line was 
inserted (calculated on all columns of the line, and stored in a 
column of that line).


Or a trigger that would prevent any modification.

But these solutions can be easily bypassed, right?
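[For illustration, the trigger approach looks like the sketch below. As the question anticipates, it only guards against accidental or casual modification: anyone who can open the file can simply DROP the triggers. Table and message names are invented:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ledger (id INTEGER PRIMARY KEY, entry TEXT);
    CREATE TRIGGER ledger_no_update BEFORE UPDATE ON ledger BEGIN
        SELECT RAISE(ABORT, 'ledger rows are immutable');
    END;
    CREATE TRIGGER ledger_no_delete BEFORE DELETE ON ledger BEGIN
        SELECT RAISE(ABORT, 'ledger rows are immutable');
    END;
    INSERT INTO ledger (entry) VALUES ('initial');
""")
try:
    con.execute("UPDATE ledger SET entry = 'tampered' WHERE id = 1")
except sqlite3.IntegrityError as e:
    print("blocked:", e)  # blocked: ledger rows are immutable
```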

Do you have any advice?

Thank you!

Tom
(Sorry for my 

Re: [sqlite] compiling for android

2017-09-11 Thread Christian Czech

Hi,

I guess it must be some special settings. We have used SQLite for quite some
time now, and the last build problem we got with the NDK was quite a long time ago.


Regards,
Christian

---
derago e. K., Gartenstr. 4, D-79807 Lottstetten, Amtsgericht Freiburg/Br.
HRA621012; further information at http://www.derago.com/Impressum. This
e-mail contains confidential and/or legally protected information. If you
are not the intended recipient, or have received this e-mail in error,
please notify the sender immediately and destroy this e-mail. Unauthorized
copying and unauthorized forwarding of this e-mail are not permitted.

On 11.09.2017 at 17:28, Dan Kennedy wrote:

On 09/11/2017 05:20 AM, Mark Sibly wrote:

Hi,

I just encountered a problem trying to build the sqlite amalgamation
package for android with the latest NDK (v15.2.4203891) on windows 10.

The compiler couldn't find mmap, which was causing a ton of errors.

I solved it by sticking this at the top-ish of sqlite3.c

#if __ANDROID__
#include <sys/mman.h>
#endif


Thanks for reporting this. Which version are you trying to build?

Are you building with any special options - SQLITE_OMIT_WAL or similar?

When things are working right, sys/mman.h should be included if 
required by the amalgamation equivalent of this line:


  http://www.sqlite.org/src/artifact/489aa972ccc34?ln=99

Can you post the first couple of compiler errors?

Dan.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] compiling for android

2017-09-11 Thread Dan Kennedy

On 09/11/2017 05:20 AM, Mark Sibly wrote:

Hi,

I just encountered a problem trying to build the sqlite amalgamation
package for android with the latest NDK (v15.2.4203891) on windows 10.

The compiler couldn't find mmap, which was causing a ton of errors.

I solved it by sticking this at the top-ish of sqlite3.c

#if __ANDROID__
#include <sys/mman.h>
#endif


Thanks for reporting this. Which version are you trying to build?

Are you building with any special options - SQLITE_OMIT_WAL or similar?

When things are working right, sys/mman.h should be included if required 
by the amalgamation equivalent of this line:


  http://www.sqlite.org/src/artifact/489aa972ccc34?ln=99

Can you post the first couple of compiler errors?

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JOIN vs. INTERSECT vs. WHERE IN (...) - speed and efficiency differences

2017-09-11 Thread Dan Kennedy

On 09/10/2017 08:30 PM, R Smith wrote:
Well yes, but the documentation suggests that one could expect a slight
degradation. The words "works best with" do not seem to imbue an
idea of "give WITHOUT ROWID tables a wide berth when your tables are
more than a few columns wide", and I don't think the Devs intended that
either.


I can still roll with the idea that the WITHOUT ROWID tables with 151 
columns and lots of data actually performed FASTER than the same ROWID 
tables for the other queries but slower with the JOIN... but that much 
slower?




The difference seems to be that, currently, SQLite never creates 
automatic indexes on WITHOUT ROWID tables.


  http://sqlite.org/optoverview.html#autoindex

I don't think there is a good reason that it cannot do so. It just 
doesn't yet.


Dan.
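[Whether an automatic index gets built is visible in EXPLAIN QUERY PLAN; a quick probe (table and column names invented, and the exact plan wording varies between SQLite versions — when the planner builds a transient lookup index, the plan line mentions an AUTOMATIC index):]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE rowid_tab   (pk INTEGER PRIMARY KEY, b TEXT);
    CREATE TABLE norowid_tab (pk INTEGER PRIMARY KEY, b TEXT) WITHOUT ROWID;
    CREATE TABLE probe (b TEXT);
""")
# Join on the unindexed column b and compare the two plans:
for tbl in ("rowid_tab", "norowid_tab"):
    plan = con.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM probe JOIN {tbl} USING (b)"
    ).fetchall()
    print(tbl, "->", "; ".join(row[-1] for row in plan))
```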







I mean we are not talking a "little" slower, we are talking 50ms vs. 
~70,000ms, that's a difference of a whopping 3 orders of magnitude and 
change. And we are not talking huge tables, another query that simply 
walks the tables can do so several million times in the same time the 
JOIN query does.


Put another way, I can create 100 new tables and populate them each 
with the rows from one test table, then delete the rows from each new 
table that doesn't satisfy an EXISTS() check in the other test table 
(essentially mimicking the JOIN query) and then output each full new 
table, 100 of them in turn, and then DROP them all. SQLite can do ALL 
of that in a fraction of the time that the normal JOIN query (between 
those same two test tables) takes to complete.


We are talking a formula 1 car suddenly going at max speed of 1 
mile-per-weekend, barely keeping up with a semi-athletic snail, and 
people suggest checking the fuel octane rating. I'm saying there is 
something wrong under the hood.


Cheers,
Ryan


On 2017/09/10 11:28 AM, Clemens Ladisch wrote:

R Smith wrote:

I am using 151 columns for both tests. The only thing that changes
between the two scripts are the words "WITHOUT ROWID" being added

 says:
| WITHOUT ROWID tables will work correctly ... for tables with a single
| INTEGER PRIMARY KEY. However, ordinary rowid tables will run faster in
| that case.
and:
| WITHOUT ROWID tables work best when individual rows are not too large.

So this is pretty much the documented worst case for WITHOUT ROWID 
tables.


If the query is executed by making a copy of all the table data into
a temporary B-tree (index), which is what INTERSECT and probably IN(...)
are doing, then the WITHOUT-ROWID-ness of the table does not matter.
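[This can be checked directly with EXPLAIN QUERY PLAN; a rough probe (schema invented, and the plan wording varies by SQLite version — the compound query typically reports a temp B-tree for INTERSECT, while the JOIN plan shows how the inner loop is searched):]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, v TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, v TEXT) WITHOUT ROWID;
""")
queries = {
    "INTERSECT": "SELECT v FROM a INTERSECT SELECT v FROM b",
    "JOIN":      "SELECT a.v FROM a JOIN b ON a.v = b.v",
}
for label, sql in queries.items():
    plan = con.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    print(label)
    for row in plan:
        print("   ", row[-1])  # last column is the human-readable detail
```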


Regards,
Clemens





___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Intel 17

2017-09-11 Thread Dominique Devienne
On Mon, Sep 11, 2017 at 4:49 PM, Richard Hipp  wrote:

> On 9/11/17, Dominique Devienne  wrote:
> > FYI, here are the modifications we did to a 3.19.3 amalgamation to build
> > with the Intel 17 compiler. Note that we have a few modifs of our own, so
> > line numbers might be off. --DD
> >
> > PS: I hope the formatting goes through. Was garbled when I converted to
> > plain text, so trying in rich-text, sorry.
>
> The "rich text" is garbled and unreadable.
>

OK, sorry. Here's another try "hand-edited". Lines should be easy enough to
find hopefully.

= 1st error =
sqlite3.c(17654): error #265: floating-point operation result is out of
range

// around line 17644
- result = 1e308*1e308*s; /* Infinity */
+ result = 1e308*(1e308*s); /* Infinity */

= 2nd error =

sqlite3.c.o:sqlite3.c:function sqlite3VdbeExec.h: error: undefined
reference to '__builtin_add_overflow', etc...

  SQLITE_PRIVATE int sqlite3AddInt64(i64 *pA, i64 iB){
- #if GCC_VERSION>=5004000
+#if GCC_VERSION>=5004000 && !defined(__INTEL_COMPILER)
  return __builtin_add_overflow(*pA, iB, pA);
  #else

  SQLITE_PRIVATE int sqlite3SubInt64(i64 *pA, i64 iB){
- #if GCC_VERSION>=5004000
+#if GCC_VERSION>=5004000 && !defined(__INTEL_COMPILER)
  return __builtin_sub_overflow(*pA, iB, pA);
  #else

  SQLITE_PRIVATE int sqlite3MulInt64(i64 *pA, i64 iB){
- #if GCC_VERSION>=5004000
+ #if GCC_VERSION>=5004000 && !defined(__INTEL_COMPILER)
  return __builtin_mul_overflow(*pA, iB, pA);
  #else

Please note that I'm not saying this is the best fix.
A colleague made those changes to make it build.
It wasn't necessary with Intel 15. --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Intel 17

2017-09-11 Thread Richard Hipp
On 9/11/17, Dominique Devienne  wrote:
> FYI, here are the modifications we did to a 3.19.3 amalgamation to build
> with the Intel 17 compiler. Note that we have a few modifs of our own, so
> line numbers might be off. --DD
>
> PS: I hope the formatting goes through. Was garbled when I converted to
> plain text, so trying in rich-text, sorry.

The "rich text" is garbled and unreadable.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Performance impact of UPDATEing multiple columns vs few columns

2017-09-11 Thread Hick Gunter
If you have 4 set clauses, the those 4 fields will be updated with the same 
values for all the rows matching the where clause with just a single call to 
sqlite3_step().

If you actually want to update only 1 of the fields in 1 record, then you must 
bind the current values (which are unknown and need to be determined first) of 
the other fields and make sure that your where clause matches only the 1 
intended record.

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
behalf of ghalwasi
Sent: Monday, 11 September 2017 15:02
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] [EXTERNAL] Performance impact of UPDATEing multiple
columns vs few columns

hi Darko,


>If you're preparing that statement more than once then you are wasting
time, there's no reason whatsoever to do it. You're also wasting time if you 
make a bind call to set any column that hasn't changed since the last time you 
executed the statement.

>The entire row is rewritten when updating so the most efficient way to
>do
it is to call sqlite3_prepare_v2 once, then bind any columns that have changed, 
call sqlite3_step to execute the statement, then call sqlite3_reset to reuse 
the statement, then go back to binding any columns that have changed and repeat 
the other steps. sqlite3_reset does not clear any column bindings.

Thanks for the explanation. It makes sense.

Thanks
Gaurav



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Performance impact of UPDATEing multiple columns vs few columns

2017-09-11 Thread ghalwasi
hi Darko,


>If you're preparing that statement more than once then you are wasting
time, there's no reason whatsoever to do it. You're also wasting time if you
make a bind call to set any column that hasn't changed since the last time
you executed the statement. 

>The entire row is rewritten when updating so the most efficient way to do
it is to call sqlite3_prepare_v2 once, then bind any columns that have
changed, call sqlite3_step to execute the statement, then call sqlite3_reset
to reuse the statement, then go back to binding any columns that have
changed and repeat the other steps. sqlite3_reset does not clear any column
bindings.
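[The prepare-once/bind/step/reset pattern described above looks roughly like this through Python's sqlite3 wrapper, which caches prepared statements, so repeated execute() calls with identical SQL only rebind, step, and reset; table and column names are invented:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, f1 TEXT, f2 TEXT)")
con.execute("INSERT INTO item VALUES (1, 'a', 'b'), (2, 'c', 'd')")

# One parameterized UPDATE, reused for every row that needs changing:
sql = "UPDATE item SET f1 = ?, f2 = ? WHERE id = ?"
for params in [("x", "y", 1), ("p", "q", 2)]:
    con.execute(sql, params)  # rebind + step + reset under the hood

print(con.execute("SELECT f1, f2 FROM item ORDER BY id").fetchall())
# [('x', 'y'), ('p', 'q')]
```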

Thanks for the explanation. It makes sense.

Thanks
Gaurav



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] compiling for android

2017-09-11 Thread Mark Sibly
Hi,

I just encountered a problem trying to build the sqlite amalgamation
package for android with the latest NDK (v15.2.4203891) on windows 10.

The compiler couldn't find mmap, which was causing a ton of errors.

I solved it by sticking this at the top-ish of sqlite3.c

#if __ANDROID__
#include <sys/mman.h>
#endif

Bye!
Mark
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Need help with SQL query

2017-09-11 Thread Kees Nuyt
On Mon, 11 Sep 2017 14:05:25 +0200, Kees Nuyt 
wrote:

> CREATE TABLE ATTRIBUTES (
>  ITEM_ID INTEGER REFERENCES ATTRIBUTES(ID) ON DELETE CASCADE,
>   KEY   TEXT,
>   VALUE TEXT,
>   PRIMARY KEY (ITEM_ID,KEY)
> ) WITHOUT ROWID;
> CREATE INDEX attr_item_id ON ATTRIBUTES(ITEM_ID);

Correction:
In this construct, it makes no sense to create the index
attr_item_id, because ITEM_ID is the first column
of the primary key (which is indexed implicitly).

So, you can leave out the CREATE INDEX attr_item_id 
statement in this case.
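To see that the extra index really is redundant, one can inspect the query plan. A small sketch with invented data (the exact EXPLAIN QUERY PLAN wording varies between SQLite versions, but a lookup on ITEM_ID alone uses the composite primary key):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ATTRIBUTES (
  ITEM_ID INTEGER,
  KEY     TEXT,
  VALUE   TEXT,
  PRIMARY KEY (ITEM_ID, KEY)
) WITHOUT ROWID;
INSERT INTO ATTRIBUTES VALUES (1, 'colour', 'red'), (2, 'colour', 'blue');
""")

# No separate index needed: ITEM_ID is the leftmost PK column,
# so this lookup is an index search, not a table scan.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM ATTRIBUTES WHERE ITEM_ID = 1"
).fetchall()
print(plan)

rows = con.execute(
    "SELECT KEY, VALUE FROM ATTRIBUTES WHERE ITEM_ID = 1").fetchall()
print(rows)
```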

-- 
Regards,

Kees Nuyt


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-11 Thread Eric Grange
> I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE
DS = 15;".

Not zipvfs specific, but that kind of update can be quite inefficient if
the record is large (in terms of bytes, or records per database page) as
your table declaration hints to.

This will be especially prominent with compression, as a single byte
changed early in the uncompressed stream can lead to changing pretty much
every byte of the compressed data, and so is bound to defeat any
optimizations the storage could have to handle unmodified pages and data.

Depending on your update patterns and frequencies, it could thus be
beneficial to split your table into two tables (or more) with a common
primary key:
- one table for the rarely changed fields (even better if they are
immutable)
- one table (or more) for the fields that change often

This will increase the database size and reduce insert performance, and it
can complicate querying when you need everything (extra inner joins), but
it can help reduce the amount of raw data that is affected by updates quite
drastically.
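As a rough illustration of that split (table and column names invented, loosely following the schema later in this thread), the frequently-updated field moves into its own narrow table sharing the primary key, so each update rewrites only a small row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- hypothetical split of one wide table into static + hot parts:
CREATE TABLE item_static (              -- rarely-changing fields
  id INTEGER PRIMARY KEY,
  ds TEXT,
  field0 NUMERIC,
  field1 NUMERIC
);
CREATE TABLE item_hot (                 -- frequently-updated fields
  id INTEGER PRIMARY KEY REFERENCES item_static(id),
  field4 NUMERIC
);
INSERT INTO item_static VALUES (1, 'x', 0, 0);
INSERT INTO item_hot VALUES (1, 0);
""")

# The update now touches only the narrow "hot" row, so far fewer
# (compressed) pages change per update.
con.execute("UPDATE item_hot SET field4 = 42 WHERE id = 1")

# The cost: reading everything back needs an extra join.
row = con.execute("""
    SELECT s.ds, h.field4
    FROM item_static s JOIN item_hot h USING (id)
""").fetchone()
print(row)
```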

I have been able to use the above strategy successfully, though that was
under Windows compressed folders, so YMMV.


On Thu, Sep 7, 2017 at 9:13 PM, Yue Wu  wrote:

> Hello,
> As mentioned in the subject, our goal is to improve performance regarding
> to batch sql updates.
> The update sql as follow,
>
> > UPDATE ITEM SET FIELD4 =? WHERE DS=?
>
> We run 100,000 updates in a single transaction. The zipvfs version takes
> about 20 min while uncompressed version takes about 7 min.
> I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE DS
> = 15;".
>
> Any suggestions are very helpful.
>
> Some background:
> We have an app running on Android 4.2 using zipvfs.
> The page size is 4096
> The cache size is -2000
>
> Table schema for ITEM
>
> > CREATE TABLE ITEM (
> > FIELD0 NUMERIC,
> > FIELD1 NUMERIC,
> > DS TEXT,
> > FIELD2 TEXT,
> > FIELD3 TEXT,
> > FIELD4 NUMERIC,
> > FIELD5 NUMERIC,
> > FIELD6 NUMERIC,
> > FIELD7 NUMERIC,
> > FIELD8 NUMERIC,
> > FIELD9 NUMERIC,
> > FIELD10 NUMERIC,
> > FIELD11 TEXT);
>
>
> The third column: "DS" is what we query by almost all the time. We also
> created index:
>
> > CREATE INDEX DS_INDEX ON ITEM(DS);
>
>
> There are 30 million records. Zipvfs size of 471mb vs uncompressed db (the
> one shipped with Android 4.2) size of 1.39gb.
> Zipvfs db using zlib and aes128, which are default.
>
> Thanks
> --
>
> Yue Wu |  Android Developer
>


Re: [sqlite] Need help with SQL query

2017-09-11 Thread Kees Nuyt
On Sun, 10 Sep 2017 20:35:16 -0700, Vikas Aditya
 wrote:

> Hi All,
>
> I need some help in figuring our right query syntax for querying
> items from two tables. We have two tables. One of the table has
> list of items. And Second table has additional attributes.

Adding to the suggestions of Ryan Smith, I would suggest a few
table definition optimizations:

> CREATE TABLE ITEM (
> ID INTEGER,
> FIELD0 TEXT,
> FIELD1 TEXT,
> FIELD3 TEXT,
> FIELD4 TEXT
> );

By adding PRIMARY KEY to the ID column, it becomes an alias for
the internal ROWID column, saving space. Also, to reference a
parent table, the reference needs to point to a unique column.
A primary key fulfills that requirement.

CREATE TABLE ITEM (
ID INTEGER PRIMARY KEY,
FIELD0 TEXT,
FIELD1 TEXT,
FIELD3 TEXT,
FIELD4 TEXT
);

> CREATE TABLE ATTRIBUTES (
> ID INTEGER PRIMARY KEY,
> ITEM_ID INTEGER,
> KEY TEXT,
> VALUE TEXT
> );

Do the same for the primary key of the attributes table.
Add a foreign key constraint to formalize the relation
between the parent and child table. This provides 
"referential integrity".
The index will often speed up JOIN operations.

CREATE TABLE ATTRIBUTES (
  ID INTEGER PRIMARY KEY,
  ITEM_ID INTEGER REFERENCES ITEM ON DELETE CASCADE,
  KEY   TEXT,
  VALUE TEXT
);
CREATE INDEX attr_item_id ON ATTRIBUTES(ITEM_ID);

Even better, the ID column can be left out, and a different
primary key will enforce that the same KEY can only be used once
for any ITEM_ID:
 
CREATE TABLE ATTRIBUTES (
  ITEM_ID INTEGER REFERENCES ITEM(ID) ON DELETE CASCADE,
  KEY   TEXT,
  VALUE TEXT,
  PRIMARY KEY (ITEM_ID,KEY)
) WITHOUT ROWID;
CREATE INDEX attr_item_id ON ATTRIBUTES(ITEM_ID);

Note: 
Foreign key constraints are only enforced when 
   PRAGMA foreign_keys=on;
is executed after opening the database, before any
INSERT/UPDATE/DELETE statement is issued.
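A minimal sketch of the pragma and the cascade in action, using Python's sqlite3 module (the REFERENCES target assumes the parent table is ITEM, as in the definitions above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # must be set per connection
con.executescript("""
CREATE TABLE ITEM (ID INTEGER PRIMARY KEY, FIELD0 TEXT);
CREATE TABLE ATTRIBUTES (
  ITEM_ID INTEGER REFERENCES ITEM(ID) ON DELETE CASCADE,
  KEY   TEXT,
  VALUE TEXT,
  PRIMARY KEY (ITEM_ID, KEY)
) WITHOUT ROWID;
INSERT INTO ITEM VALUES (1, 'widget');
INSERT INTO ATTRIBUTES VALUES (1, 'colour', 'red'), (1, 'size', 'L');
""")

# Deleting the parent row cascades to its attributes.
con.execute("DELETE FROM ITEM WHERE ID = 1")
remaining = con.execute("SELECT COUNT(*) FROM ATTRIBUTES").fetchone()[0]
print(remaining)  # 0
```

Without the pragma, the same DELETE would leave the two orphaned attribute rows behind.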

HTH

-- 
Regards,

Kees Nuyt


Re: [sqlite] Selecting data from WITH clause is very slow

2017-09-11 Thread Frank Millman

On Mon Sep 11, 2017 at 11:15:22AM +0200, no...@null.net wrote:
> SELECT
> acc_no,
> acc_name,
> SUM(i_90.invoice_bal) AS 90_days,
> SUM(i_current.invoice_bal) AS current
> FROM
>debtors_table
> LEFT JOIN
> invoices i_90
> ON
> i_90.debtor_id = debtors_table.id AND
> i_90.invoice_date > date_5 AND
> i_90.invoice_date <= date_4
> LEFT JOIN
> invoices i_current
> ON
> i_current.debtor_id = debtors_table.id AND
> i_current.invoice_date > date_2 AND
> i_current.invoice_date <= date_1
> 
> I should also mention for completeness the need for:
> 
> GROUP BY
> acc_no,
> acc_name
> 
> ORDER BY
> ...


Very interesting ideas. I have a lot of experimenting to do!

Thanks very much, Mark.

Frank


Re: [sqlite] Selecting data from WITH clause is very slow

2017-09-11 Thread nomad
On Mon Sep 11, 2017 at 11:15:22AM +0200, no...@null.net wrote:
> SELECT
> acc_no,
> acc_name,
> SUM(i_90.invoice_bal) AS 90_days,
> SUM(i_current.invoice_bal) AS current
> FROM
>debtors_table
> LEFT JOIN
> invoices i_90
> ON
> i_90.debtor_id = debtors_table.id AND
> i_90.invoice_date > date_5 AND
> i_90.invoice_date <= date_4
> LEFT JOIN
> invoices i_current
> ON
> i_current.debtor_id = debtors_table.id AND
> i_current.invoice_date > date_2 AND
> i_current.invoice_date <= date_1

I should also mention for completeness the need for:

GROUP BY
acc_no,
acc_name

> ORDER BY
> ...

-- 
Mark Lawrence


Re: [sqlite] Selecting data from WITH clause is very slow

2017-09-11 Thread nomad
On Mon Sep 11, 2017 at 08:40:41AM +0200, Frank Millman wrote:
> I could structure it like this (pseudo code) -
> 
> SELECT acc_no, acc_name,
> (SELECT SUM(invoice_bal) FROM invoices
>   WHERE invoice_date <= date_5) AS 120_days, 
> (SELECT SUM(invoice_bal) FROM invoices
>   WHERE invoice_date > date_5 AND invoice_date <= date_4) AS 90_days, 
> (SELECT SUM(invoice_bal) FROM invoices
>   WHERE invoice_date > date_4 AND invoice_date <= date_3) AS 60_days, 
> (SELECT SUM(invoice_bal) FROM invoices
>   WHERE invoice_date > date_3 AND invoice_date <= date_2) AS 30_days, 
> (SELECT SUM(invoice_bal) FROM invoices
>   WHERE invoice_date > date_2 AND invoice_date <= date_1) AS current
> FROM debtors_table
> 
> This works, but it requires 5 separate scans of the invoice table,
> which is inefficient.

You can achieve the same result with a single table scan using CASE
statements:

SELECT
acc_no,
acc_name,
SUM(
CASE WHEN
   invoice_date <= date_5
THEN
invoice_bal
ELSE
0
END
) AS 120_days, 
SUM(
CASE WHEN
invoice_date > date_5 AND invoice_date <= date_4
THEN
invoice_bal
ELSE
0
END
) AS 90_days, 
 
-- ...
 
SUM(
CASE WHEN
invoice_date > date_2 AND invoice_date <= date_1
THEN
invoice_bal
ELSE
0
END
) AS current
FROM
   debtors_table
LEFT JOIN
invoices
ON
invoices.debtor_id = debtors_table.id

If your time periods are not exhaustive (covering all time) then you
can perhaps be more efficient again by moving the filtering into the
join clause:

SELECT
acc_no,
acc_name,
SUM(i_90.invoice_bal) AS 90_days,
SUM(i_current.invoice_bal) AS current
FROM
   debtors_table
LEFT JOIN
invoices i_90
ON
i_90.debtor_id = debtors_table.id AND
i_90.invoice_date > date_5 AND
i_90.invoice_date <= date_4
LEFT JOIN
invoices i_current
ON
i_current.debtor_id = debtors_table.id AND
i_current.invoice_date > date_2 AND
i_current.invoice_date <= date_1
ORDER BY
...

For the above an index on invoices(debtor_id,invoice_date) would be
appropriate.

Also in case you weren't aware, the SUM() function in my second example
could produce NULL values. You may be better served in that instance by
CAST(TOTAL(i_current.invoice_bal) AS INTEGER).
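A runnable miniature of the single-scan CASE approach (dates, amounts, and bucket boundaries invented; the aliases avoid names starting with a digit, which SQLite would otherwise require to be quoted):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE debtors  (id INTEGER PRIMARY KEY, acc_no TEXT);
CREATE TABLE invoices (debtor_id INTEGER, invoice_date TEXT,
                       invoice_bal NUMERIC);
CREATE INDEX inv_idx ON invoices(debtor_id, invoice_date);
INSERT INTO debtors VALUES (1, 'A001');
INSERT INTO invoices VALUES
  (1, '2017-06-01', 100),   -- falls in the 90-day bucket
  (1, '2017-08-20', 40),    -- current
  (1, '2017-08-25', 60);    -- current
""")

# One scan of invoices; each CASE routes the balance to its bucket.
row = con.execute("""
    SELECT acc_no,
           SUM(CASE WHEN invoice_date <= '2017-07-31'
               THEN invoice_bal ELSE 0 END) AS days_90,
           SUM(CASE WHEN invoice_date >  '2017-07-31'
               THEN invoice_bal ELSE 0 END) AS days_current
    FROM debtors
    LEFT JOIN invoices ON invoices.debtor_id = debtors.id
    GROUP BY acc_no
""").fetchone()
print(row)  # ('A001', 100, 100)
```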

-- 
Mark Lawrence


[sqlite] Intel 17

2017-09-11 Thread Dominique Devienne
FYI, here are the modifications we made to a 3.19.3 amalgamation to build
with the Intel 17 compiler. Note that we have a few modifications of our
own, so line numbers might be off. --DD

PS: I hope the formatting goes through. Was garbled when I converted to
plain text, so trying in rich-text, sorry.


sqlite3.c(17654): error #265: floating-point operation result is out of range

sqlite3.c.o: in function sqlite3VdbeExec:
error: undefined reference to '__builtin_add_overflow'

#define SQLITE_VERSION        "3.19.3"
#define SQLITE_VERSION_NUMBER 3019003
#define SQLITE_SOURCE_ID      "2017-06-08 14:26:16 0ee482a1e0eae22e08edc8978c9733a96603d4509645f348ebf55b579e89636b"

--- //epos/trunk/library/SharedComponents/src/lib/pdgm/sqlite/core/sqlite3.c
+++ //epos/trunk/library/SharedComponents/src/lib/pdgm/sqlite/core/sqlite3.c
@@ -17644,21 +17644,21 @@
       }else{
         assert( e>=342 );
         if( esign<0 ){
           result = 0.0*s;
         }else{
-          result = 1e308*1e308*s;  /* Infinity */
+          result = 1e308*(1e308*s);  /* Infinity */
         }
       }
@@ -18429,56 +18429,56 @@
 SQLITE_PRIVATE int sqlite3AddInt64(i64 *pA, i64 iB){
-#if GCC_VERSION>=5004000
+#if GCC_VERSION>=5004000 && !defined(__INTEL_COMPILER)
   return __builtin_add_overflow(*pA, iB, pA);
 ...
 SQLITE_PRIVATE int sqlite3SubInt64(i64 *pA, i64 iB){
-#if GCC_VERSION>=5004000
+#if GCC_VERSION>=5004000 && !defined(__INTEL_COMPILER)
   return __builtin_sub_overflow(*pA, iB, pA);
 ...
 SQLITE_PRIVATE int sqlite3MulInt64(i64 *pA, i64 iB){
-#if GCC_VERSION>=5004000
+#if GCC_VERSION>=5004000 && !defined(__INTEL_COMPILER)
   return __builtin_mul_overflow(*pA, iB, pA);
 ...


Re: [sqlite] Selecting data from WITH clause is very slow

2017-09-11 Thread Frank Millman
Clemens Ladisch wrote:

> Frank Millman wrote:

> > I changed it to use a WITH clause to make one scan of the invoice table and 
> > store the results.

> The WITH clause itself does not store anything; it's syntactic sugar [...] To 
> force the ageing results to be stored, you'd have to create a temporary table.

Thanks – I did not know that. It explains everything!
Frank



Re: [sqlite] Selecting data from WITH clause is very slow

2017-09-11 Thread Clemens Ladisch
Frank Millman wrote:
> SELECT acc_no, acc_name,
> (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date <= date_5) 
> AS 120_days,
> (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_5 
> AND invoice_date <= date_4) AS 90_days,
> (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_4 
> AND invoice_date <= date_3) AS 60_days,
> (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_3 
> AND invoice_date <= date_2) AS 30_days,
> (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_2 
> AND invoice_date <= date_1) AS current
> FROM debtors_table

This query does not make sense without a GROUP BY and a correlation,
which would affect how to optimize it.

> This works, but it requires 5 separate scans of the invoice table,
> which is inefficient.

If there is an index on the invoice_date, it's efficient.

Assuming that there is an account_number filter you did not show, the
index should look like this:

  CREATE INDEX invoices_accno_date_idx ON invoices(account_number, 
invoice_date);

Alternatively, to create a covering index 
(http://www.sqlite.org/queryplanner.html#covidx)
for this query, add the invoice_bal column.

> I changed it to use a WITH clause to make one scan of the invoice
> table and store the results.

The WITH clause itself does not store anything; it's syntactic
sugar.

> WITH ageing AS (...)
> SELECT acc_no, acc_name,
> (SELECT balance FROM ageing WHERE account_number = acc_no AND 
> bucket_number = 5) AS 120_days,
> (SELECT balance FROM ageing WHERE account_number = acc_no AND 
> bucket_number = 4) AS 90_days,
> (SELECT balance FROM ageing WHERE account_number = acc_no AND 
> bucket_number = 3) AS 60_days,
> (SELECT balance FROM ageing WHERE account_number = acc_no AND 
> bucket_number = 2) AS 30_days,
> (SELECT balance FROM ageing WHERE account_number = acc_no AND 
> bucket_number = 1) AS current
> FROM debtors_table
>
> It works, but it takes about 3.5 seconds to execute.

The ageing query is likely to be executed five times for each row in the
outer query.  (This might be efficient with the above the index.)

To force the ageing results to be stored, you'd have to create
a temporary table.
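One way to materialize those results, sketched with invented data: select the ageing query into a temp table and index it, so each correlated sub-select becomes a cheap indexed lookup instead of a re-run of the whole query:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE invoices (account_number TEXT, invoice_date TEXT,
                       invoice_bal NUMERIC);
INSERT INTO invoices VALUES
  ('A001', '2017-06-01', 100),
  ('A001', '2017-09-01', 50);
""")

# Materialize the ageing results once, then index the temp table.
# (A trivial two-bucket ageing rule stands in for the real one.)
con.executescript("""
CREATE TEMP TABLE ageing AS
  SELECT account_number,
         CASE WHEN invoice_date <= '2017-07-31' THEN 5 ELSE 1 END
           AS bucket_number,
         SUM(invoice_bal) AS balance
  FROM invoices
  GROUP BY account_number, bucket_number;
CREATE INDEX temp.ageing_idx ON ageing(account_number, bucket_number);
""")

bal = con.execute(
    "SELECT balance FROM ageing "
    "WHERE account_number = 'A001' AND bucket_number = 5"
).fetchone()[0]
print(bal)  # 100
```

The temp table (and its index) vanish when the connection closes.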


Regards,
Clemens


[sqlite] Selecting data from WITH clause is very slow

2017-09-11 Thread Frank Millman
Hi all

I am writing an accounting system, and want to select ‘aged balances’ from the 
debtors table.

To do this, I need to assign each outstanding invoice to an ageing ‘bucket’, 
where each bucket contains the sum of invoices where date > start_date and date 
<= end_date. There will be five buckets altogether. In my test database I have 
12 debtors and a few thousand invoices.

I could structure it like this (pseudo code) -

SELECT acc_no, acc_name,
(SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date <= date_5) AS 
120_days, 
(SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_5 AND 
invoice_date <= date_4) AS 90_days, 
(SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_4 AND 
invoice_date <= date_3) AS 60_days, 
(SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_3 AND 
invoice_date <= date_2) AS 30_days, 
(SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_2 AND 
invoice_date <= date_1) AS current
FROM debtors_table

This works, but it requires 5 separate scans of the invoice table, which is 
inefficient.

I changed it to use a WITH clause to make one scan of the invoice table and 
store the results.

I won’t show the WITH clause here, as it is not the reason for the question, 
and would just add clutter. If I run the WITH clause separately, it executes in 
about 0.15 seconds, and generates 60 rows with 3 columns – account number, 
bucket number, and accumulated balance.

In the main clause I now have this -

WITH ageing AS (...)
SELECT acc_no, acc_name,
(SELECT balance FROM ageing WHERE account_number = acc_no AND 
bucket_number = 5) AS 120_days,
(SELECT balance FROM ageing WHERE account_number = acc_no AND 
bucket_number = 4) AS 90_days,
(SELECT balance FROM ageing WHERE account_number = acc_no AND 
bucket_number = 3) AS 60_days,
(SELECT balance FROM ageing WHERE account_number = acc_no AND 
bucket_number = 2) AS 30_days,
(SELECT balance FROM ageing WHERE account_number = acc_no AND 
bucket_number = 1) AS current
FROM debtors_table

It works, but it takes about 3.5 seconds to execute.

Exactly the same exercise on Sql Server takes 0.17 seconds, and on PostgreSQL 
0.22 seconds.

I appreciate that the temporary table created in the WITH clause is not 
indexed, but 3.5 seconds seems a long time to extract the data from 60 rows.

Is there any way to speed this up?

Frank Millman