Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Clemens Ladisch [clem...@ladisch.de]
Sent: Wednesday, September 11, 2013 18:57
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables

>Harmen de Jong - CoachR Group B.V. wrote:
>> http://www.coachrdevelopment.com/share/callstack_tree.html
>>
>> This shows most time is spent on sqlite3CodeRowTriggerDirect.
>
>I'd guess the actual culprit is the loop in getRowTrigger (which does
>not show up because it is inlined):
>
>  /* It may be that this trigger has already been coded (or is in the
>  ** process of being coded). If this is the case, then an entry with
>  ** a matching TriggerPrg.pTrigger field will be present somewhere
>  ** in the Parse.pTriggerPrg list. Search for such an entry. */
>  for(pPrg=pRoot->pTriggerPrg;
>      pPrg && (pPrg->pTrigger!=pTrigger || pPrg->orconf!=orconf);
>      pPrg=pPrg->pNext
>  );

We have put a timer around this 'inlined' function and indeed, as you suggest, it is causing a huge part of the 'overhead': this specific code takes 45.28% of the total time. What it does is keep a list (Parse::pTriggerPrg) of trigger programs that have already been created, and every time a trigger program is created, it searches this list to see whether it already exists. Obviously this list becomes longer as the foreign keys are looped through. This means our earlier assumption that the increase was polynomial because of two nested loops seems to be wrong. So after improving this we still have to find about another 25% ;-).

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 12 Sep 2013, at 07:20, "James K. Lowden" <jklow...@schemamania.org> wrote:

> On Tue, 10 Sep 2013 12:58:21 +0000, Harmen de Jong - CoachR Group B.V. <har...@coachr.com> wrote:
>> I think the way I wrote our timings was not that clear, since they are definitely exponential. The numbers from my previous post refer to the multiplier between the test cases. Just to make it clear, here follow the same tests, but expressed in msec of total time per test.
>>
>> 500 tables - 10 msec in total
>> 1000 tables - 25 msec in total
>> 5000 tables - 298 msec in total
>> 10000 tables - 985 msec in total
>
> I don't know what you mean by "exponentially".
>
> 500 .020 ms/table
> 1000 .025 ms/table
> 5000 .0596 ms/table
> 10000 .0985 ms/table
>
> Linearly, I'd say. It may help to look at it graphically.

Well, actually it is neither (calling it exponential was a mistake on my side). The increase in time is polynomial, where we would expect an increase that is more or less linear.

> http://www.schemamania.org/sqlite/graph.pdf
>
>> we cannot find anything in there that would explain an exponential growth in time.
>
> I doubt you will.

Well, the non-linear (polynomial) increase in time is there, so sooner or later we will find an explanation. In the meantime we have already traced it down further by doing some profiling. You will find the result of this profiling (where time per function is expressed as a percentage of the total time taken) here: http://www.coachrdevelopment.com/share/callstack_tree.html

This shows most time is spent in sqlite3CodeRowTriggerDirect (the second one, where it loops through FKs that point to B). However, the question why this piece of code causes a polynomial increase still remains. The next question is whether it can be improved for use cases with a large number of tables.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On Tue, 10 Sep 2013 12:58:21 +0000, Harmen de Jong - CoachR Group B.V. wrote:

> I think the way I wrote our timings was not that clear, since they
> are definitely exponential. The numbers from my previous post refer
> to the multiplier between the test cases. Just to make it clear, here
> follow the same tests, but expressed in msec of total time per test.
>
> 500 tables - 10 msec in total
> 1000 tables - 25 msec in total
> 5000 tables - 298 msec in total
> 10000 tables - 985 msec in total

I don't know what you mean by "exponentially".

500 .020 ms/table
1000 .025 ms/table
5000 .0596 ms/table
10000 .0985 ms/table

Linearly, I'd say. It may help to look at it graphically.

http://www.schemamania.org/sqlite/graph.pdf

> we cannot find anything in there that would explain an exponential
> growth in time.

I doubt you will.

--jkl
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
Harmen de Jong - CoachR Group B.V. wrote:
> http://www.coachrdevelopment.com/share/callstack_tree.html
>
> This shows most time is spent on sqlite3CodeRowTriggerDirect.

I'd guess the actual culprit is the loop in getRowTrigger (which does not show up because it is inlined):

  /* It may be that this trigger has already been coded (or is in the
  ** process of being coded). If this is the case, then an entry with
  ** a matching TriggerPrg.pTrigger field will be present somewhere
  ** in the Parse.pTriggerPrg list. Search for such an entry. */
  for(pPrg=pRoot->pTriggerPrg;
      pPrg && (pPrg->pTrigger!=pTrigger || pPrg->orconf!=orconf);
      pPrg=pPrg->pNext
  );

Regards, Clemens
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
To get rid of the question of WHERE exactly the time is consumed, we did some profiling on the application that ran the query (using the 10000-tables test DB). As a result, you will find an overview of the time consumed per function (shown as a percentage of the total time) at this link: http://www.coachrdevelopment.com/share/callstack_tree.html

This shows most time is spent in sqlite3CodeRowTriggerDirect. Now the questions remain whether this function is causing the polynomial increase in time, if so WHY, and whether any optimizations are possible. We don't see it yet. Looking forward to any suggestions.

Best regards, Harmen
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 21:24, "E.Pasma" <pasm...@concepts.nl> wrote:

> My suppositions that the time was spent in the execute step and that this has been fixed in the new release appeared both wrong. Thus I may be wrong again, but I think I have an explanation now.
> It is as Simon guesses: a list of lists is being scanned. It is however not the contents of tables being scanned, but the list of foreign key constraints as loaded into memory when a database is opened. When preparing a DELETE statement, the global list of FKs is scanned to see if the current table is referred to. This is the outer loop. If a referring FK is found, and if it has an ON DELETE clause, there comes an inner loop over the same global list to see if the referrer is itself referred to. In the case that every table has such a constraint, as is the case here, the time becomes n * n. If I'm right, this is hard to fix and inherent to the representation of the database schema in memory.
>
> This also means that if you leave out the cascading delete from the constraints, the time becomes linear. Actually that is what I observed before coming up with the above explanation. This was easy to check by extracting the schemas from the test databases and removing ON .. CASCADE. Thanks for making these databases available.

To get rid of the question of WHERE exactly the time is consumed, we did some profiling on the application that ran the query (using the 10000-tables test DB). As a result, you will find an overview of the time consumed per function (shown as a percentage of the total time) at this link: http://www.coachrdevelopment.com/share/callstack_tree.html

This shows most time is spent in sqlite3CodeRowTriggerDirect. Now the questions remain whether this function is causing the polynomial increase in time, if so WHY, and whether any optimizations are possible. We don't see it yet. Looking forward to any suggestions.
Best regards, Harmen
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 21:24, "E.Pasma" wrote:

> My suppositions that the time was spent in the execute step and that this has been fixed in the new release appeared both wrong. Thus I may be wrong again, but I think I have an explanation now.
> It is as Simon guesses: a list of lists is being scanned. It is however not the contents of tables being scanned, but the list of foreign key constraints as loaded into memory when a database is opened. When preparing a DELETE statement, the global list of FKs is scanned to see if the current table is referred to. This is the outer loop. If a referring FK is found, and if it has an ON DELETE clause, there comes an inner loop over the same global list to see if the referrer is itself referred to. In the case that every table has such a constraint, as is the case here, the time becomes n * n. If I'm right, this is hard to fix and inherent to the representation of the database schema in memory.
>
> This also means that if you leave out the cascading delete from the constraints, the time becomes linear. Actually that is what I observed before coming up with the above explanation. This was easy to check by extracting the schemas from the test databases and removing ON .. CASCADE. Thanks for making these databases available.

Your suggestion that, when preparing a DELETE statement, the global list of FKs is scanned to see if the current table is referred to seems to be wrong, as far as we could find. From sqlite3FkActions the method sqlite3FkReferences(pTab) is called; this method uses a hash table (pTab->pSchema->fkeyHash) to know immediately which FKs refer to the given table, without having to loop through a global list of FKs.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 21:24, "E.Pasma" wrote:

> On 10 Sep 2013, at 19:48, Simon Slavin wrote:
>>
>> On 10 Sep 2013, at 4:15pm, Harmen de Jong - CoachR Group B.V. wrote:
>>
>>> That is something we suspected too. We already made some tests where we timed the time needed for all memory allocations executed in the entire operation. In total, for the 10000-tables test, this was somewhere around 25 msec. Since this is just a little overhead, and the instructions as you point out increase linearly, this still does not explain the polynomial increase in preparation time.
>>
>> Polynomial time would be explained if the software is scanning every item in a list of lists for each table mentioned. So a guess about what's happening is that for each DELETE FROM instruction the software has to look up the entry being deleted, and to check that none of the items in any of the other tables refers to it.
>>
>> This makes sense given how FOREIGN KEYs are implemented in SQLite: there is no master 'check-for-changes' table, but each table which has the foreign key has to be scanned. It is irrelevant that there are no entries in any of these tables: the table (actually, the index) still has to be found so that SQLite can figure out that there are no entries in it. Because SQLite hashes table names, rather than scanning a simple list, it is scanning a list of lists. Hence, polynomial time.
>>
>> I see that E. Pasma has posted that the long scanning time has been fixed in a later release. Good.
>>
>> Simon
>
> My suppositions that the time was spent in the execute step and that this has been fixed in the new release appeared both wrong. Thus I may be wrong again, but I think I have an explanation now.
> It is as Simon guesses: a list of lists is being scanned. It is however not the contents of tables being scanned, but the list of foreign key constraints as loaded into memory when a database is opened. When preparing a DELETE statement, the global list of FKs is scanned to see if the current table is referred to. This is the outer loop. If a referring FK is found, and if it has an ON DELETE clause, there comes an inner loop over the same global list to see if the referrer is itself referred to. In the case that every table has such a constraint, as is the case here, the time becomes n * n. If I'm right, this is hard to fix and inherent to the representation of the database schema in memory.
>
> This also means that if you leave out the cascading delete from the constraints, the time becomes linear. Actually that is what I observed before coming up with the above explanation. This was easy to check by extracting the schemas from the test databases and removing ON .. CASCADE. Thanks for making these databases available.

What you are supposing makes sense. Actually, we have already been searching for two loops that are related to each other, since that would explain the polynomial growth, but thus far we cannot find them (we are not yet that familiar with the source code of SQLite). Could you point us to the code where these inner and outer loops are done? The next challenge would of course be to find a solution. Actually, we already found a similar optimization last week that had quite some impact with this many tables; we provided it to the developers and it has been added to the fork right now ;-). Especially since these are quite exceptional use cases, there might be a chance to do some optimizations that seem useless for more normal usage of SQLite. No thanks needed for making these databases available — we would like to thank every one of you who helps point us in the right direction in this thread.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 19:48, Simon Slavin wrote:

> On 10 Sep 2013, at 4:15pm, Harmen de Jong - CoachR Group B.V. wrote:
>
>> That is something we suspected too. We already made some tests where we timed the time needed for all memory allocations executed in the entire operation. In total, for the 10000-tables test, this was somewhere around 25 msec. Since this is just a little overhead, and the instructions as you point out increase linearly, this still does not explain the polynomial increase in preparation time.
>
> Polynomial time would be explained if the software is scanning every item in a list of lists for each table mentioned. So a guess about what's happening is that for each DELETE FROM instruction the software has to look up the entry being deleted, and to check that none of the items in any of the other tables refers to it.
>
> This makes sense given how FOREIGN KEYs are implemented in SQLite: there is no master 'check-for-changes' table, but each table which has the foreign key has to be scanned. It is irrelevant that there are no entries in any of these tables: the table (actually, the index) still has to be found so that SQLite can figure out that there are no entries in it. Because SQLite hashes table names, rather than scanning a simple list, it is scanning a list of lists. Hence, polynomial time.
>
> I see that E. Pasma has posted that the long scanning time has been fixed in a later release. Good.
>
> Simon

My suppositions that the time was spent in the execute step and that this has been fixed in the new release appeared both wrong. Thus I may be wrong again, but I think I have an explanation now.

It is as Simon guesses: a list of lists is being scanned. It is however not the contents of tables being scanned, but the list of foreign key constraints as loaded into memory when a database is opened. When preparing a DELETE statement, the global list of FKs is scanned to see if the current table is referred to. This is the outer loop. If a referring FK is found, and if it has an ON DELETE clause, there comes an inner loop over the same global list to see if the referrer is itself referred to. In the case that every table has such a constraint, as is the case here, the time becomes n * n. If I'm right, this is hard to fix and inherent to the representation of the database schema in memory.

This also means that if you leave out the cascading delete from the constraints, the time becomes linear. Actually that is what I observed before coming up with the above explanation. This was easy to check by extracting the schemas from the test databases and removing ON .. CASCADE. Thanks for making these databases available.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 4:15pm, Harmen de Jong - CoachR Group B.V. wrote:

> That is something we suspected too. We already made some tests where we timed the time needed for all memory allocations executed in the entire operation. In total, for the 10000-tables test, this was somewhere around 25 msec. Since this is just a little overhead, and the instructions as you point out increase linearly, this still does not explain the polynomial increase in preparation time.

Polynomial time would be explained if the software is scanning every item in a list of lists for each table mentioned. So a guess about what's happening is that for each DELETE FROM instruction the software has to look up the entry being deleted, and to check that none of the items in any of the other tables refers to it.

This makes sense given how FOREIGN KEYs are implemented in SQLite: there is no master 'check-for-changes' table, but each table which has the foreign key has to be scanned. It is irrelevant that there are no entries in any of these tables: the table (actually, the index) still has to be found so that SQLite can figure out that there are no entries in it. Because SQLite hashes table names, rather than scanning a simple list, it is scanning a list of lists. Hence, polynomial time.

I see that E. Pasma has posted that the long scanning time has been fixed in a later release. Good.

Simon.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 16:44, "E.Pasma" wrote:

> On 10 Sep 2013, at 16:36, Harmen de Jong - CoachR Group B.V. wrote:
>
>> On 10 Sep 2013, at 16:16, "E.Pasma" wrote:
>>
>>> On 10 Sep 2013, at 11:37, Harmen de Jong - CoachR Group B.V. wrote:
>>>> I included 5 databases that we used for testing in this link: http://wikisend.com/download/570088/test_databases.zip
>>>>
>>>> The query performed on these databases is:
>>>> delete from A where id=1;
>>>
>>> I could not resist trying this but the tables in the databases appear all empty and I get negligible timings. Do I need to run an insert script first?
>>
>> No, it is all about preparing, so there is no need to insert data. When we perform the query "delete from A where id=1;" on the databases from the zip file, we get the following timings:
>>
>> 500 tables - 10 msec in total
>> 1000 tables - 25 msec in total
>> 5000 tables - 298 msec in total
>> 10000 tables - 985 msec in total
>>
>> What we would indeed expect is to have negligible timings. Perhaps our interpretation of negligible differs? Could you please tell me what your timing is on the 10000-tables database?
>>
>> We used the .timer ON command in the SQLite command shell utility as well as timing from our own application that has SQLite integrated.
>
> Clear, and after downgrading to SQLite version 3.7.7.1 I got exactly your timings.
> But with SQLite version 3.7.17 or 3.8.0.1 the timings are negligible.
> Hope this is good news.

Unfortunately not; I forgot to mention that foreign keys have to be turned on, since they are off by default in the SQLite command shell utility, and we found the time-consuming operations are located somewhere in sqlite3FkActions. Could you please test again with "PRAGMA foreign_keys=1;"? Then the time increase is no longer linear.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 16:36, Harmen de Jong - CoachR Group B.V. wrote:

> On 10 Sep 2013, at 16:16, "E.Pasma" wrote:
>
>> On 10 Sep 2013, at 11:37, Harmen de Jong - CoachR Group B.V. wrote:
>>> I included 5 databases that we used for testing in this link: http://wikisend.com/download/570088/test_databases.zip
>>>
>>> The query performed on these databases is:
>>> delete from A where id=1;
>>
>> I could not resist trying this but the tables in the databases appear all empty and I get negligible timings. Do I need to run an insert script first?
>
> No, it is all about preparing, so there is no need to insert data. When we perform the query "delete from A where id=1;" on the databases from the zip file, we get the following timings:
>
> 500 tables - 10 msec in total
> 1000 tables - 25 msec in total
> 5000 tables - 298 msec in total
> 10000 tables - 985 msec in total
>
> What we would indeed expect is to have negligible timings. Perhaps our interpretation of negligible differs? Could you please tell me what your timing is on the 10000-tables database?
>
> We used the .timer ON command in the SQLite command shell utility as well as timing from our own application that has SQLite integrated.

Clear, and after downgrading to SQLite version 3.7.7.1 I got exactly your timings.
But with SQLite version 3.7.17 or 3.8.0.1 the timings are negligible.
Hope this is good news.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
>>> I included 5 databases that we used for testing in this link: http://wikisend.com/download/570088/test_databases.zip
>>>
>>> The query performed on these databases is:
>>> delete from A where id=1;
>>
>> I could not resist trying this but the tables in the databases appear all empty and I get negligible timings. Do I need to run an insert script first?
>
> No, it is all about preparing, so there is no need to insert data. When we perform the query "delete from A where id=1;" on the databases from the zip file, we get the following timings:
>
> 500 tables - 10 msec in total
> 1000 tables - 25 msec in total
> 5000 tables - 298 msec in total
> 10000 tables - 985 msec in total
>
> What we would indeed expect is to have negligible timings. Perhaps our interpretation of negligible differs? Could you please tell me what your timing is on the 10000-tables database?
>
> We used the .timer ON command in the SQLite command shell utility as well as timing from our own application that has SQLite integrated.

The vdbe program is 20 times larger for 10000 tables versus 500 tables (400K instructions for the former versus 20K for the latter), and this appears to be where the time is spent -- generating all the foreign key checking and trigger code. Perhaps the memory allocator has issues when building such a huge vdbe set (I haven't looked at the code, but is it perchance doing a lot of realloc's as more triggers are added?)
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
Sent from my iPad

On 10 Sep 2013, at 17:04, "Keith Medcalf" wrote:

>>> No, it is all about preparing, so there is no need to insert data. When we perform the query "delete from A where id=1;" on the databases from the zip file, we get the following timings:
>>>
>>> 500 tables - 10 msec in total
>>> 1000 tables - 25 msec in total
>>> 5000 tables - 298 msec in total
>>> 10000 tables - 985 msec in total
>>>
>>> What we would indeed expect is to have negligible timings. Perhaps our interpretation of negligible differs? Could you please tell me what your timing is on the 10000-tables database?
>>>
>>> We used the .timer ON command in the SQLite command shell utility as well as timing from our own application that has SQLite integrated.
>
> The vdbe program is 20 times larger for 10000 tables versus 500 tables (400K instructions for the former versus 20K for the latter), and this appears to be where the time is spent -- generating all the foreign key checking and trigger code. Perhaps the memory allocator has issues when building such a huge vdbe set (I haven't looked at the code, but is it perchance doing a lot of realloc's as more triggers are added?)

That is something we suspected too. We already made some tests where we timed the time needed for all memory allocations executed in the entire operation. In total, for the 10000-tables test, this was somewhere around 25 msec. Since this is just a little overhead, and the instructions as you point out increase linearly, this still does not explain the polynomial increase in preparation time.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 16:16, "E.Pasma" wrote:

> On 10 Sep 2013, at 11:37, Harmen de Jong - CoachR Group B.V. wrote:
>> I included 5 databases that we used for testing in this link: http://wikisend.com/download/570088/test_databases.zip
>>
>> The query performed on these databases is:
>> delete from A where id=1;
>
> I could not resist trying this but the tables in the databases appear all empty and I get negligible timings. Do I need to run an insert script first?

No, it is all about preparing, so there is no need to insert data. When we perform the query "delete from A where id=1;" on the databases from the zip file, we get the following timings:

500 tables - 10 msec in total
1000 tables - 25 msec in total
5000 tables - 298 msec in total
10000 tables - 985 msec in total

What we would indeed expect is to have negligible timings. Perhaps our interpretation of negligible differs? Could you please tell me what your timing is on the 10000-tables database?

We used the .timer ON command in the SQLite command shell utility as well as timing from our own application that has SQLite integrated.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 11:37, Harmen de Jong - CoachR Group B.V. wrote:

> I included 5 databases that we used for testing in this link: http://wikisend.com/download/570088/test_databases.zip
>
> The query performed on these databases is:
> delete from A where id=1;

I could not resist trying this but the tables in the databases appear all empty and I get negligible timings. Do I need to run an insert script first?
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 15:41, "Igor Tandetnik" wrote:

> Not exponential - polynomial. Between 500 and 10000 the size of the input increases x20, so a time increase of x400 would be consistent with a quadratic algorithm. Your observed measurements are even better than that.

You're right, thanks for correcting me. However, our basic issue stays the same: we would expect a linear increase while the increase is actually polynomial, and we still can't figure out why the time does not scale linearly in this case.
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 9/10/2013 5:37 AM, Harmen de Jong - CoachR Group B.V. wrote:

> The time factors it takes on each database are as follows (where the time needed for the 500 tables was taken as the starting point to calculate the other factors):
>
> 500 tables - 1x
> 1000 tables - 2.5x
> 5000 tables - 29x
> 10000 tables - 98x
>
> As you can see this is an exponential growth in the time it takes to execute the query.

Not exponential - polynomial. Between 500 and 10000 the size of the input increases x20, so a time increase of x400 would be consistent with a quadratic algorithm. Your observed measurements are even better than that.

Exponential means adding each one new table would cause the running time to be multiplied by some factor. You certainly don't have it *that* bad.
--
Igor Tandetnik
Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 14:43, "E.Pasma" wrote:

> The timings do not look truly exponential to me. It looks more as if there is a graduated charge (NL: staffeltoeslag) on the time per table. For instance:
> tables 1 - 500: 2 msec/table
> tables 501 - 1000: 3 msec/table
> tables 1001 - 5000: 5 msec/table
> tables 5001 - 10000: 10 msec/table
> It could be that there is no further extra charge above 10000 tables. Is it worth trying that, or did you do that already?

I'm sorry, but I think the way I wrote our timings was not that clear, since they are definitely exponential. The numbers from my previous post refer to the multiplier between the test cases. Just to make it clear, here follow the same tests, but expressed in msec of total time per test.

500 tables - 10 msec in total
1000 tables - 25 msec in total
5000 tables - 298 msec in total
10000 tables - 985 msec in total

> Your last observation, that the time is spent in sqlite3FkActions, means that this is not in query preparation but in execution (if I understand the code right). That is understandable, because the engine needs to perform a vast amount of internal queries to check all the related tables.

sqlite3FkActions is part of the call stack created by sqlite3_prepare_v2, so I think it actually is part of the prepare. The execution time of sqlite3_step is actually negligible (< 1 msec) in our test cases, because nothing is actually deleted; all tables are empty in the test case.

> This leaves room for the hypothesis that the observed increase is a matter of caching. For instance, the SQLite default page cache may just fit 2000 tables but not 10000. It may be worth trying an increased cache_size.

We already have a very high cache_size set in our test cases: 819200, so I think that can be ruled out too.
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 10 Sep 2013, at 11:37, Harmen de Jong - CoachR Group B.V. wrote:

> On 9 sep. 2013, at 22:11, "E.Pasma" wrote:
>> Ha, I did not mean the length of the names but the length of the hash table
>> (NL: klutstabel), that is, the number of buckets over which the hash values
>> are distributed. I looked some further in the code and now believe that this
>> is 1024. That is still generous for a normal database, but with 10.000
>> tables you will have about 10 table names in each bucket. So the engine
>> will need to do a lot of searching within each bucket. That involves
>> checking the name of each item in the bucket against the searched name.
>
> We've looked into that code. Actually the 1024 is not the number of buckets used for the hash table, but the total size. So we did some tests, changing the 1024 bytes into a value large enough to store the hash table for 10K tables, but unfortunately that gave a performance increase of 'just' 5 percent (which is at least something, but still does not get us to the point where it is fast enough for our usage, and does not prevent an exponential growth in time as the number of tables increases).
>
> Still we have no idea why the query preparation time increases exponentially instead of linearly, which is what we would expect since it is using a hash table. I included 5 databases that we used for testing in this link:
>
> http://wikisend.com/download/570088/test_databases.zip
>
> The query performed on these databases is:
>
> delete from A where id=1;
>
> The time factors it takes on each database are as follows (where the time needed for the 500 tables was taken as the starting point to calculate the other factors):
>
> 500 tables - 1x
> 1000 tables - 2.5x
> 5000 tables - 29x
> 10000 tables - 98x
>
> As you can see this is an exponential growth in the time it takes to execute the query. So far we're missing the point of why this growth should be exponential. ...
> We tried some further debugging and it seems that SQLite spends its time in "sqlite3FkActions", though we cannot find anything in there that would explain an exponential growth in time.

The timings do not look truly exponential to me. It looks more as if there is a graduated charge (NL: staffeltoeslag) on the time per table. For instance:

table 1 - 500: 2 msec/table
table 501 - 1.000: 3 msec/table
table 1001 - 5.000: 5 msec/table
table 5001 - 10.000: 10 msec/table

It could be that there is no further extra charge above 10.000 tables. Is it worth trying that or did you do that already?

Your last observation, that the time is spent in sqlite3FKActions, means that this is not in query preparation but in execution (if I understand the code right). That is understandable because the engine needs to perform a vast amount of internal queries to check all the related tables.

This leaves room for the hypothesis that the observed increase is a matter of caching. For instance the sqlite default page cache may just fit 2.000 tables but not 10.000. It may be worth trying an increased cache_size.
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 10 sep. 2013, at 11:37, "Harmen de Jong - CoachR Group B.V." wrote:

> As you can see this is an exponential growth in the time it takes to execute
> the query. So far we're missing the point of why this growth should be
> exponential.

We tried some further debugging and it seems that SQLite spends its time in "sqlite3FkActions", though we cannot find anything in there that would explain an exponential growth in time.
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 9 sep. 2013, at 22:11, "E.Pasma" wrote:

> Ha, I did not mean the length of the names but the length of the hash table
> (NL: klutstabel), that is, the number of buckets over which the hash values
> are distributed. I looked some further in the code and now believe that this
> is 1024. That is still generous for a normal database, but with 10.000
> tables you will have about 10 table names in each bucket. So the engine
> will need to do a lot of searching within each bucket. That involves
> checking the name of each item in the bucket against the searched name.

We've looked into that code. Actually the 1024 is not the number of buckets used for the hash table, but the total size. So we did some tests, changing the 1024 bytes into a value large enough to store the hash table for 10K tables, but unfortunately that gave a performance increase of 'just' 5 percent (which is at least something, but still does not get us to the point where it is fast enough for our usage, and does not prevent an exponential growth in time as the number of tables increases).

Still we have no idea why the query preparation time increases exponentially instead of linearly, which is what we would expect since it is using a hash table. I included 5 databases that we used for testing in this link:

http://wikisend.com/download/570088/test_databases.zip

The query performed on these databases is:

delete from A where id=1;

The time factors it takes on each database are as follows (where the time needed for the 500 tables was taken as the starting point to calculate the other factors):

500 tables - 1x
1000 tables - 2.5x
5000 tables - 29x
10000 tables - 98x

As you can see this is an exponential growth in the time it takes to execute the query. So far we're missing the point of why this growth should be exponential.
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 9 Sep 2013, at 10:06, Harmen de Jong - CoachR Group B.V. wrote:

> Our table and column names are not too long either as E.Pasma suggests.

Ha, I did not mean the length of the names but the length of the hash table (NL: klutstabel), that is, the number of buckets over which the hash values are distributed. I looked some further in the code and now believe that this is 1024. That is still generous for a normal database, but with 10.000 tables you will have about 10 table names in each bucket. So the engine will need to do a lot of searching within each bucket. That involves checking the name of each item in the bucket against the searched name.

With regards to the number of buckets I found a meaningful comment in the code:

/* The inability to allocates space for a larger hash table is
** a performance hit but it is not a fatal error. So mark the
** allocation as a benign. */

Another generosity is that the code uses a constant for the length of the hash tables, so you can change that with a single edit. There are of course advantages and disadvantages of doing that.
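[Editor's note] E.Pasma's back-of-the-envelope estimate is easy to reproduce with a toy fixed-bucket chained hash table. This is only a sketch of the general technique, not SQLite's actual code: with a fixed bucket count, every chain grows linearly with the number of tables, so each name lookup performs linearly more comparisons, and looking up all N names costs O(N^2) overall.

```python
BUCKETS = 1024  # analogous to the fixed constant discussed above

def build(names, buckets=BUCKETS):
    # Chained hash table: each bucket holds a list of colliding names.
    table = [[] for _ in range(buckets)]
    for name in names:
        table[hash(name) % buckets].append(name)
    return table

def comparisons(table, name, buckets=BUCKETS):
    # Number of name comparisons a successful lookup performs.
    chain = table[hash(name) % buckets]
    return chain.index(name) + 1

names = [f"table_{i}" for i in range(10_000)]
t = build(names)
avg = sum(comparisons(t, n) for n in names) / len(names)
# With ~10 names per bucket, the average lookup walks about half a chain.
print(f"average comparisons per lookup: {avg:.1f}")
```

Doubling the bucket count halves the average chain, which is consistent with resizing helping only by a constant factor, as reported above, when the real bottleneck lies elsewhere.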
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 7 sep. 2013, at 04:45, "David de Regt" wrote:

> Mayhaps the CROSS JOIN trick is your friend in this case, if you can be
> pretty sure of the correct direction of the join order. :)

In the examples I gave it was actually about a simple delete query from one table without any joins, so that still does not explain the huge performance decrease.

> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of James K. Lowden
> Sent: Friday, September 6, 2013 7:40 PM
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Query preperation time does not scale linearly with
> growth of no. of tables
>
> Factorial, actually. After three tables, each additional table increases
> potential join sequences by roughly an order of magnitude.

This factorial increase would be the case if there were any joins, but there are none. Does anyone have any other explanation for why the performance gets so bad when having many tables? Our table and column names are not too long either, as E.Pasma suggests.

Best regards,
Harmen
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
The developers chose the C type 'int' to represent the hash value. Possibly this is too small for your case?

On 6 Sep 2013, at 22:00, Harmen de Jong - CoachR Group B.V. wrote:

> On 6 sep. 2013, at 20:09, "Kevin Benson" wrote:
>> Dr. Hipp does a little bit of explaining on this topic, generally, in his
>> replies on this thread:
>>
>> http://www.mail-archive.com/sqlite-users@sqlite.org/msg78602.html
>
> Thanks for pointing me to that thread, but as Dr. Hipp states in this thread, the tables are stored in a hash. Therefore I would not expect a large performance decrease on a large number of tables at all, or am I missing something?
>
> Best regards,
> Harmen
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
Mayhaps the CROSS JOIN trick is your friend in this case, if you can be pretty sure of the correct direction of the join order. :)

-David

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of James K. Lowden
Sent: Friday, September 6, 2013 7:40 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables

On Fri, 6 Sep 2013 17:29:25 +0000 Harmen de Jong - CoachR Group B.V. wrote:

> > If I recall correctly, query planner's behavior is worst-case
> > quadratic in the number of tables participating in the query. This
> > includes tables mentioned directly, and also those pulled in
> > indirectly via views, triggers or foreign keys.

Factorial, actually. After three tables, each additional table increases potential join sequences by roughly an order of magnitude. Given tables A, B, and C, 1 * 2 * 3 = 6:

sqlite> select a.T, b.T, c.T from F a join F b on a.T <> b.T join F c
   ...> on b.T <> c.T where a.T <> c.T order by a.T, b.T, c.T;
A B C
A C B
B A C
B C A
C A B
C B A

That's six plans for the order in which the system could choose to access the tables to execute the query. Factorial grows quickly, as is demonstrated by adding table D:

sqlite> select a.T, b.T, c.T, d.T from F a join F b on a.T <> b.T
   ...> cross join F c on b.T <> c.T join F as d on c.T <> d.T where a.T <>
   ...> c.T and a.T <> d.T and b.T <> d.T order by a.T, b.T, c.T, d.T;
A B C D
A B D C
A C B D
A C D B
A D B C
A D C B
B A C D
B A D C
B C A D
B C D A
B D A C
B D C A
C A B D
C A D B
C B A D
C B D A
C D A B
C D B A
D A B C
D A C B
D B A C
D B C A
D C A B
D C B A

Pity the query optimizer facing an 8-way join. Or, say, a 20-table join:

$ FACT=1; seq 20 | while read F; do FACT=$(( ${FACT} * $F )); printf '% 3d! = %d\n' $F ${FACT}; done
  1! = 1
  2! = 2
  3! = 6
  4! = 24
  5! = 120
  6! = 720
  7! = 5040
  8! = 40320
  9! = 362880
 10! = 3628800
 11! = 39916800
 12! = 479001600
 13! = 6227020800
 14! = 87178291200
 15! = 1307674368000
 16! = 20922789888000
 17! = 355687428096000
 18! = 6402373705728000
 19! = 121645100408832000
 20! = 2432902008176640000

There is such a thing as too many choices!

--jkl
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On Fri, 6 Sep 2013 17:29:25 +0000 Harmen de Jong - CoachR Group B.V. wrote:

> > If I recall correctly, query planner's behavior is worst-case
> > quadratic in the number of tables participating in the query. This
> > includes tables mentioned directly, and also those pulled in
> > indirectly via views, triggers or foreign keys.

Factorial, actually. After three tables, each additional table increases potential join sequences by roughly an order of magnitude. Given tables A, B, and C, 1 * 2 * 3 = 6:

sqlite> select a.T, b.T, c.T from F a join F b on a.T <> b.T
   ...> join F c on b.T <> c.T where a.T <> c.T order by a.T, b.T, c.T;
A B C
A C B
B A C
B C A
C A B
C B A

That's six plans for the order in which the system could choose to access the tables to execute the query. Factorial grows quickly, as is demonstrated by adding table D:

sqlite> select a.T, b.T, c.T, d.T from F a join F b on a.T <> b.T
   ...> cross join F c on b.T <> c.T join F as d on c.T <> d.T where
   ...> a.T <> c.T and a.T <> d.T and b.T <> d.T order by a.T, b.T,
   ...> c.T, d.T;
A B C D
A B D C
A C B D
A C D B
A D B C
A D C B
B A C D
B A D C
B C A D
B C D A
B D A C
B D C A
C A B D
C A D B
C B A D
C B D A
C D A B
C D B A
D A B C
D A C B
D B A C
D B C A
D C A B
D C B A

Pity the query optimizer facing an 8-way join. Or, say, a 20-table join:

$ FACT=1; seq 20 | while read F; do FACT=$(( ${FACT} * $F )); printf '% 3d! = %d\n' $F ${FACT}; done
  1! = 1
  2! = 2
  3! = 6
  4! = 24
  5! = 120
  6! = 720
  7! = 5040
  8! = 40320
  9! = 362880
 10! = 3628800
 11! = 39916800
 12! = 479001600
 13! = 6227020800
 14! = 87178291200
 15! = 1307674368000
 16! = 20922789888000
 17! = 355687428096000
 18! = 6402373705728000
 19! = 121645100408832000
 20! = 2432902008176640000

There is such a thing as too many choices!

--jkl
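[Editor's note] The enumerations above are simply permutations of the join set; a few lines make the factorial growth concrete (same counts as the sqlite output quoted in the message):

```python
from itertools import permutations
from math import factorial

# Join orders for N tables = N! - the counts behind the listings above.
print(len(list(permutations("ABC"))))   # 6 orders for tables A, B, C
print(len(list(permutations("ABCD"))))  # 24 orders after adding table D
print(factorial(20))                    # 2432902008176640000 for a 20-way join
```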
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
Two things:

1. The longer the table names, the longer it will take to compute the hash of each table name.

2. Because the entire schema must be reprocessed after each change, all the table names will be rehashed after each table has been created. Creating 10,000 tables will result in re-reading all that data and re-hashing all the table names. After adding the 10,000th table, SQLite will have computed at least 50,005,000 hash operations. Many more if column names are hashed too.

SDR

On Sep 6, 2013 2:00 PM, "Harmen de Jong - CoachR Group B.V." <har...@coachr.com> wrote:

> On 6 sep. 2013, at 20:09, "Kevin Benson" wrote:
>> Dr. Hipp does a little bit of explaining on this topic, generally, in his
>> replies on this thread:
>>
>> http://www.mail-archive.com/sqlite-users@sqlite.org/msg78602.html
>
> Thanks for pointing me to that thread, but as Dr. Hipp states in this
> thread, the tables are stored in a hash. Therefore I would not expect a
> large performance decrease on a large number of tables at all, or am I
> missing something?
>
> Best regards,
> Harmen
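[Editor's note] The 50,005,000 figure follows from the triangular sum: after creating table k, the schema reparse re-hashes all k existing names, so the running total over 10,000 tables is 1 + 2 + ... + 10000. A quick check:

```python
# Cumulative re-hash count after each of the 10,000 CREATE TABLE statements.
total = sum(range(1, 10_001))
print(total)  # 50005000, i.e. n*(n+1)/2 with n = 10000
assert total == 10_000 * 10_001 // 2
```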
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 6 Sep 2013, at 9:00pm, Harmen de Jong - CoachR Group B.V. wrote:

> as dr. Hipp states in this thread, the tables are stored in a hash. Therefore
> I would not expect a large performance decrease on large number of tables at
> all, or am I missing something?

The /names/ of the tables are hashed. Not the data in the tables, or the list of page numbers that the data is stored in.

Simon.
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 6 sep. 2013, at 20:09, "Kevin Benson" wrote:

> Dr. Hipp does a little bit of explaining on this topic, generally, in his
> replies on this thread:
>
> http://www.mail-archive.com/sqlite-users@sqlite.org/msg78602.html

Thanks for pointing me to that thread, but as Dr. Hipp states in this thread, the tables are stored in a hash. Therefore I would not expect a large performance decrease on a large number of tables at all, or am I missing something?

Best regards,
Harmen
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On Fri, Sep 6, 2013 at 1:29 PM, Harmen de Jong - CoachR Group B.V. <har...@coachr.com> wrote:

> On 6 sep. 2013, at 18:42, "Igor Tandetnik" wrote:
>> If I recall correctly, query planner's behavior is worst-case quadratic
>> in the number of tables participating in the query. This includes tables
>> mentioned directly, and also those pulled in indirectly via views, triggers
>> or foreign keys.
>
> Ok, maybe that explains it. Do you remember some topics or explanations
> that cover this more in depth? At least it will be worth it for us to do some
> debugging on the SQLite code to see if we can pinpoint this behaviour and
> see if we can work around it.

Dr. Hipp does a little bit of explaining on this topic, generally, in his replies on this thread:

http://www.mail-archive.com/sqlite-users@sqlite.org/msg78602.html

--
--Ô¿Ô--
K e V i N
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 6 sep. 2013, at 18:42, "Igor Tandetnik" wrote:

> If I recall correctly, query planner's behavior is worst-case quadratic in
> the number of tables participating in the query. This includes tables
> mentioned directly, and also those pulled in indirectly via views, triggers
> or foreign keys.

Ok, maybe that explains it. Do you remember some topics or explanations that cover this more in depth? At least it will be worth it for us to do some debugging on the SQLite code to see if we can pinpoint this behaviour and see if we can work around it.

> If I may be so bold, I would say that a design that calls for a database with
> 10,000 tables doesn't feel right

The main reason for this is that the application is used to do research on and examine the quality of processes that take place within organizations. For this research the users can create highly customized research forms with varying numbers of columns, varying field types and a lot of specific restrictions, which we store in tables that are generated on the fly with the proper indexes (later on the research results are stored in these tables and used for real-time dashboard reporting and for downloading reports that are generated with the latest data).

Furthermore, the data that is examined and imported into the application contains a lot of privacy-sensitive data. Since we already create these separate tables, we use this as a sort of 'automated' security wall to rule out the risk of users from different organizations reaching or hacking into each other's privacy-sensitive data in case programming weaknesses or mistakes are made.
For this project we went on board with SQLite (which we already used in several other projects) and have built our own 'SQL server' app that takes care of things like:

- Asynchronous real-time backups to a secondary location (a sort of binary logging as found in other DBs, though our implementation is not binary).
- Automatically splitting time-intensive writes and updates, to prevent these from locking the app (since SQLite has DB-level locking) for users that perform simple tasks that can be completed fast.
- Ruling out database startup times in the case of application crashes: since the DB is several gigabytes in size, it takes SQLite a few seconds to 'initialize', and this 'stable' server app thereby hides application crashes from the users (who access the app through a web interface).
- Automatically backing up the DB in compressed and encrypted format.

Best regards,
Harmen
Re: [sqlite] Query preperation time does not scale linearly with growth of no. of tables
On 9/6/2013 10:35 AM, Harmen de Jong - CoachR Group B.V. wrote:

> We ran into an issue where specific queries are getting non-linearly slower
> when the total number of tables grows.

If I recall correctly, query planner's behavior is worst-case quadratic in the number of tables participating in the query. This includes tables mentioned directly, and also those pulled in indirectly via views, triggers or foreign keys.

If I may be so bold, I would say that a design that calls for a database with 10,000 tables doesn't feel right to me.

--
Igor Tandetnik
[sqlite] Query preperation time does not scale linearly with growth of no. of tables
We ran into an issue where specific queries are getting non-linearly slower as the total number of tables grows.

Example 1 (not slow):
- Database has table A
- Database has 1,000 other tables with a foreign key to table A
- A row is deleted from table A (no deletion of actual data in other tables is involved).

Example 2 (more than 50 times slower than example 1):
- Database has table A
- Database has 10,000 other tables with a foreign key to table A
- A row is deleted from table A (no deletion of actual data in other tables is involved).

Does anybody have an idea why the preparation time of the query scales up so fast in the above examples?

Best regards,
Harmen de Jong
CoachR Group B.V.
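[Editor's note] The two examples can be reproduced with a short script. The schema and names below are assumptions for illustration, not the poster's actual database: create N empty child tables with an ON DELETE CASCADE foreign key to A, enable foreign key enforcement, and time the DELETE. With all tables empty, the elapsed time is dominated by statement preparation, as discussed later in the thread.

```python
import sqlite3
import time

def delete_time(n_tables):
    """Seconds to run `DELETE FROM A WHERE id=1` with n_tables FK children."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys=ON")
    db.execute("CREATE TABLE A(id INTEGER PRIMARY KEY)")
    for i in range(n_tables):
        # Each child table pulls A into the FK-action code generation.
        db.execute(f"CREATE TABLE t{i}(id INTEGER PRIMARY KEY,"
                   f" a_id INTEGER REFERENCES A(id) ON DELETE CASCADE)")
    start = time.perf_counter()
    db.execute("DELETE FROM A WHERE id=1")  # tables are empty: mostly prepare
    return time.perf_counter() - start

for n in (100, 500, 1000):
    print(n, f"{delete_time(n) * 1000:.1f} msec")
```

If the growth were linear, the printed times would scale with N; the thread's measurements show them growing considerably faster.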