On 10 Sep 2013, at 11:37, Harmen de Jong - CoachR Group B.V. wrote:

On 9 Sep 2013, at 22:11, "E.Pasma" <pasm...@concepts.nl> wrote:
Ha, I did not mean the length of the names but the size of the hash table, that is, the number of buckets over which the hash values are distributed. I looked further into the code and now believe that this is 1024. That is still generous for a normal database, but with 10,000 tables you will have about 10 table names in each bucket. So the engine will need to do a lot of searching within each bucket, which involves checking the name of each item in the bucket against the searched name.

We've looked into that code. Actually, the 1024 is not the number of buckets used for the hash table, but its total size in bytes. So we did some tests, changing the 1024 bytes into a value large enough to store the hash table for 10K tables. Unfortunately that gave a performance improvement of 'just' 5 percent, which is at least something, but it still does not get us to the point where it is fast enough for our usage, and it does not prevent the exponential growth in time as the number of tables increases.
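For reference, this is the general shape of such a lookup (a simplified sketch of a chained hash, not SQLite's actual hash.c; and if the 1024 we patched is the SQLITE_MALLOC_SOFT_LIMIT default, it can presumably also be raised at compile time with -DSQLITE_MALLOC_SOFT_LIMIT=...):

    #include <string.h>

    #define NBUCKET 128  /* hypothetical: 1024 bytes / 8-byte pointers */

    typedef struct Elem Elem;
    struct Elem {
      const char *zName;   /* table name */
      void *pSchema;       /* the object stored under that name */
      Elem *pNext;         /* next entry in the same bucket */
    };

    typedef struct Hash { Elem *aBucket[NBUCKET]; } Hash;

    static unsigned hashString(const char *z){
      unsigned h = 0;
      while( *z ) h = h*31 + (unsigned char)*z++;
      return h;
    }

    /* With N names spread over NBUCKET buckets, an average lookup does
    ** N/NBUCKET string compares: ~78 at 10,000 tables, and a statement
    ** triggers many such lookups.  A bigger bucket array shortens the
    ** chains, but does not help if the time is really spent elsewhere. */
    static void *hashFind(Hash *pH, const char *zName){
      Elem *p = pH->aBucket[hashString(zName) % NBUCKET];
      while( p && strcmp(p->zName, zName)!=0 ) p = p->pNext;
      return p ? p->pSchema : 0;
    }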

Still, we have no idea why the query preparation time increases exponentially instead of linearly, which is what we would expect since it is using a hash table.

I included 5 databases that we used for testing in this link: 
http://wikisend.com/download/570088/test_databases.zip

The query performed on these databases is:
delete from A where id=1;

The time factors for each database are as follows (the time needed for the 500-table database was taken as the baseline for calculating the other factors):
500 tables - 1x
1000 tables - 2.5x
5000 tables - 29x
10000 tables - 98x

As you can see, this is an exponential growth in the time it takes to execute the query. So far we're missing the point of why this growth should be exponential.
...
We tried some further debugging and it seems that SQLite spends its time in "sqlite3FkActions", though we cannot find anything in there that would explain an exponential growth in time.
___________


The timings do not look truly exponential to me. It looks more as if there is a graduated charge on the time per table. For instance:
tables 1 - 500: 2 msec/table
tables 501 - 1,000: 3 msec/table
tables 1,001 - 5,000: 5 msec/table
tables 5,001 - 10,000: 10 msec/table
It could be that there is no further extra charge above 10,000 tables. Is it worth trying that, or did you do that already?
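To put exact numbers on that staircase: normalizing the 500-table time to 1000 msec (an arbitrary unit, since only factors were reported), the marginal cost works out as 1000/500 = 2.0 msec/table for tables 1 - 500, (2500 - 1000)/500 = 3.0 for tables 501 - 1,000, (29000 - 2500)/4000 = 6.6 for tables 1,001 - 5,000, and (98000 - 29000)/5000 = 13.8 for tables 5,001 - 10,000. So the cost per extra table roughly doubles at each step, which is far flatter than a true exponential in the total.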

Your last observation, that the time is spent in sqlite3FkActions, means that this is not in query preparation but in execution (if I understand the code right). That is understandable, because the engine needs to perform a vast number of internal queries to check all the related tables.
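If it helps, the effect should be reproducible outside your application with a small test program along these lines (a sketch only: it assumes, hypothetically, that the child tables reference A with ON DELETE CASCADE, and error checking is omitted; varying nChild should show the same curve as your test databases):

    #include <stdio.h>
    #include <time.h>
    #include <sqlite3.h>

    int main(void){
      sqlite3 *db;
      char zSql[200];
      int i, nChild = 10000;   /* vary: 500, 1000, 5000, 10000 */
      clock_t t0, t1;

      sqlite3_open(":memory:", &db);
      /* FK actions only run when foreign keys are enabled */
      sqlite3_exec(db, "PRAGMA foreign_keys=ON;", 0, 0, 0);
      sqlite3_exec(db, "CREATE TABLE A(id INTEGER PRIMARY KEY);", 0, 0, 0);
      sqlite3_exec(db, "INSERT INTO A VALUES(1);", 0, 0, 0);

      /* hypothetical schema: every child table references A */
      sqlite3_exec(db, "BEGIN;", 0, 0, 0);
      for(i=0; i<nChild; i++){
        sqlite3_snprintf(sizeof(zSql), zSql,
          "CREATE TABLE c%d(id INTEGER REFERENCES A(id) ON DELETE CASCADE);", i);
        sqlite3_exec(db, zSql, 0, 0, 0);
      }
      sqlite3_exec(db, "COMMIT;", 0, 0, 0);

      t0 = clock();
      sqlite3_exec(db, "DELETE FROM A WHERE id=1;", 0, 0, 0);
      t1 = clock();
      printf("%d child tables: %.0f msec\n",
             nChild, (t1 - t0) * 1000.0 / CLOCKS_PER_SEC);
      sqlite3_close(db);
      return 0;
    }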

This leaves room for the hypothesis that the observed increase is a matter of caching. For instance, the SQLite default page cache may just fit 2,000 tables but not 10,000. It may be worth trying an increased cache_size.
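For example, before running the delete (the value 20000 is just a guess; cache_size is counted in pages, and the compiled-in default is in the order of 2000 pages):

    /* enlarge the page cache; 20000 pages is just a guess */
    sqlite3_exec(db, "PRAGMA cache_size=20000;", 0, 0, 0);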
