On 10 Sep 2013, at 11:37, Harmen de Jong - CoachR Group B.V. wrote:
On 9 Sep 2013, at 22:11, "E.Pasma" <pasm...@concepts.nl> wrote:
Ha, I did not mean the length of the names but the length of the
hash table, that is, the number of buckets over which the hash
values are distributed. I looked further into the code and now
believe that this is 1024. That is still generous for a normal
database, but with 10,000 tables you will have about 10 table names
in each bucket. So the engine will need to do a lot of searching
within each bucket. That involves comparing the name of each item
in the bucket against the searched name.
We've looked into that code. Actually the 1024 is not the number of
buckets used for the hash table, but its total size. So we did some
tests, changing the 1024 bytes into a value large enough to store
the hash table for 10K tables, but unfortunately that gave a
performance increase of 'just' 5 percent (which is at least
something, but still does not get us to the point where it is fast
enough for our usage, and does not prevent an exponential growth in
time as the number of tables increases).
Still, we have no idea why the query preparation time increases
exponentially instead of linearly, which is what we would expect
since it is using a hash table.
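For intuition, here is a generic sketch (this is not SQLite's actual
hash.c, and the 1024 buckets and "tableN" names are just the guess
from above) of why a fixed bucket count makes each lookup degrade
linearly with the number of entries, so that resolving all n names
costs on the order of n^2/buckets. In coarse timings that is easily
mistaken for exponential growth:

    /* Fixed-bucket hash table sketch: counts string compares per
     * lookup as the entry count grows while the bucket count stays
     * at 1024. Not SQLite code. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define BUCKETS 1024

    typedef struct Entry { char name[24]; struct Entry *next; } Entry;

    static unsigned hash(const char *s){
        unsigned h = 0;
        while (*s) h = h*31 + (unsigned char)*s++;
        return h % BUCKETS;
    }

    int main(void){
        static const int sizes[] = {500, 1000, 5000, 10000};
        for (int t = 0; t < 4; t++){
            int n = sizes[t];
            Entry **table = calloc(BUCKETS, sizeof *table);
            for (int i = 0; i < n; i++){            /* insert n names */
                Entry *e = malloc(sizeof *e);
                snprintf(e->name, sizeof e->name, "table%d", i);
                unsigned h = hash(e->name);
                e->next = table[h];
                table[h] = e;
            }
            long cmps = 0;                          /* look each one up */
            for (int i = 0; i < n; i++){
                char key[24];
                snprintf(key, sizeof key, "table%d", i);
                for (Entry *e = table[hash(key)]; e; e = e->next){
                    cmps++;
                    if (strcmp(e->name, key) == 0) break;
                }
            }
            printf("%5d entries: %.1f compares per lookup\n",
                   n, (double)cmps / n);
            free(table);            /* entries leak; fine for a demo */
        }
        return 0;
    }

Note that this alone predicts quadratic, not exponential, growth.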
I included the 5 databases that we used for testing at this link:
http://wikisend.com/download/570088/test_databases.zip
The query performed on these databases is:
delete from A where id=1;
The time factors on each database are as follows (the time needed
for the 500-table database was taken as the baseline for
calculating the other factors):
500 tables - 1x
1000 tables - 2.5x
5000 tables - 29x
10000 tables - 98x
As you can see, the time it takes to execute the query grows
exponentially. So far we fail to see why this growth should be
exponential.
...
We tried some further debugging and it seems that SQLite spends its
time in "sqlite3FkActions", though we cannot find anything in there
that would explain an exponential growth in time.
___________
The timings do not look truly exponential to me. It looks more as if
there is a graduated surcharge on the time per table. For instance:
tables 1 - 500: 2 msec/table
tables 501 - 1,000: 3 msec/table
tables 1,001 - 5,000: 5 msec/table
tables 5,001 - 10,000: 10 msec/table
It could be that there is no further surcharge above 10,000 tables.
Is it worth trying that, or did you do that already?
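As a quick check of this reading, the per-tier cost can be derived
from your reported factors. The 1000 msec baseline below is an
assumption made only to get concrete numbers; computed exactly, the
upper tiers come out a little steeper than my rounded figures above,
but the pattern is the same: the cost per table grows roughly in
proportion to the number of tables, which points at quadratic rather
than exponential total time.

    /* Marginal cost per added table, derived from the reported
     * time factors. Baseline of 1000 msec is assumed. */
    #include <stdio.h>

    int main(void){
        const int    tables[] = {500, 1000, 5000, 10000};
        const double factor[] = {1.0, 2.5, 29.0, 98.0};  /* measured */
        const double base_ms  = 1000.0;                  /* assumed  */
        double prev_ms = 0.0;
        int prev_n = 0;
        for (int i = 0; i < 4; i++){
            double total_ms = factor[i] * base_ms;
            printf("tables %5d - %5d: %4.1f msec/table\n",
                   prev_n + 1, tables[i],
                   (total_ms - prev_ms) / (tables[i] - prev_n));
            prev_ms = total_ms;
            prev_n  = tables[i];
        }
        return 0;
    }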
Your last observation, that the time is spent in sqlite3FkActions,
means that this is not in query preparation but in execution (if I
understand the code right). That is understandable, because the
engine needs to perform a vast amount of internal queries to check
all the related tables.
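For what it is worth, here is a minimal sketch of how one could
reproduce that effect locally. It assumes a schema in which every
table carries a foreign key referencing A; I have not inspected the
linked test databases, so the schema here is a guess:

    /* With foreign_keys=ON, a DELETE on A must consult every
     * referencing table. Build with: cc fk_test.c -lsqlite3 */
    #include <stdio.h>
    #include <time.h>
    #include <sqlite3.h>

    #define N_TABLES 5000       /* vary: 500, 1000, 5000, 10000 */

    int main(void){
        sqlite3 *db;
        char sql[128];
        sqlite3_open(":memory:", &db);
        sqlite3_exec(db, "PRAGMA foreign_keys=ON;", 0, 0, 0);
        sqlite3_exec(db, "CREATE TABLE A(id INTEGER PRIMARY KEY);", 0, 0, 0);
        sqlite3_exec(db, "INSERT INTO A VALUES(1);", 0, 0, 0);
        sqlite3_exec(db, "BEGIN;", 0, 0, 0);
        for (int i = 0; i < N_TABLES; i++){
            snprintf(sql, sizeof sql,
                "CREATE TABLE t%d(id INTEGER PRIMARY KEY,"
                " a_id INTEGER REFERENCES A(id));", i);
            sqlite3_exec(db, sql, 0, 0, 0);
        }
        sqlite3_exec(db, "COMMIT;", 0, 0, 0);

        clock_t t0 = clock();   /* time only the delete itself */
        sqlite3_exec(db, "DELETE FROM A WHERE id=1;", 0, 0, 0);
        printf("%d referencing tables: %.1f msec\n", N_TABLES,
               1000.0 * (clock() - t0) / CLOCKS_PER_SEC);
        sqlite3_close(db);
        return 0;
    }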
This leaves room for the hypothesis that the observed increase is a
matter of caching. For instance, the SQLite default page cache may
just fit 2,000 tables but not 10,000. It may be worth trying an
increased cache_size.
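That is cheap to test: in a harness like the sketch above, one extra
line before the DELETE raises the page cache (100000 pages is an
arbitrary illustrative value, not a recommendation):

    /* Raise the page cache before timing the DELETE. */
    sqlite3_exec(db, "PRAGMA cache_size=100000;", 0, 0, 0);

Comparing the timings with and without it should show whether the
page cache is the bottleneck.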