-----Original Message-----
Date: Tue, 10 Sep 2013 15:15:35 +0000
From: Harmen de Jong - CoachR Group B.V. <har...@coachr.com>
To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
Subject: Re: [sqlite] Query preparation time does not scale linearly with growth of no. of tables
Message-ID: <c06aa165-6904-43f6-8d33-6a0f2c99f...@coachr.com>
Content-Type: text/plain; charset="us-ascii"

[snip]
That [memory being re-allocated] is something we suspected too. We already ran 
some tests in which we timed all of the memory allocations executed over the 
entire operation. For the 10,000-table test the total was somewhere around 
25 msec. Since that is only a small overhead, and the instruction count, as 
you point out, grows linearly, it still does not explain the polynomial 
increase in preparation time.

[J. Merrill's comment below]
When there's a re-allocation, it's not just the time to allocate the new 
memory -- there's also the time to copy the data from the original 
not-big-enough block to the new big-enough-for-now block. The amount of data 
moved grows rapidly with each re-allocation, and the pattern of extra time you 
are seeing looks similar to the growth in data movement when there are 
repeated allocate-and-copy steps. How many re-allocations happen?

Do you know by how much the allocation size increases each time a re-allocation 
is needed? (A common algorithm is to double the size each time.) You might try 
changing the source code so that it triples or quadruples the size each time, 
as memory does not seem to be an issue.
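
To make the arithmetic concrete, here is a minimal sketch -- hypothetical 
growth policies, not SQLite's actual allocator -- that counts re-allocations 
and the total bytes copied while a buffer grows to a final size:

    #include <stdio.h>
    #include <stddef.h>

    /* Count re-allocations and bytes copied under a given growth policy. */
    static void simulate(const char *name, size_t final,
                         size_t (*grow)(size_t))
    {
        size_t cap = 16, reallocs = 0;   /* arbitrary starting capacity */
        unsigned long long copied = 0;
        while (cap < final) {
            copied += cap;       /* a realloc may copy the whole old block */
            cap = grow(cap);
            reallocs++;
        }
        printf("%-9s %6zu re-allocations, %14llu bytes copied\n",
               name, reallocs, copied);
    }

    static size_t add_1k(size_t cap) { return cap + 1024; } /* fixed step  */
    static size_t dbl(size_t cap)    { return cap * 2; }    /* doubling    */
    static size_t quad(size_t cap)   { return cap * 4; }    /* quadrupling */

    int main(void)
    {
        simulate("additive",  1 << 24, add_1k);
        simulate("double",    1 << 24, dbl);
        simulate("quadruple", 1 << 24, quad);
        return 0;
    }

With a fixed increment the bytes copied grow quadratically with the final 
size; with doubling or quadrupling they stay within a small constant factor 
of it. That difference is exactly the kind of thing that would show up as 
polynomial preparation time, so the re-allocation count is the first number 
worth measuring.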

Another point -- I did not see you comment on the possibility of removing the 
FK specs from the 10,000 tables. Is it really the case that deletions from the 
main table would "orphan" rows in an unknown number of other tables, and that 
the existence of those orphan rows would in fact cause application failures? 
(If the rows were not deleted from the other tables, using a normal inner join 
to the main table -- rather than a left join -- in your queries would make 
those rows disappear.)

Alternatively, you could create a table to hold the ids of the rows deleted 
from the main table, and build a separate process -- to be run as often as 
desired -- that does

delete from onechildtable where id in (select id from deletedids)

(That SQL could also be of the form
   delete from onechildtable where id in (5,6,9,12,999)
should you determine that there aren't very many deleted ids.)

You would do that for each of the other tables -- this is exactly the work 
that SQLite is preparing to do when you prepare a "delete from maintable" 
statement. When you are done deleting from all the tables, you would remove 
all the rows from the deletedids table. (You'd have to do something to block 
deletions from the main table -- and thus insertions into the deletedids 
table -- while the process runs, and of course you would start the process 
with "do nothing if deletedids has no rows".)
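
As a hedged sketch of that process using the sqlite3 C API -- the child-table 
list, the deletedids table, and the id column names here are illustrative 
assumptions, not your actual schema:

    #include <sqlite3.h>

    /* Purge rows orphaned by main-table deletes from every child table,
       then clear the deletedids log.  Returns an SQLite result code. */
    static int purge_orphans(sqlite3 *db,
                             const char *const *children, int nchildren)
    {
        sqlite3_stmt *chk;
        char sql[256];
        int i, rc;

        /* "Do nothing if deletedids has no rows." */
        rc = sqlite3_prepare_v2(db,
                "SELECT EXISTS(SELECT 1 FROM deletedids)", -1, &chk, 0);
        if (rc != SQLITE_OK) return rc;
        rc = sqlite3_step(chk);
        int have_rows = (rc == SQLITE_ROW) && sqlite3_column_int(chk, 0);
        sqlite3_finalize(chk);
        if (!have_rows) return SQLITE_OK;

        /* BEGIN IMMEDIATE takes the write lock up front, which also keeps
           other connections from deleting from the main table (and hence
           inserting into deletedids) while the purge runs. */
        rc = sqlite3_exec(db, "BEGIN IMMEDIATE", 0, 0, 0);
        if (rc != SQLITE_OK) return rc;
        for (i = 0; i < nchildren; i++) {
            sqlite3_snprintf(sizeof(sql), sql,
                "DELETE FROM \"%w\" WHERE id IN (SELECT id FROM deletedids)",
                children[i]);
            rc = sqlite3_exec(db, sql, 0, 0, 0);
            if (rc != SQLITE_OK) {
                sqlite3_exec(db, "ROLLBACK", 0, 0, 0);
                return rc;
            }
        }
        sqlite3_exec(db, "DELETE FROM deletedids", 0, 0, 0);
        return sqlite3_exec(db, "COMMIT", 0, 0, 0);
    }

Since everything happens inside one transaction, readers never see a 
half-finished purge, and the per-table DELETEs are the same statements the FK 
actions would otherwise have run one main-table delete at a time.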

This would just have the effect of batching together multiple main-table 
deletes before going through every child table to delete the orphans.

I would not suggest the "delete in batches" idea except that it's clear that 
you have a major collection of infrastructure set up to handle this unusual 
task. I don't think that what I'm suggesting would add a huge new component to 
that infrastructure.

