On Mon, Oct 11, 2010 at 8:13 AM, Jay A. Kreibich <j...@kreibi.ch> wrote:
> On Mon, Oct 11, 2010 at 02:08:54PM +0200, Michele Pradella scratched on
> the wall:
>> Ok, so the main idea is always the same: split the DELETE so that each
>> operation touches fewer records, but do it more often.
>
>  Another thought occurs to me...  If your insert rates are fairly
>  consistent (e.g. the number of records per minute is consistent, or
>  at least predictable), then you can just use the table as a circular
>  buffer.  That is, rather than inserting new data and deleting the
>  old data, simply overwrite the older records with the new data.
>  Rather than inserting new records, simply find the oldest record and
>  update it with the new data.
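That overwrite could be a single UPDATE that targets the oldest row.  A rough
sketch against the SQLite C API, assuming placeholder names (my_table with a
ts timestamp and a payload column) and a table that has already been
pre-filled to its fixed size:

  #include <sqlite3.h>

  /* Circular-buffer style write: reuse the oldest row instead of doing
   * INSERT + DELETE.  Table and column names here are placeholders. */
  int overwrite_oldest(sqlite3 *db, sqlite3_int64 new_ts, const char *payload)
  {
      const char *sql =
          "UPDATE my_table SET ts = ?1, payload = ?2 "
          "WHERE rowid = (SELECT rowid FROM my_table ORDER BY ts LIMIT 1);";
      sqlite3_stmt *stmt;
      int rc = sqlite3_prepare_v2(db, sql, -1, &stmt, 0);
      if (rc != SQLITE_OK) return rc;
      sqlite3_bind_int64(stmt, 1, new_ts);
      sqlite3_bind_text(stmt, 2, payload, -1, SQLITE_TRANSIENT);
      rc = sqlite3_step(stmt);
      sqlite3_finalize(stmt);
      return rc == SQLITE_DONE ? SQLITE_OK : rc;
  }

With an index on ts, finding the oldest row stays cheap.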
In a past life, I ran a large production system on MySQL, and what we did in
a similar situation was to add the appropriate delete after each insert.
Something like:

  INSERT INTO my_table VALUES (blah, blah, blah);
  DELETE FROM my_table WHERE date_or_whatever < some_limit LIMIT 5;

If your insert load is consistent from day to day, this mostly looks like a
circular buffer.  The main downside is that it makes your heaviest insert
period also your heaviest delete period, which is unfortunate.  The LIMIT 5
is there so that if your volume varies, you eventually catch up; if your
volume varies a lot, you would probably need to increase the value.

It might seem like this will make inserts twice as slow, but I'd implement
it this way before making things more complicated.  Since the index the
delete runs against is probably already in the page cache, you may find that
it works just fine.  If I had to modify it, I might do the delete only on
every 100th insert and change the limit to 100 + epsilon.  After the delete,
use sqlite3_changes() to check whether it deleted the full limit; if it did,
increase epsilon a bit for the next pass, and once the limit is no longer
reached, drop epsilon back towards 0.  But, like I said, you might find that
the simple solution does the job just fine.

Modifying your selects to ignore too-old records shouldn't slow things down
materially.  If you had run the query 10 minutes ago, you'd have gotten
those 25 records; run it now and you get those 25 records plus one more, and
throw away three of them.  But if the performance would have been adequate
10 minutes ago, why is it not adequate now?  So long as you keep the number
of filtered-out records contained, it's just a small constant factor, and if
you keep them very tightly contained the extra cost is almost nil unless the
query could otherwise have been answered from indices alone (once SQLite has
fetched the row from the table, an additional check on data within that row
is generally not expensive).

-scott
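P.S.  For concreteness, the every-100th-insert variant above might look
roughly like the sketch below.  The table and column names (my_table, ts)
and the cutoff are placeholders, the +/-10 steps for epsilon are arbitrary,
and since DELETE ... LIMIT is only available when SQLite is built with
SQLITE_ENABLE_UPDATE_DELETE_LIMIT, the delete is expressed as a rowid
subquery instead:

  #include <sqlite3.h>
  #include <stdio.h>

  static int insert_count = 0;   /* inserts seen so far */
  static int epsilon = 0;        /* extra rows to delete while catching up */

  /* Call after every INSERT; does the trim on each 100th call. */
  void trim_every_100(sqlite3 *db, sqlite3_int64 cutoff)
  {
      char sql[256];
      int limit;

      if (++insert_count % 100 != 0) return;

      limit = 100 + epsilon;
      snprintf(sql, sizeof(sql),
               "DELETE FROM my_table WHERE rowid IN "
               "(SELECT rowid FROM my_table WHERE ts < %lld "
               "ORDER BY ts LIMIT %d);",
               (long long)cutoff, limit);
      if (sqlite3_exec(db, sql, 0, 0, 0) != SQLITE_OK) return;

      /* Deleting the full limit suggests a backlog, so trim harder next
       * time; otherwise ease epsilon back towards zero. */
      if (sqlite3_changes(db) == limit)
          epsilon += 10;
      else if (epsilon >= 10)
          epsilon -= 10;
      else
          epsilon = 0;
  }

The only SQLite-specific piece is the sqlite3_changes() check; the batching
itself can live in whatever wrapper already performs the inserts.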