On Mon, Oct 11, 2010 at 8:13 AM, Jay A. Kreibich <j...@kreibi.ch> wrote:
> On Mon, Oct 11, 2010 at 02:08:54PM +0200, Michele Pradella scratched on
> the wall:
>> Ok, so the main idea is always the same: split the DELETE so that each
>> operation touches fewer records, but do it more often.
>
>  Another thought occurs to me...  If your insert rates are fairly
>  consistent (e.g. the number of records per minute is consistent, or
>  at least predictable), then you can just use the table as a circular
>  buffer.  That is, rather than inserting new data and deleting the
>  old data, simply overwrite the older records with the new data.
>  Rather than inserting new records, simply find the oldest record and
>  update it with the new data.
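That overwrite could be a single UPDATE that targets the oldest row.  A rough
sketch against the SQLite C API, assuming placeholder names (my_table with a
ts timestamp and a payload column) and a table that has already been
pre-filled to its fixed size:

  #include <sqlite3.h>

  /* Circular-buffer style write: reuse the oldest row instead of doing
   * INSERT + DELETE.  Table and column names here are placeholders. */
  int overwrite_oldest(sqlite3 *db, sqlite3_int64 new_ts, const char *payload)
  {
      const char *sql =
          "UPDATE my_table SET ts = ?1, payload = ?2 "
          "WHERE rowid = (SELECT rowid FROM my_table ORDER BY ts LIMIT 1);";
      sqlite3_stmt *stmt;
      int rc = sqlite3_prepare_v2(db, sql, -1, &stmt, 0);
      if (rc != SQLITE_OK) return rc;
      sqlite3_bind_int64(stmt, 1, new_ts);
      sqlite3_bind_text(stmt, 2, payload, -1, SQLITE_TRANSIENT);
      rc = sqlite3_step(stmt);
      sqlite3_finalize(stmt);
      return rc == SQLITE_DONE ? SQLITE_OK : rc;
  }

With an index on ts, finding the oldest row stays cheap.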
In a past life, I ran a large production system on MySQL, and what we did in
a similar situation was to add the appropriate delete after each insert.
Something like:

  INSERT INTO my_table VALUES (blah, blah, blah);
  DELETE FROM my_table WHERE date_or_whatever < some_limit LIMIT 5;

If your insert load is consistent from day to day, this mostly looks like a
circular buffer.  The main downside is that it makes your heaviest insert
period also your heaviest delete period, which is unfortunate.  The LIMIT 5
is there so that if your volume varies, you eventually catch up; if your
volume varies a lot, you would probably need to increase the value.

It might seem like this will make inserts twice as slow, but I'd implement
it this way before making things more complicated.  Since the index the
delete runs against is probably already in the page cache, you may find that
it works just fine.  If I had to modify it, I might do the delete only on
every 100th insert and change the limit to 100 + epsilon.  After the delete,
use sqlite3_changes() to check whether it deleted the full limit; if it did,
increase epsilon a bit for the next pass, and once the limit is no longer
reached, drop epsilon back towards 0.  But, like I said, you might find that
the simple solution does the job just fine.

Modifying your selects to ignore too-old records shouldn't slow things down
materially.  If you had run the query 10 minutes ago, you'd have gotten
those 25 records; run it now and you get those 25 records plus one more, and
throw away three of them.  But if the performance would have been adequate
10 minutes ago, why is it not adequate now?  So long as you keep the number
of filtered-out records contained, it's just a small constant factor, and if
you keep them very tightly contained the extra cost is almost nil unless the
query could otherwise have been answered from indices alone (once SQLite has
fetched the row from the table, an additional check on data within that row
is generally not expensive).

-scott
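P.S.  For concreteness, the every-100th-insert variant above might look
roughly like the sketch below.  The table and column names (my_table, ts)
and the cutoff are placeholders, the +/-10 steps for epsilon are arbitrary,
and since DELETE ... LIMIT is only available when SQLite is built with
SQLITE_ENABLE_UPDATE_DELETE_LIMIT, the delete is expressed as a rowid
subquery instead:

  #include <sqlite3.h>
  #include <stdio.h>

  static int insert_count = 0;   /* inserts seen so far */
  static int epsilon = 0;        /* extra rows to delete while catching up */

  /* Call after every INSERT; does the trim on each 100th call. */
  void trim_every_100(sqlite3 *db, sqlite3_int64 cutoff)
  {
      char sql[256];
      int limit;

      if (++insert_count % 100 != 0) return;

      limit = 100 + epsilon;
      snprintf(sql, sizeof(sql),
               "DELETE FROM my_table WHERE rowid IN "
               "(SELECT rowid FROM my_table WHERE ts < %lld "
               "ORDER BY ts LIMIT %d);",
               (long long)cutoff, limit);
      if (sqlite3_exec(db, sql, 0, 0, 0) != SQLITE_OK) return;

      /* Deleting the full limit suggests a backlog, so trim harder next
       * time; otherwise ease epsilon back towards zero. */
      if (sqlite3_changes(db) == limit)
          epsilon += 10;
      else if (epsilon >= 10)
          epsilon -= 10;
      else
          epsilon = 0;
  }

The only SQLite-specific piece is the sqlite3_changes() check; the batching
itself can live in whatever wrapper already performs the inserts.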