Hmmm...is this a disk head seeking problem? You've got several TEXT columns which are either NULL or default to ''. I ran a small experiment on 3.7.13:

CREATE TABLE container_stat (
    account TEXT,
    container TEXT,
    created_at TEXT,
    put_timestamp TEXT DEFAULT '0',
    delete_timestamp TEXT DEFAULT '0',
    object_count INTEGER,
    bytes_used INTEGER,
    reported_put_timestamp TEXT DEFAULT '0',
    reported_delete_timestamp TEXT DEFAULT '0',
    reported_object_count INTEGER DEFAULT 0,
    reported_bytes_used INTEGER DEFAULT 0,
    hash TEXT DEFAULT '00000000000000000000000000000000',
    id TEXT,
    status TEXT DEFAULT '',
    status_changed_at TEXT DEFAULT '0',
    metadata TEXT DEFAULT '',
    x_container_sync_point1 INTEGER DEFAULT -1,
    x_container_sync_point2 INTEGER DEFAULT -1
);
insert into container_stat(id,status,status_changed_at) values('id1','status1','');
insert into container_stat(id,status,status_changed_at) values('id2','status2','');
insert into container_stat(id,status,status_changed_at) values('id3','status3','');
insert into container_stat(id,status,status_changed_at) values('id4','status4','');
insert into container_stat(id,status,status_changed_at) values('id5','status5','');
delete from container_stat where account='id1';
insert into container_stat(account,status,status_changed_at) values('id1','status1 change1 to something else','status_changed_at_1');
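For reference, the experiment above can be reproduced with Python's sqlite3 module and a raw scan of the database file (a rough stand-in for "strings"). This is a sketch: the schema is trimmed to the relevant columns, and the temp-file path is illustrative.

```python
# Reproduce the delete/insert-with-growth experiment and look at where
# the row payloads end up in the raw database file.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "exp.db")
con = sqlite3.connect(path)
con.execute("""CREATE TABLE container_stat (
    id TEXT,
    status TEXT,
    status_changed_at TEXT DEFAULT '0',
    hash TEXT DEFAULT '00000000000000000000000000000000')""")
for i in range(1, 6):
    con.execute("INSERT INTO container_stat(id, status, status_changed_at) "
                "VALUES (?, ?, '')", (f"id{i}", f"status{i}"))
con.commit()

# Replace id1 with a row whose fields have grown in size.
con.execute("DELETE FROM container_stat WHERE id = 'id1'")
con.execute("INSERT INTO container_stat(id, status, status_changed_at) "
            "VALUES ('id1', 'status1 change1 to something else', "
            "'status_changed_at_1')")
con.commit()
con.close()

with open(path, "rb") as f:
    raw = f.read()

# Byte offsets of the grown payload vs. an untouched row's payload.
grown = raw.find(b"status1 change1 to something else")
untouched = raw.find(b"status5")
print(grown, untouched)
```

Comparing the two offsets shows whether the re-inserted, larger record landed in the same region of the file as the rows that were never touched.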
If you look at the order of the data BEFORE the delete/insert occurs you get this (using "strings"):

000000000000000000000000000000000000id5status5
000000000000000000000000000000000000id4status4
000000000000000000000000000000000000id3status3
000000000000000000000000000000000000id2status2
000000000000000000000000000000000000id1status1

After a delete/insert where one of the fields grows in size you get this:

id1000000000000000000000000000000000000status1 change1 to something elsestatus_changed_at_1
000000000000000000000000000000000000id5status5
000000000000000000000000000000000000id4status4
000000000000000000000000000000000000id3status3
000000000000000000000000000000000000id2status2
000000000000000000000000000000000000id1status1

So...the probability that an update has to relocate a record because its data grew is higher the closer the record is to the beginning of the database, and each such update reduces the likelihood of that record being relocated again as its fields grow. If you use default values sized at around two standard deviations of your string lengths (or maybe just 2X the average length or so), you would reduce the head seek time when updating records. It basically sounds like your disk head may be slewing from the beginning of the file to the end on many of your updates.

Also...would "INSERT OR REPLACE" help you a bit? http://www.sqlite.org/lang_conflict.html

Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems

________________________________________
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of O'Toole, Eamonn [eamonn.oto...@hp.com]
Sent: Friday, November 09, 2012 5:07 AM
To: sqlite-users@sqlite.org
Subject: EXT :[sqlite] Anomalously slow performance on updates to early entries in a DB

Hello all,

First of all, I'm a complete novice with respect to SQLite, so apologies if there is an obvious answer to my question.
I've also posted this question in IRC, and it was suggested that I post it to this mailing list.

We're running a test-bed of an object store (Openstack Swift) which uses SQLite to record information on the objects that are stored in a container. The table itself (called "object") is very simple: it contains the modification time of the object, the size of the object, the md5sum of the object, and the content-type of the object.

We are seeing a performance anomaly on updates to existing object records in the SQLite DB. If the container DB is sufficiently large (about 10 million objects, 3.3GB), then the time to update records at the beginning of the database, by order of entry, is anomalously high. The time is particularly bad for the first approx. 100K records, is somewhat better for the next 900K records, and settles down to a consistent average from approx. 1 million records on. Where this consistent average is around 7 seconds per 10,000 updates, we see times of about 170 seconds per 10,000 updates on the first 100K records. We don't see this anomalously high update time if we start the updates after the first 1 million records. Note that table updates are performed by first DELETEing the entry and then INSERTing the changed entry.

Does anybody have any idea why we're seeing this behaviour, and what we can do to fix it? Note that Chunky_Ks (IRC) has suggested that B-tree rebalancing could be the fundamental issue.

Thanks.
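For concreteness, the DELETE-then-INSERT update pattern described above looks roughly like the sketch below. This is an illustration against a trimmed-down object table, not Swift's actual code; wrapping both statements in one transaction is assumed so that each update lands in a single commit.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE object (
    ROWID INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    created_at TEXT,
    size INTEGER,
    etag TEXT,
    deleted INTEGER DEFAULT 0)""")
con.execute("INSERT INTO object(name, created_at, size, etag) "
            "VALUES ('obj1', '100.0', 10, 'aaa')")
con.commit()

# The update path: DELETE the old record, then INSERT the changed one,
# inside a single transaction (the connection context manager commits).
with con:
    con.execute("DELETE FROM object WHERE name = 'obj1'")
    con.execute("INSERT INTO object(name, created_at, size, etag) "
                "VALUES ('obj1', '200.0', 20, 'bbb')")

row = con.execute(
    "SELECT created_at, size FROM object WHERE name = 'obj1'").fetchone()
print(row)  # ('200.0', 20)
```

If the replacement row is larger than the one it replaces, it may not fit back into the freed slot, which is exactly the relocation effect discussed in the reply above.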
SQL Schema follows:

Database: main
Tables: container_stat, incoming_sync, object, outgoing_sync

container_stat:
CREATE TABLE container_stat (
    account TEXT,
    container TEXT,
    created_at TEXT,
    put_timestamp TEXT DEFAULT '0',
    delete_timestamp TEXT DEFAULT '0',
    object_count INTEGER,
    bytes_used INTEGER,
    reported_put_timestamp TEXT DEFAULT '0',
    reported_delete_timestamp TEXT DEFAULT '0',
    reported_object_count INTEGER DEFAULT 0,
    reported_bytes_used INTEGER DEFAULT 0,
    hash TEXT DEFAULT '00000000000000000000000000000000',
    id TEXT,
    status TEXT DEFAULT '',
    status_changed_at TEXT DEFAULT '0',
    metadata TEXT DEFAULT '',
    x_container_sync_point1 INTEGER DEFAULT -1,
    x_container_sync_point2 INTEGER DEFAULT -1
);

incoming_sync:
CREATE TABLE incoming_sync (
    remote_id TEXT UNIQUE,
    sync_point INTEGER,
    updated_at TEXT DEFAULT 0
);
CREATE TRIGGER incoming_sync_insert AFTER INSERT ON incoming_sync
BEGIN
    UPDATE incoming_sync
    SET updated_at = STRFTIME('%s', 'NOW')
    WHERE ROWID = new.ROWID;
END;
CREATE TRIGGER incoming_sync_update AFTER UPDATE ON incoming_sync
BEGIN
    UPDATE incoming_sync
    SET updated_at = STRFTIME('%s', 'NOW')
    WHERE ROWID = new.ROWID;
END;

object:
CREATE TABLE object (
    ROWID INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    created_at TEXT,
    size INTEGER,
    content_type TEXT,
    etag TEXT,
    deleted INTEGER DEFAULT 0
);
CREATE INDEX ix_object_deleted_name ON object (deleted, name);
CREATE TRIGGER object_delete AFTER DELETE ON object
BEGIN
    UPDATE container_stat
    SET object_count = object_count - (1 - old.deleted),
        bytes_used = bytes_used - old.size,
        hash = chexor(hash, old.name, old.created_at);
END;
CREATE TRIGGER object_insert AFTER INSERT ON object
BEGIN
    UPDATE container_stat
    SET object_count = object_count + (1 - new.deleted),
        bytes_used = bytes_used + new.size,
        hash = chexor(hash, new.name, new.created_at);
END;
CREATE TRIGGER object_update BEFORE UPDATE ON object
BEGIN
    SELECT RAISE(FAIL, 'UPDATE not allowed; DELETE and INSERT');
END;

outgoing_sync:
CREATE TABLE outgoing_sync (
    remote_id TEXT UNIQUE,
    sync_point INTEGER,
    updated_at TEXT DEFAULT 0
);
CREATE TRIGGER outgoing_sync_insert AFTER INSERT ON outgoing_sync
BEGIN
    UPDATE outgoing_sync
    SET updated_at = STRFTIME('%s', 'NOW')
    WHERE ROWID = new.ROWID;
END;
CREATE TRIGGER outgoing_sync_update AFTER UPDATE ON outgoing_sync
BEGIN
    UPDATE outgoing_sync
    SET updated_at = STRFTIME('%s', 'NOW')
    WHERE ROWID = new.ROWID;
END;

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
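As a side note on the schema: the chexor() referenced by the object triggers is a user-defined function that Swift registers on the connection, not a SQLite built-in. The sketch below exercises the object triggers with a stand-in chexor (the XOR-of-md5 implementation here is an assumption, not Swift's verified code) and shows the object_update trigger rejecting UPDATEs, which is why updates go through DELETE + INSERT.

```python
import hashlib
import sqlite3

def chexor(old, name, timestamp):
    # Stand-in for Swift's chexor(): XOR the running hash with the md5
    # of the record's name and timestamp. Illustrative only.
    new = hashlib.md5(f"{name}-{timestamp}".encode()).hexdigest()
    return "%032x" % (int(old, 16) ^ int(new, 16))

con = sqlite3.connect(":memory:")
con.create_function("chexor", 3, chexor)
con.executescript("""
CREATE TABLE container_stat (
    object_count INTEGER, bytes_used INTEGER,
    hash TEXT DEFAULT '00000000000000000000000000000000');
CREATE TABLE object (
    ROWID INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT, created_at TEXT, size INTEGER, deleted INTEGER DEFAULT 0);
CREATE TRIGGER object_insert AFTER INSERT ON object BEGIN
    UPDATE container_stat
    SET object_count = object_count + (1 - new.deleted),
        bytes_used = bytes_used + new.size,
        hash = chexor(hash, new.name, new.created_at);
END;
CREATE TRIGGER object_update BEFORE UPDATE ON object BEGIN
    SELECT RAISE(FAIL, 'UPDATE not allowed; DELETE and INSERT');
END;
INSERT INTO container_stat (object_count, bytes_used) VALUES (0, 0);
""")

# The insert trigger maintains the container's running stats.
con.execute("INSERT INTO object(name, created_at, size) VALUES ('o1', '1.0', 5)")
stats = con.execute(
    "SELECT object_count, bytes_used FROM container_stat").fetchone()
print(stats)  # (1, 5)

# Direct UPDATEs are refused by the object_update trigger.
try:
    con.execute("UPDATE object SET size = 6 WHERE name = 'o1'")
    updated = True
except sqlite3.DatabaseError:
    updated = False
print(updated)  # False
```

Because every insert and delete on object fires a trigger that rewrites the single container_stat row, each object update also touches the stats page, which adds to the write traffic described in the thread.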