Thanks Michael.  Comments below:

> -----Original Message-----
> From: [email protected] [mailto:sqlite-users-
> [email protected]] On Behalf Of Black, Michael (IS)
> Sent: 09 November 2012 13:49
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Anomalously slow performance on updates to early
> entries in a DB
>
> Hmmm...is this a disk head seeking problem?
> You've got several TEXT entries which are either NULL or default to ''.
> I did a small experiment on 3.7.13:
>
> CREATE TABLE container_stat (
>     account TEXT,
>     container TEXT,
>     created_at TEXT,
>     put_timestamp TEXT DEFAULT '0',
>     delete_timestamp TEXT DEFAULT '0',
>     object_count INTEGER,
>     bytes_used INTEGER,
>     reported_put_timestamp TEXT DEFAULT '0',
>     reported_delete_timestamp TEXT DEFAULT '0',
>     reported_object_count INTEGER DEFAULT 0,
>     reported_bytes_used INTEGER DEFAULT 0,
>     hash TEXT DEFAULT '00000000000000000000000000000000',
>     id TEXT,
>     status TEXT DEFAULT '',
>     status_changed_at TEXT DEFAULT '0',
>     metadata TEXT DEFAULT '',
>     x_container_sync_point1 INTEGER DEFAULT -1,
>     x_container_sync_point2 INTEGER DEFAULT -1);
>
> insert into container_stat(id,status,status_changed_at) values('id1','status1','');
> insert into container_stat(id,status,status_changed_at) values('id2','status2','');
> insert into container_stat(id,status,status_changed_at) values('id3','status3','');
> insert into container_stat(id,status,status_changed_at) values('id4','status4','');
> insert into container_stat(id,status,status_changed_at) values('id5','status5','');
> delete from container_stat where account='id1';
> insert into container_stat(account,status,status_changed_at)
>     values('id1','status1 change1 to something else','status_changed_at_1');
>
> If you look at the order of the data BEFORE the delete/insert occurs you
> get this (using "strings"):
>
> 000000000000000000000000000000000000id5status5
> 000000000000000000000000000000000000id4status4
> 000000000000000000000000000000000000id3status3
> 000000000000000000000000000000000000id2status2
> 000000000000000000000000000000000000id1status1
>
> After a delete/insert where one of the fields grows in size you get
> this:
>
> id1000000000000000000000000000000000000status1 change1 to something
> elsestatus_changed_at_1
> 000000000000000000000000000000000000id5status5
> 000000000000000000000000000000000000id4status4
> 000000000000000000000000000000000000id3status3
> 000000000000000000000000000000000000id2status2
> 000000000000000000000000000000000000id1status1
>
> So...the probability of an update needing to move due to larger data
> increases the closer you are to the beginning of the database.  Each
> update would reduce the likelihood of that record getting relocated
> again as the fields grow in size.

[O'Toole, Eamonn] This definitely sounds like it could be an issue.  There is
just one container_stat entry (confirmed by the sqlite3_analyzer output,
which I'll post later).  So you're saying that the single container_stat
table entry is potentially being relocated very frequently the closer the
update is to the beginning of the DB?
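[Editor's note] Michael's "strings" experiment can be reproduced without a
shell, using Python's built-in sqlite3 module and a simplified table (this
is a hedged sketch, not the Swift code): insert short rows, delete/re-insert
one with a much larger field, then search the raw database file for the
record payloads, which is effectively what "strings" was showing.

```python
# Sketch of the grow-and-relocate experiment with a cut-down container_stat.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE container_stat (id TEXT, status TEXT, "
            "hash TEXT DEFAULT '00000000000000000000000000000000')")
for i in range(1, 6):
    con.execute("INSERT INTO container_stat(id, status) VALUES (?, ?)",
                ("id%d" % i, "status%d" % i))
con.commit()

# Grow one record: delete it, then re-insert it with a much larger field.
con.execute("DELETE FROM container_stat WHERE id = 'id1'")
con.execute("INSERT INTO container_stat(id, status) VALUES (?, ?)",
            ("id1", "status1 change1 to something else"))
con.commit()
con.close()

with open(path, "rb") as f:
    raw = f.read()
# TEXT values are stored back-to-back in each record's payload, so the
# byte offset of each string shows where SQLite physically placed the row.
print("id5 offset:", raw.find(b"id5status5"))
print("grown id1 offset:", raw.find(b"status1 change1 to something else"))
```

Comparing the printed offsets before and after the delete/insert is the
file-level view of the relocation Michael describes.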
> If you use default values that are 2 sigma of the sizes of your strings
> (or maybe just 2X the average length or so) you would reduce the head
> seek time on updating records.
>
> It basically sounds like your disk head may be slewing from the
> beginning of the file to the end on many of your updates.

[O'Toole, Eamonn] What we do see is that disk utilization is at 100% during
updates to the beginning of the file, but the actual throughput in MB/s is
low - which tallies with your hypothesis.

> Also...would INSERT OR REPLACE help you a bit?
> http://www.sqlite.org/lang_conflict.html

[O'Toole, Eamonn] I'll have a look at this, thanks.

> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
>
> ________________________________________
> From: [email protected] [[email protected]]
> on behalf of O'Toole, Eamonn [[email protected]]
> Sent: Friday, November 09, 2012 5:07 AM
> To: [email protected]
> Subject: EXT :[sqlite] Anomalously slow performance on updates to early
> entries in a DB
>
> Hello all,
>
> First of all, I'm a complete novice with respect to SQLite, so apologies
> if there is an obvious answer to my question.  I've also posted this
> question in IRC, and it was suggested that I post it to this mailing
> list.  We're running a test-bed of an object store (OpenStack Swift)
> which uses SQLite to record information on the objects that are stored
> in a container.  The table itself (called "object") is very simple: it
> contains the modification time, size, md5sum, and content-type of each
> object.  We are seeing a performance anomaly on updates to existing
> object records in the SQLite DB.  If the container DB is sufficiently
> large (about 10 million objects, 3.3GB), then the time to update records
> at the beginning of the database, by order of entry, is anomalously
> high.
> The time is particularly bad for the first approx. 100K records, is
> somewhat better for the next 900K records, and settles down to a
> consistent average from approx. 1 million records on.  Where this
> consistent average is around 7 seconds per 10,000 updates, we see times
> of about 170 seconds per 10,000 updates on the first 100K records.  We
> don't see this anomalously high update time if we start the updates
> after the first 1 million records.
>
> Note that table updates are performed by first DELETEing the entry and
> then INSERTing the changed entry.
>
> Does anybody have any idea why we're seeing this behaviour, and what we
> can do to fix it?
>
> Note that Chunky_Ks (IRC) has suggested that B-tree rebalancing could
> be the fundamental issue.
>
> Thanks.
>
> SQL schema follows:
>
> 1 database: main
>
> Tables:
>     container_stat
>     incoming_sync
>     object
>     outgoing_sync
>
> container_stat:
> CREATE TABLE container_stat (
>     account TEXT,
>     container TEXT,
>     created_at TEXT,
>     put_timestamp TEXT DEFAULT '0',
>     delete_timestamp TEXT DEFAULT '0',
>     object_count INTEGER,
>     bytes_used INTEGER,
>     reported_put_timestamp TEXT DEFAULT '0',
>     reported_delete_timestamp TEXT DEFAULT '0',
>     reported_object_count INTEGER DEFAULT 0,
>     reported_bytes_used INTEGER DEFAULT 0,
>     hash TEXT DEFAULT '00000000000000000000000000000000',
>     id TEXT,
>     status TEXT DEFAULT '',
>     status_changed_at TEXT DEFAULT '0',
>     metadata TEXT DEFAULT '',
>     x_container_sync_point1 INTEGER DEFAULT -1,
>     x_container_sync_point2 INTEGER DEFAULT -1
> );
>
> incoming_sync:
> CREATE TABLE incoming_sync (
>     remote_id TEXT UNIQUE,
>     sync_point INTEGER,
>     updated_at TEXT DEFAULT 0
> );
> CREATE TRIGGER incoming_sync_insert AFTER INSERT ON incoming_sync
> BEGIN
>     UPDATE incoming_sync
>     SET updated_at = STRFTIME('%s', 'NOW')
>     WHERE ROWID = new.ROWID;
> END;
> CREATE TRIGGER incoming_sync_update AFTER UPDATE ON incoming_sync
> BEGIN
>     UPDATE incoming_sync
>     SET updated_at = STRFTIME('%s', 'NOW')
>     WHERE
>     ROWID = new.ROWID;
> END;
>
> object:
> CREATE TABLE object (
>     ROWID INTEGER PRIMARY KEY AUTOINCREMENT,
>     name TEXT,
>     created_at TEXT,
>     size INTEGER,
>     content_type TEXT,
>     etag TEXT,
>     deleted INTEGER DEFAULT 0
> );
> CREATE INDEX ix_object_deleted_name ON object (deleted, name);
> CREATE TRIGGER object_delete AFTER DELETE ON object
> BEGIN
>     UPDATE container_stat
>     SET object_count = object_count - (1 - old.deleted),
>         bytes_used = bytes_used - old.size,
>         hash = chexor(hash, old.name, old.created_at);
> END;
> CREATE TRIGGER object_insert AFTER INSERT ON object
> BEGIN
>     UPDATE container_stat
>     SET object_count = object_count + (1 - new.deleted),
>         bytes_used = bytes_used + new.size,
>         hash = chexor(hash, new.name, new.created_at);
> END;
> CREATE TRIGGER object_update BEFORE UPDATE ON object
> BEGIN
>     SELECT RAISE(FAIL, 'UPDATE not allowed; DELETE and INSERT');
> END;
>
> outgoing_sync:
> CREATE TABLE outgoing_sync (
>     remote_id TEXT UNIQUE,
>     sync_point INTEGER,
>     updated_at TEXT DEFAULT 0
> );
> CREATE TRIGGER outgoing_sync_insert AFTER INSERT ON outgoing_sync
> BEGIN
>     UPDATE outgoing_sync
>     SET updated_at = STRFTIME('%s', 'NOW')
>     WHERE ROWID = new.ROWID;
> END;
> CREATE TRIGGER outgoing_sync_update AFTER UPDATE ON outgoing_sync
> BEGIN
>     UPDATE outgoing_sync
>     SET updated_at = STRFTIME('%s', 'NOW')
>     WHERE ROWID = new.ROWID;
> END;
>
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
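[Editor's note] The object_update trigger in the schema above is what forces
Swift's DELETE-then-INSERT pattern, and it is compatible with Michael's
INSERT OR REPLACE suggestion, since OR REPLACE resolves the conflict by
deleting the old row rather than updating it.  A minimal sketch using
Python's built-in sqlite3 module (the chexor()/container_stat triggers are
omitted, and a hypothetical UNIQUE constraint on name is added so that
OR REPLACE has a conflict to resolve):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Simplified object table: chexor()/container_stat triggers omitted;
# UNIQUE on name is an assumption made here so OR REPLACE can fire.
con.executescript("""
CREATE TABLE object (
    ROWID INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE,
    created_at TEXT,
    size INTEGER,
    deleted INTEGER DEFAULT 0
);
CREATE TRIGGER object_update BEFORE UPDATE ON object
BEGIN
    SELECT RAISE(FAIL, 'UPDATE not allowed; DELETE and INSERT');
END;
INSERT INTO object(name, created_at, size) VALUES ('obj1', '100', 10);
""")

# The trigger rejects a plain UPDATE, which is why the table is maintained
# by deleting and re-inserting rows.
try:
    con.execute("UPDATE object SET size = 20 WHERE name = 'obj1'")
except sqlite3.IntegrityError as e:
    print("UPDATE blocked:", e)

# INSERT OR REPLACE resolves the UNIQUE conflict by removing the old row
# and inserting the new one in a single statement.
con.execute("INSERT OR REPLACE INTO object(name, created_at, size) "
            "VALUES ('obj1', '200', 20)")
size = con.execute("SELECT size FROM object WHERE name = 'obj1'").fetchone()[0]
print("size after replace:", size)
```

Whether collapsing the two statements into one materially changes the page
relocation behaviour discussed above would still need to be measured.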

