Re: [sqlite] Insert speed greatly decreasing over time
Hello!

But why not compress the big text strings? And the index size can be reduced by using an md5 hash of the text key field. See the extensions:

http://mobigroup.ru/files/sqlite-ext/
http://mobigroup.ru/files/sqlite-ext/md5/

Best regards, Alexey Pechnikov.
http://pechnikov.tel/

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
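[Editor's sketch] The hashed-key idea above can be tried without the linked extension, using only Python's stdlib hashlib and sqlite3; the table and column names here are illustrative, not from the original schema. The index is built over the fixed 16-byte digest instead of the arbitrarily long text key:

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
# Index the fixed-size 16-byte md5 digest instead of the full text key.
conn.execute("CREATE TABLE t (k_md5 BLOB NOT NULL, k TEXT NOT NULL, d TEXT NOT NULL)")
conn.execute("CREATE INDEX t_k_md5_idx ON t(k_md5)")

def key_digest(key: str) -> bytes:
    """16-byte md5 digest of the text key."""
    return hashlib.md5(key.encode("utf-8")).digest()

def upsert(key: str, data: str) -> None:
    digest = key_digest(key)
    # Compare k as well as k_md5 to guard against hash collisions.
    cur = conn.execute("UPDATE t SET d = ? WHERE k_md5 = ? AND k = ?",
                       (data, digest, key))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO t (k_md5, k, d) VALUES (?, ?, ?)",
                     (digest, key, data))

key = "some/very/long/text/key/" * 8
upsert(key, "payload-1")
upsert(key, "payload-2")  # updates in place instead of adding a duplicate
row = conn.execute("SELECT d FROM t WHERE k_md5 = ? AND k = ?",
                   (key_digest(key), key)).fetchone()
print(row[0])  # payload-2
```
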
Re: [sqlite] Insert speed greatly decreasing over time
I think I may now understand the problem we're seeing. I left out a very important piece of information: on the dbs where we're seeing the increase in insert time, the average row size is large - really large. We can see averages as high as 45K. This was causing overflow chains as long as 14 pages in some dbs.

I think it is pretty obvious why performance starts out good but then goes south. When the overflow chains are contiguous in general, all is good. But over time, with chain lengths that long, repeated updates of rows cause them to get spread all over the db. So, in other words, "no duh!"

For now, moving the page size to 32K for databases like this seems to be doing the trick. We're watching to see what happens, but performance is much better and for now seems to be holding steady.

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Alexey Pechnikov
Sent: Wednesday, October 07, 2009 3:33 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Insert speed greatly decreasing over time

[snip - quoted message appears in full below in the thread]
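[Editor's sketch] One detail worth noting about the page-size fix described above: on an existing database, PRAGMA page_size only takes effect on the next VACUUM (or when the file is first created). A minimal demonstration with the stdlib sqlite3 module; the file path and row contents are illustrative:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big_rows.db")  # illustrative path
conn = sqlite3.connect(path, isolation_level=None)  # autocommit, so VACUUM can run
conn.execute("CREATE TABLE t (k varchar(50) PRIMARY KEY NOT NULL, "
             "d text NOT NULL, e datetime)")
# A large row like those in the problem report (~45K average).
conn.execute("INSERT INTO t (k, d) VALUES ('a', ?)", ("x" * 45000,))

conn.execute("PRAGMA page_size = 32768")  # takes effect only on the next VACUUM
conn.execute("VACUUM")                    # rewrites the file at the new page size
new = conn.execute("PRAGMA page_size").fetchone()[0]
print(new)  # 32768
```

With a 32K page, a 45K row needs only one or two overflow pages instead of a long chain, which is presumably why the fix holds up.
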
Re: [sqlite] Insert speed greatly decreasing over time
Hello!

Because an update == delete + insert, you will get fragmentation. The degree of fragmentation is proportional to the number of replaces, so you may see problems after inserting a lot of duplicates. Do you really need to insert or update? Some options:

1. I'm using few-hundred-MB databases like:

CREATE TABLE telephony_log (
  ...
  unique (nas_name, port, duration, origin, date_start) on conflict ignore
);

Performance of the "ignore" conflict resolution is much better.

2. A unique index may be better than a text PK.

3. "insert into t select * from temp_t" can be helpful.

4. You can mark records as deleted without deleting them immediately, and delete the old records periodically, with a vacuum after that operation:

==
PRAGMA auto_vacuum=0;
pragma default_cache_size=20;
CREATE TABLE t (k varchar(50) not null, d text not null, e datetime, is_new int default 1);
create index t_k_idx on t(k);
begin;
CREATE TEMP TABLE temp_t (k varchar(50) primary key not null, d text not null, e datetime, is_new int default 1);
insert into temp_t ...
update t set is_new=0 where k in (select k from temp_t);
insert into t select * from temp_t;
commit;

By cron:
delete from t where is_new=0;
vacuum;

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
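[Editor's sketch] The mark-then-sweep pattern in item 4 can be exercised end to end with Python's stdlib sqlite3. The sketch follows the SQL above; the elided `insert into temp_t ...` is filled in with an illustrative parameterized insert, and the batch contents are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    PRAGMA auto_vacuum = 0;
    CREATE TABLE t (k varchar(50) NOT NULL, d text NOT NULL,
                    e datetime, is_new int DEFAULT 1);
    CREATE INDEX t_k_idx ON t(k);
""")

def merge_batch(rows):
    """Append new row versions and mark old ones, instead of REPLACEing."""
    conn.execute("""CREATE TEMP TABLE temp_t (
                        k varchar(50) PRIMARY KEY NOT NULL, d text NOT NULL,
                        e datetime, is_new int DEFAULT 1)""")
    conn.executemany("INSERT INTO temp_t (k, d, e) VALUES (?, ?, ?)", rows)
    conn.execute("UPDATE t SET is_new = 0 WHERE k IN (SELECT k FROM temp_t)")
    conn.execute("INSERT INTO t SELECT * FROM temp_t")
    conn.execute("DROP TABLE temp_t")
    conn.commit()

def sweep():
    """Run periodically (e.g. from cron): drop superseded rows, then vacuum."""
    conn.execute("DELETE FROM t WHERE is_new = 0")
    conn.commit()
    conn.execute("VACUUM")

merge_batch([("a", "v1", None), ("b", "v1", None)])
merge_batch([("a", "v2", None)])  # supersedes the old "a" row
sweep()
live = dict(conn.execute("SELECT k, d FROM t"))
print(sorted(live.items()))  # [('a', 'v2'), ('b', 'v1')]
```

The point of the pattern is that the hot path only ever appends; the deletes that cause fragmentation are batched into the off-peak sweep.
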
Re: [sqlite] Insert speed greatly decreasing over time
I have not done that yet; I'll get it into that state, then attach to sqlite3's command line tool and see if I can learn something. I will let you know what I find.

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov
Sent: Wednesday, October 07, 2009 2:23 PM
To: pechni...@sandy.ru; General Discussion of SQLite Database
Subject: Re: [sqlite] Insert speed greatly decreasing over time

[snip - quoted message appears in full below in the thread]
Re: [sqlite] Insert speed greatly decreasing over time
> Try this:
> pragma cache_size=20;

I believe changing pragma cache_size will not help, because the size of the database doesn't change, so the cache hit ratio doesn't change either. And the absence of disk i/o suggests that it's not the bottleneck.

Chris, did you try to attach to the process with a debugger to see what it is doing (at least which function it's in) when there's no cpu and no i/o?

Pavel

On Wed, Oct 7, 2009 at 2:19 PM, Alexey Pechnikov <pechni...@mobigroup.ru> wrote:
> Hello!
>
> Try this:
> pragma cache_size=20;
>
> Best regards, Alexey Pechnikov.
> http://pechnikov.tel/
Re: [sqlite] Insert speed greatly decreasing over time
Would dropping and re-creating an index help?

On Wed, Oct 7, 2009 at 2:19 PM, Alexey Pechnikov wrote:
> Hello!
>
> Try this:
> pragma cache_size=20;
>
> Best regards, Alexey Pechnikov.
> http://pechnikov.tel/

--
VerifEye Technologies Inc.
905-948-0015 x245
7100 Warden Ave, Unit 3
Markham ON, L3R 8B5
Canada
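[Editor's sketch] For what it's worth, an index can be rebuilt without a full VACUUM via REINDEX (the schema below is the one from the original post; whether this helps depends on whether the index pages or the table's overflow pages are the fragmented part):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real workload uses an on-disk db
conn.execute("CREATE TABLE t (k varchar(50) PRIMARY KEY NOT NULL, "
             "d text NOT NULL, e datetime)")
conn.executemany("INSERT INTO t (k, d) VALUES (?, ?)",
                 ((f"key-{i}", "x" * 100) for i in range(1000)))
conn.commit()

# REINDEX rebuilds every index b-tree on the table in place, including the
# implicit PRIMARY KEY index, which cannot be dropped and re-created by hand.
conn.execute("REINDEX t")

# An ordinary secondary index could alternatively be DROP INDEXed and
# re-created, which also rewrites it contiguously.
count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 1000
```
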
Re: [sqlite] Insert speed greatly decreasing over time
Hello!

Try this:
pragma cache_size=20;

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
[sqlite] Insert speed greatly decreasing over time
We have an application that uses a single-table database that is quite simple. Here is the schema:

CREATE TABLE t (k varchar(50) primary key not null, d text not null, e datetime)

We receive data over the course of an hour that is in a different form and we need to put it into the db. The order of the data is random and it is bursty. The set of keys is stable in general. Throughout the course of an hour, all rows are eventually replaced.

Currently, we use 3.6.16. We batch about 200 or so updates into a transaction. We use INSERT OR REPLACE to modify the data. The db grows to about 450MB; the page size is 4096.

When we start with an empty db, the average insert time is around 0.3s/transaction. When the db is full, it rises to a high of 0.5s/transaction... for a while. Then, magically, after a few hours, that time jumps to 10s a transaction. We converted things to csvs and so forth to test through the sqlite command line client, and get the same results. All db access becomes very slow from any tool.

Using the analyzer, we see that when the db is working well, the index fragmentation is around 5% or less... then all of a sudden we get 95-100% fragmentation of the index. When that happens, we get the horrible insert times, irrespective, it seems, of table fragmentation. The db can be full and take tens of thousands of updates before this jump occurs. If I vacuum the db, of course everything works smoothly, and it seems to take a much longer time for the fragmentation to occur.

Currently, I am testing a model where we UPDATE (then INSERT if rows affected = 0) to see if this reduces the horrible index fragmentation we see. This method should cause less index stress since we don't delete then insert, nor modify keys in general.

The question I have is: is this normal? How do I prevent this massive random fragmentation? Vacuum is not a solution - the rate at which we'd have to vacuum seems unreasonable, and we can't just delete the dbs and start over. The data needs to be present and speedily accessible in general; vacuum causes problems in this respect.

I have seen another thread on this from a few months ago where someone saw this behavior and then it seemed to go away.

During the slowness of the db, one thing we do notice is that zero cpu is being used and almost no i/o is going on. Disk TPS is also very, very low. Running the sqlite3 command line client by itself against one of these highly fragmented dbs shows the same behavior. For instance, a pragma integrity_check takes minutes to run: no i/o or cpu at all for a minute or two, then a big burst. Same with inserting through the sqlite3 client - when fragmentation of the index is low, it inserts very fast; when it's high, it inserts slowly, and we see trivial disk i/o and tps and no cpu. And yes, as stated earlier, vacuum fixes it... for a short while. We see this behavior across multiple machines with different motherboards, cpus, etc.

How can our pk index fragment so badly so quickly? Will the update method work better for this than insert or replace? Any help/thoughts on this would be appreciated.
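[Editor's sketch] The "UPDATE, then INSERT only if no rows were affected" model described above can be sketched with the stdlib sqlite3 module. Unlike INSERT OR REPLACE, it never deletes the existing row, so the key never leaves and re-enters the primary-key index. The batch size and key distribution below mimic the described workload but are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("""CREATE TABLE t (k varchar(50) PRIMARY KEY NOT NULL,
                                d text NOT NULL, e datetime)""")

def upsert(k, d, e):
    # Try an in-place UPDATE first; rowcount tells us whether the key existed.
    cur = conn.execute("UPDATE t SET d = ?, e = ? WHERE k = ?", (d, e, k))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO t (k, d, e) VALUES (?, ?, ?)", (k, d, e))

conn.execute("BEGIN")  # batch ~200 upserts per transaction, as in the post
for i in range(200):
    upsert(f"key-{i % 50}", f"value-{i}", None)  # keys repeat, as in the workload
conn.execute("COMMIT")

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 50: each key was inserted once, then updated in place
```

Note that an in-place UPDATE can still relocate overflow pages when the new payload doesn't fit the old one, which matters here given the ~45K rows mentioned later in the thread.
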