Re: [sqlite] Insert speed greatly decreasing over time

2009-10-08 Thread Alexey Pechnikov
Hello!

Why not compress the big text strings? The index
size can also be reduced by using an md5 hash of the text key field.

See the extensions
http://mobigroup.ru/files/sqlite-ext/
http://mobigroup.ru/files/sqlite-ext/md5/
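For the md5 idea, here is a minimal sketch using Python's stdlib sqlite3 and hashlib instead of the extension (table and column names are made up for illustration): storing the 16-byte digest of the long text key and indexing that keeps every index entry fixed-width, regardless of key length.

```python
import hashlib
import sqlite3

db = sqlite3.connect(":memory:")
# Index the fixed-width md5 digest of the text key instead of the
# text itself: 16 bytes per index entry regardless of key length.
db.execute("CREATE TABLE t (k_md5 BLOB PRIMARY KEY, k TEXT NOT NULL, d TEXT)")

def key_hash(k: str) -> bytes:
    return hashlib.md5(k.encode("utf-8")).digest()

long_key = "some/very/long/text/key/" * 10
db.execute("INSERT OR REPLACE INTO t (k_md5, k, d) VALUES (?, ?, ?)",
           (key_hash(long_key), long_key, "payload"))

# Lookups go through the digest, never the long text key.
row = db.execute("SELECT d FROM t WHERE k_md5 = ?",
                 (key_hash(long_key),)).fetchone()
print(row[0])  # payload
```

Accidental md5 collisions are vanishingly unlikely; the original key is still kept in the row, so it can be double-checked on lookup if that matters.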

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Insert speed greatly decreasing over time

2009-10-08 Thread McClellen, Chris
I think I may now understand the problem we're seeing.  I left out a
very important piece of information:  On the dbs we're seeing the
increase in insert time, the average row size is large - really large.
We can see averages as high as 45k, which was causing overflow chains
as long as 14 pages in some dbs.  I think it is pretty
obvious why performance starts out good but then goes south.  When the
overflow chains are contiguous in general, all is good.  But over time,
with chain lengths that long, repeated updates of rows causes them to
get spread all over the db.  So, in other words, "no duh!"

For now, moving the page size to 32k for databases like this seems to be
doing the trick.  We're watching to see what happens, but performance is
much better and for now seems to be holding steady.
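For reference, the page size only takes effect if it is set before the database file is initialized (or if it is followed by a VACUUM on an existing file); a small sketch using Python's stdlib sqlite3 module:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big_rows.db")
db = sqlite3.connect(path)
# page_size must be set before the first table is written
# (or be followed by VACUUM on an already-populated database).
db.execute("PRAGMA page_size=32768")
db.execute("CREATE TABLE t (k TEXT PRIMARY KEY, d TEXT)")
db.commit()
size = db.execute("PRAGMA page_size").fetchone()[0]
print(size)  # 32768
```

A 32k page lets a ~45k row span far fewer overflow pages than the default 4k, which is presumably why the chains stop mattering as much.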





Re: [sqlite] Insert speed greatly decreasing over time

2009-10-07 Thread Alexey Pechnikov
Hello!

Because
update == delete + insert,
you will get fragmentation. The degree of fragmentation
is proportional to the number of replaces, so you may have problems
after inserting a lot of duplicates.

Do you really need to insert or update?

1. I'm using databases of a few hundred MB like
CREATE TABLE telephony_log (
...
  unique (nas_name,port,duration,origin,date_start) on conflict ignore
);
Performance with the "ignore" conflict resolution is much better.

2. A unique index may be better than a text primary key.

3. "insert into t select * from temp_t" may be helpful.

4. You can mark records as deleted without deleting them immediately,
and delete the old records periodically, with a vacuum after that operation.

==
PRAGMA auto_vacuum=0;
PRAGMA default_cache_size=20;
CREATE TABLE t (k varchar(50) not null, d text not null,
  e datetime, is_new int default 1);
CREATE INDEX t_k_idx ON t(k);

BEGIN;
CREATE TEMP TABLE temp_t (k varchar(50) primary key not null,
  d text not null, e datetime, is_new int default 1);
INSERT INTO temp_t ...
UPDATE t SET is_new=0 WHERE k IN (SELECT k FROM temp_t);
INSERT INTO t SELECT * FROM temp_t;
COMMIT;

By cron:
DELETE FROM t WHERE is_new=0;
VACUUM;
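For what it's worth, the workflow above could be driven like this from Python's stdlib sqlite3 module (same table and column names as the example; a rough sketch, not production code):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
PRAGMA auto_vacuum=0;
CREATE TABLE t (k varchar(50) not null, d text not null,
                e datetime, is_new int default 1);
CREATE INDEX t_k_idx ON t(k);
""")

def load_batch(db, rows):
    # Load a batch through a temp table: old versions are marked
    # is_new=0 rather than replaced, so the main table never does
    # the delete+insert that causes fragmentation.
    with db:  # one transaction per batch
        db.execute("""CREATE TEMP TABLE temp_t (
                        k varchar(50) primary key not null,
                        d text not null, e datetime, is_new int default 1)""")
        db.executemany("INSERT INTO temp_t (k, d, e) VALUES (?, ?, ?)", rows)
        db.execute("UPDATE t SET is_new=0 WHERE k IN (SELECT k FROM temp_t)")
        db.execute("INSERT INTO t SELECT * FROM temp_t")
        db.execute("DROP TABLE temp_t")

def sweep(db):
    # Run periodically (e.g. from cron): drop superseded rows, then vacuum.
    db.execute("DELETE FROM t WHERE is_new=0")
    db.commit()
    db.execute("VACUUM")

load_batch(db, [("a", "v1", "2009-10-07")])
load_batch(db, [("a", "v2", "2009-10-07")])
sweep(db)
rows = db.execute("SELECT k, d FROM t").fetchall()
print(rows)  # [('a', 'v2')]
```

The trade-off is that reads have to filter on is_new=1 (or tolerate superseded rows) between sweeps.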

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Insert speed greatly decreasing over time

2009-10-07 Thread McClellen, Chris
I have not done that yet;  I'll get it into that state then attach to
sqlite3's command line tool and see if I can learn something.  I will
let you know what I find.



Re: [sqlite] Insert speed greatly decreasing over time

2009-10-07 Thread Pavel Ivanov
> Try this:
> pragma cache_size=20;

I believe changing pragma cache_size will not help, because the size of
the database doesn't change, so the cache hit ratio doesn't change
either. And there's no disk i/o, suggesting that it's not the bottleneck.

Chris, did you try to attach to the process with a debugger to see
what it is doing (at least which function it's in) when there's
no cpu or i/o activity?

Pavel



Re: [sqlite] Insert speed greatly decreasing over time

2009-10-07 Thread Adam DeVita
Would dropping and re-creating an index help?
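For concreteness, SQLite's REINDEX command (or an explicit DROP INDEX / CREATE INDEX pair) rebuilds an index from scratch without rewriting the whole file the way VACUUM does; a quick sketch with the thread's table and index names, using Python's stdlib sqlite3 module:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (k TEXT NOT NULL, d TEXT NOT NULL)")
db.execute("CREATE INDEX t_k_idx ON t(k)")
db.executemany("INSERT INTO t VALUES (?, ?)",
               ((f"key{i}", "payload") for i in range(1000)))

# Rebuild just the index: much cheaper than VACUUM, which rewrites
# the entire database file.
db.execute("REINDEX t_k_idx")

# Equivalent, more explicit form:
db.execute("DROP INDEX t_k_idx")
db.execute("CREATE INDEX t_k_idx ON t(k)")

n = db.execute("SELECT count(*) FROM t WHERE k = 'key500'").fetchone()[0]
print(n)  # 1
```

Whether the freshly built index stays compact would depend on the same update pattern that fragmented it in the first place.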




-- 
VerifEye Technologies Inc.
905-948-0015x245
7100 Warden Ave, Unit 3
Markham ON, L3R 8B5
Canada


Re: [sqlite] Insert speed greatly decreasing over time

2009-10-07 Thread Alexey Pechnikov
Hello!

Try this:
pragma cache_size=20;

Best regards, Alexey Pechnikov.
http://pechnikov.tel/


[sqlite] Insert speed greatly decreasing over time

2009-10-07 Thread McClellen, Chris
We have an application that uses a single table database, that is quite
simple.  Here is the schema:

CREATE TABLE t (k varchar(50) primary key not null, d text not null, e
datetime)

We receive data over the course of an hour that is in a different form
and we need to put into the db.  The order of the data is random and it
is bursty.  The set of keys is stable in general.  Throughout the course
of an hour, all rows are eventually replaced.

Currently, we use 3.6.16.  We batch about 200 or so updates into a
transaction.  We use INSERT OR REPLACE to modify the data.  The db grows
to about 450mb; Page Size is 4096.  When we start with an empty db the
average insert time is around 0.3s/transaction.  When the db is full, it
rises to a high of 0.5s/transaction... for a while.  Then, magically,
after a few hours, that time jumps to 10s a transaction.  We converted
things to csvs and so forth to test through the sqlite command line
client, and get the same results.  All db access time becomes very slow
from any tool.

Using the analyzer, we see that when the db is working well, the index
fragmentation is around 5% or less... then all of a sudden, we get
95-100% fragmentation of the index.  When that happens, we get the
horrible insert times, irrespective, it seems, of table fragmentation.
The db can be full and have 10s of thousands of updates before this jump
occurs.  If I vacuum the db, of course everything works smoothly, and it
seems to take a much longer time for the fragmentation to occur.

Currently, I am testing a model where we UPDATE (then INSERT if rows
affected = 0) to see if this reduces the horrible index fragmentation we
see.  This method should cause less index stress since we don't delete
then insert, nor modify keys in general.
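That update-then-insert model might look like this in Python's stdlib sqlite3 (a sketch against the thread's schema; `rowcount` reports how many rows the UPDATE touched):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (k TEXT PRIMARY KEY NOT NULL, "
           "d TEXT NOT NULL, e TEXT)")

def upsert(db, k, d, e):
    # Try UPDATE first; since keys rarely change, this usually avoids
    # the delete+insert that INSERT OR REPLACE performs on the index.
    cur = db.execute("UPDATE t SET d = ?, e = ? WHERE k = ?", (d, e, k))
    if cur.rowcount == 0:
        db.execute("INSERT INTO t (k, d, e) VALUES (?, ?, ?)", (k, d, e))

upsert(db, "a", "first", "2009-10-07")
upsert(db, "a", "second", "2009-10-07")  # takes the UPDATE path
row = db.execute("SELECT d FROM t WHERE k = 'a'").fetchone()
print(row[0])  # second
```

Since the UPDATE path never touches the key column, the entry for k should stay where it is in the index instead of being deleted and re-inserted.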


The question I have is:  Is this normal?  How do I prevent this massive
random fragmentation?  Vacuum is not a solution - the rate at which we'd
have to vacuum seems unreasonable and we can't just delete the dbs and
start over.  The data needs to be present and speedily accessible in
general; vacuum causes problems in this respect.  

I have seen another thread on this from a few months ago where someone
saw this behavior then it seemed to go away.  During the slowness of the
db, one thing we do notice is that zero cpu is being used and almost no
i/o is going on.  Disk TPS is also very, very low.  Running the
sqlite3 command line client by itself against one of these highly
fragmented dbs shows the same behavior.  For instance, a pragma
integrity_check takes minutes to run.  No i/o or cpu at all for a minute
or two, then a big burst.  Same with inserting through sqlite3 client --
when fragmentation of the index is low, it inserts very fast.  When it's
high, it inserts slowly, and we see trivial disk i/o / tps and no cpu.
And yes, as stated earlier, vacuum fixes it.. for a short while.

We see this behavior across multiple machines with different
motherboards, cpus, etc.  How can our pk index fragment so badly so
quickly?  Will the update method work better for this than insert or
replace?

Any help/thoughts on this would be appreciated.

