Fwd: [sqlite] assertion failing in pager.c :(

2005-04-11 Thread Vineeth R Pillai

Note: forwarded message attached.
Could anybody please look into this matter and provide
me with some help :-) I have posted this once but
couldn't get any reply.

Hoping for a response from sqlite techies

regards 
vineeth


--- Begin Message ---
Hi,

Can anybody help me with this problem?

I want to know why the assert statement

assert( pPg->nRef==0 || pPg->pgno==1 );

is written in the pager_playback_one_page() function
in the pager.c file (line 573). That assertion is
failing when memory is low. To be more specific: when
memory becomes low, sqliteOsWrite() fails and SQLite
tries to roll back. The rollback in turn calls the
above-mentioned function, which then fails at the assertion.
I don't see any need for that assertion. Moreover, as far
as I can see, the assertion is commented out in the latest
version, SQLite 3.2.0.

I am using version 2.8.15.

A big thanks in advance :)

regards vineeth



--- End Message ---


Re: [sqlite] 50MB Size Limit?

2005-04-11 Thread Gé Weijers
Jonathan Zdziarski wrote:

>
> D. Richard Hipp wrote:
>
>> Are you sure your users are not, in fact, filling up their disk
>> drives?
>
>
> nope, plenty of free space on the drives. The 50MB limit seems to be
> very exact as well...exactly 51,200,000 bytes. I'm stumped too.

Assuming your application is called by the mail system: your mail
delivery system may limit the size of files to 50MB using the
'setrlimit' system call. If you have source code you can check that easily.

The 'bash' shell shows all resource limits with the command 'ulimit -a';
maybe you can insert that in a script somewhere and find out what the
limit is actually set to.
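
For example, something like this (just a sketch) run in the delivery
environment would show it -- bash reports the -f limit in 1024-byte
blocks, and 51,200,000 bytes is exactly 50,000 of those, so a 50MB cap
would look like:

   $ ulimit -a | grep 'file size'
   file size               (blocks, -f) 50000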

BTW: sendmail (8.13.1) explicitly tries to set the file resource limit
to 'infinity', which may fail if the hard limit was lowered.

Gé


[sqlite] failed to work when running in non-ASCII directory?

2005-04-11 Thread chan wilson
Hi,
  I found this problem a long time ago, but cannot figure out why:
  Every time I put SQLite (no matter whether it is sqlite3.exe, sqlite3.dll, or
other wrappers like sqlitedb.dll or the litex DLL) in a directory that contains
non-ASCII characters, it fails to construct a connection. But it works well in
directories that do NOT contain any non-ASCII characters.

  Can anyone tell me why?



Re: [sqlite] sqlite performance problem

2005-04-11 Thread Maksim Yevmenkin
Robert,

> [snip]
> 
> > i said i print these rows to /dev/null too in my perl code. plus the
> > perl code does some other things such as joining these rows with other
> > hashes and summing the numbers.
> 
> That's fine.  I was merely trying to account for the 50% speed difference
> between the two differing column tests, which has been accomplished.
> 
> > > As for the temp table ... I haven't tried this, but isn't
> > "temp" a reserved
> > > word in SQLite?  More importantly, you should be doing this
> > statement inside
> >
> > yes, it is. i really want to create 'temporary table' in memory. i was
> > really hoping it would speed things up.
> 
> I misread the statement, so ignore me on that part.  However, 339,000 rows
> into a temporary in-memory table ... I tried some experiments locally here
> and none of them took more than 2 seconds to execute.  Are you sure you're
> not using up all available memory, which is causing the system to hit the
> swapfile?  What does this same query look like when you drop the "temp" from
> the query?

the system has 1G of ram. i was "monitoring" sqlite3 memory usage with
'top'. the SIZE and RES did not exceed 30M. so i do not think the
memory is the issue here.

> time sqlite3 db 'create table foo as select * from data where a <= 18234721' 
> > /dev/null
22.06u 1.39s 0:27.75 84.5%

so pretty much the same time without 'temp'.

i'm starting to suspect the disk. here is what i did. i created a separate
database with only one table. this table contains a subset of 92
rows from the original data table. it also has the same index on the "a"
column, i.e. i did

> sqlite3 db1
sqlite> attach db as s;
sqlite> create table data as select * from s.data where a <= 18234721;
sqlite> create index data_by_a on data (a);

full scan

> time sqlite3 db1 'select n1 from data' > /dev/null
17.19u 0.55s 0:19.06 93.0%

"bad" index scan, because it is guaranteed then the table only has
keys that match "where"

> time sqlite3 db1 'select n1 from data where a <= 18234721' > /dev/null
25.73u 0.59s 0:28.37 92.7%

+10 seconds! is this the overhead of an "indexed" scan? is this what it
really takes to seek back and forth between the index and the data? what
am i missing here?

thanks,
max


Re: [sqlite] sqlite performance problem

2005-04-11 Thread Maksim Yevmenkin
Robert,

> > time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
> > 26.15u 0.59s 0:27.00 99.0%
> >
> > time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
> > 26.04u 0.61s 0:26.91 99.0%
> >
> > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> > 12.22u 0.41s 0:12.67 99.6%
> >
> > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> > 12.01u 0.61s 0:12.64 99.8%
> >
> > so just by selecting two different columns (both next to each other). I
> > get another 50% time difference? sure the column types
> > are different,
> > and i can even understand that FLOAT might be 8 bytes and
> > INTEGER 4
> > bytes, but 50% time difference? it just can't be that
> > linear can it?
> 
> By doing these same two queries using select count(), you've proven my
> original theory that the time difference is due to the volume of bytes being
> transmitted to dev/null and NOT because of the actual data types of the
> columns or ANY OTHER FACTOR.  If you really want, change your pipe output to

ok, i agree, redirecting output to /dev/null might have an impact on the times.

> two file names, I'm 100% sure you'll find that the 'select n1 ...' query
> results in an output file significantly larger than the 'select e ...'
> output file.  This is where the differing performance comes from -- the time
> it is taking to parse and print your output.

here you are wrong actually. 

> time sqlite3 db 'select e from data where a <= 18234721' > x
12.01u 0.64s 0:12.80 98.8%

> time sqlite3 db 'select n1 from data where a <= 18234721' > y
26.06u 0.62s 0:26.86 99.3%

the size of x is 1070681 (output of column e) and the size of y is
1004219 (output of column n1), so the file sizes are about the same (x
is actually 66462 bytes bigger). it is quite possible that printf() is 2
times slower on floats.
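
(one quick way to check that guess outside of sqlite -- just a sketch,
pushing a similar number of values through awk's printf:)

> time awk 'BEGIN { for (i = 0; i < 339000; i++) printf "%f\n", i * 0.5 }' > /dev/null
> time awk 'BEGIN { for (i = 0; i < 339000; i++) printf "%d\n", i }' > /dev/null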

> The select count(*) does indeed use the index and hence the significant
> performance difference.

fine

> > fine, if i ask sqlite just to count the rows it wins hands-down, but i
> > really want these rows. even more, i'd like to then "natural join"
> > these rows with a couple of other tables to really do what the perl
> > code currently does.
> >
> > but it takes 22 seconds just to create a temp table with the
> > required dataset
> >
> > > time sqlite3 db 'create temp table foo as select * from
> > data where a <= 18234721' > /dev/null
> > 21.93u 0.89s 0:22.95 99.4%
> >
> > and i do not understand what i'm doing wrong here :(
> 
> Again, the only reason I suggested using count() in your timing test was to
> ensure that the command-line sqlite3 program's output was consistent for
> both tests and to eliminate dev/null printf's from factoring into the total
> time.  In your application, you'll call select * (or whatever) without the
> count to retrieve the rows -- but since you're not printf'ing them and
> instead are doing your own thing with them, you will indeed see close to
> identical times in your selects just like you did in the count() test.

i said i print these rows to /dev/null too in my perl code. plus the
perl code does some other things such as joining these rows with other
hashes and summing the numbers.
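
(roughly the kind of query i eventually want to run -- just a sketch, the
second table and its columns are made up:)

sqlite> select d.a, sum(d.n1 * r.weight)
   ...>   from data d natural join rates r
   ...>  where d.a <= 18234721
   ...>  group by d.a;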

> As for the temp table ... I haven't tried this, but isn't "temp" a reserved
> word in SQLite?  More importantly, you should be doing this statement inside

yes, it is. i really want to create 'temporary table' in memory. i was
really hoping it would speed things up.

> a transaction.  Transactions are critically important in SQLite.  I would go
> so far as to say NEVER EVER perform ANY bulk write in SQLite outside a
> transaction!  The performance difference is beyond phenomenal.

nope.

> time sqlite3 db 'begin; create temp table foo as select * from data where a 
> <= 18234721; commit' > /dev/null
21.90u 0.77s 0:22.87 99.1%

still 22 seconds to just create a table with 300,000+ records, and
that is, unfortunately, too slow :(

doing strace/truss on sqlite shows that it performs a huge number of seeks.
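
(for the record, something along these lines -- 'truss -c' on solaris
just prints a per-syscall count summary when the command exits:)

> truss -c sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null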

so the original questions remain:

- what am i doing wrong here?

- is sqlite going to be slower on a fairly large indexed table
because it has to seek back and forth between the index and the data?

thanks,
max


RE: [sqlite] sqlite performance problem

2005-04-11 Thread Robert Simpson
Let's recap ...

> time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
> 26.15u 0.59s 0:27.00 99.0%
>
> time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
> 26.04u 0.61s 0:26.91 99.0%
>
> time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> 12.22u 0.41s 0:12.67 99.6%
>
> time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> 12.01u 0.61s 0:12.64 99.8%
>
> so just by selecting two different columns (both next to each other). I
> get another 50% time difference? sure the column types 
> are different,
> and i can even understand that FLOAT might be 8 bytes and 
> INTEGER 4
> bytes, but 50% time difference? it just can't be that
> linear can it?

By doing these same two queries using select count(), you've proven my
original theory that the time difference is due to the volume of bytes being
transmitted to dev/null and NOT because of the actual data types of the
columns or ANY OTHER FACTOR.  If you really want, change your pipe output to
two file names, I'm 100% sure you'll find that the 'select n1 ...' query
results in an output file significantly larger than the 'select e ...'
output file.  This is where the differing performance comes from -- the time
it is taking to parse and print your output.

The select count(*) does indeed use the index and hence the significant
performance difference.  

> fine, if i ask sqlite just to count the rows it wins hands-down, but i
> really want these rows. even more, i'd like to then "natural join"
> these rows with a couple of other tables to really do what the perl
> code currently does.
> 
> but it takes 22 seconds just to create a temp table with the
> required dataset
> 
> > time sqlite3 db 'create temp table foo as select * from 
> data where a <= 18234721' > /dev/null
> 21.93u 0.89s 0:22.95 99.4%
> 
> and i do not understand what i'm doing wrong here :(

Again, the only reason I suggested using count() in your timing test was to
ensure that the command-line sqlite3 program's output was consistent for
both tests and to eliminate dev/null printf's from factoring into the total
time.  In your application, you'll call select * (or whatever) without the
count to retrieve the rows -- but since you're not printf'ing them and
instead are doing your own thing with them, you will indeed see close to
identical times in your selects just like you did in the count() test.

As for the temp table ... I haven't tried this, but isn't "temp" a reserved
word in SQLite?  More importantly, you should be doing this statement inside
a transaction.  Transactions are critically important in SQLite.  I would go
so far as to say NEVER EVER perform ANY bulk write in SQLite outside a
transaction!  The performance difference is beyond phenomenal.
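
For example, a bulk load should look something like this (just a sketch --
the values are made up, but it's your "data" schema):

sqlite3 db <<'SQL'
BEGIN;
INSERT INTO data VALUES (18234001, 2, 'x', 3, 4, 0.25, 0.50);
INSERT INTO data VALUES (18234002, 3, 'y', 5, 6, 0.75, 1.00);
-- ... and so on for the rest of the rows ...
COMMIT;
SQL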

Robert




Re: [sqlite] sqlite performance problem

2005-04-11 Thread Maksim Yevmenkin
Robert,

> > i guess i can believe this. however it's pretty disappointing to get
> > only a 50% improvement on a 30-times-smaller dataset :(
> >
> > but how do you explain this?
> >
> > sqlite> .schema data
> > CREATE TABLE data
> > (
> >a INTEGER,
> >b INTEGER,
> >c CHAR,
> >d INTEGER,
> >e INTEGER,
> >n1 FLOAT,
> >n2 FLOAT
> > );
> > CREATE INDEX data_by_a ON data (a);
> >
> > > time sqlite3 db 'select n1 from data where a <= 18234721' >
> > /dev/null
> > 26.15u 0.59s 0:27.00 99.0%
> > > time sqlite3 db 'select n1 from data where a <= 18234721' >
> > /dev/null
> > 26.04u 0.61s 0:26.91 99.0%
> >
> > and
> >
> > > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> > 12.22u 0.41s 0:12.67 99.6%
> > > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> > 12.01u 0.61s 0:12.64 99.8%
> >
> > so just by selecting two different columns (both next to each other). i
> > get another 50% time difference? sure the column types are different,
> > and i can even understand that FLOAT might be 8 bytes and INTEGER 4
> > bytes, but 50% time difference? it just can't be that linear can it?
> >
> > do you think it is possible to get better results by issuing multiple
> > queries each of which will return even less dataset? and, no, i'm not
> > a database guy.
> 
> The most glaring fault in your time tests that I see is that you're running
> the command-line sqlite3, which is printf'ing all 339,000+ rows to dev/null.
> It doesn't show on your screen, but the work is still being done regardless.

well, i do print these rows to /dev/null in the perl test too.

> If you really want a fair speed test, change those statements to:
> 
> select count(n1) from data where a <= 18234721

> time sqlite3 db 'select count(n1) from data where a <= 18234721' > /dev/null
7.79u 0.70s 0:08.50 99.8%

> And
> 
> select count(e) from data where a <= 18234721

> time sqlite3 db 'select count(e) from data where a <= 18234721' > /dev/null
7.90u 0.42s 0:08.31 100.1%

> Or even
> 
> select count(*) from data where a <= 18234721

> time sqlite3 db 'select count(*) from data where a <= 18234721' > /dev/null
1.35u 0.16s 0:01.47 102.7%

8 times faster than count(n1) or count(e)? i'm confused. i guess it
just used the "a" field (on which it has the index)?

> THEN tell us what the difference in performance is ...

fine, if i ask sqlite just to count the rows it wins hands-down, but i
really want these rows. even more, i'd like to then "natural join"
these rows with a couple of other tables to really do what the perl
code currently does.

but it takes 22 seconds just to create a temp table with the
required dataset

> time sqlite3 db 'create temp table foo as select * from data where a <= 
> 18234721' > /dev/null
21.93u 0.89s 0:22.95 99.4%

and i do not understand what i'm doing wrong here :(

thanks,
max


RE: [sqlite] sqlite performance problem

2005-04-11 Thread Robert Simpson
> -Original Message-
> From: Maksim Yevmenkin [mailto:[EMAIL PROTECTED] 
> Sent: Monday, April 11, 2005 9:59 AM
> To: Christian Smith
> Cc: sqlite-users@sqlite.org
> Subject: Re: [sqlite] sqlite performance problem
> 
> i guess i can believe this. however it's pretty disappointing to get
> only a 50% improvement on a 30-times-smaller dataset :(
> 
> but how do you explain this?
> 
> sqlite> .schema data
> CREATE TABLE data
> (
>a INTEGER,
>b INTEGER,
>c CHAR,
>d INTEGER,
>e INTEGER,
>n1 FLOAT,
>n2 FLOAT
> );
> CREATE INDEX data_by_a ON data (a);
> 
> > time sqlite3 db 'select n1 from data where a <= 18234721' > 
> /dev/null
> 26.15u 0.59s 0:27.00 99.0%
> > time sqlite3 db 'select n1 from data where a <= 18234721' > 
> /dev/null
> 26.04u 0.61s 0:26.91 99.0%
> 
> and
> 
> > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> 12.22u 0.41s 0:12.67 99.6%
> > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
> 12.01u 0.61s 0:12.64 99.8%
> 
> so just by selecting two different columns (both next to each other). i
> get another 50% time difference? sure the column types are different,
> and i can even understand that FLOAT might be 8 bytes and INTEGER 4
> bytes, but 50% time difference? it just can't be that linear can it?
> 
> do you think it is possible to get better results by issuing multiple
> queries each of which will return even less dataset? and, no, i'm not
> a database guy.

The most glaring fault in your time tests that I see is that you're running
the command-line sqlite3, which is printf'ing all 339,000+ rows to dev/null.
It doesn't show on your screen, but the work is still being done regardless.
If you really want a fair speed test, change those statements to:

select count(n1) from data where a <= 18234721

And 

select count(e) from data where a <= 18234721

Or even

select count(*) from data where a <= 18234721

THEN tell us what the difference in performance is ...

Robert




Re: [sqlite] 50MB Size Limit?

2005-04-11 Thread Stefan Finzel
What about the OS shell's limits? Look at the commands limit/ulimit/unlimit.

G. Roderick Singleton wrote:
> On Mon, 2005-04-11 at 12:05 -0400, Jonathan Zdziarski wrote:
> > D. Richard Hipp wrote:
> > > Are you sure your users are not, in fact, filling up their disk
> > > drives?
> >
> > nope, plenty of free space on the drives. The 50MB limit seems to be
> > very exact as well...exactly 51,200,000 bytes. I'm stumped too.
>
> quotas?



Re: [sqlite] sqlite performance problem

2005-04-11 Thread Maksim Yevmenkin
Christian,

thanks for the reply.

> >i'm having strange performance problem with sqlite-3.2.0. consider the
> >following table
> >
> > [snip]
> >
> >now the problem:
> >
> >1) if i do a select with an index it takes 27 sec. to get 92 rows
> >
> >> time sqlite3 db 'select n2 from data where a <= 18234721' > /dev/null
> >25.95u 0.71s 0:27.02 98.6%
> >
> >> time sqlite3 db 'select n2 from data where a <= 18234721' > /dev/null
> >26.02u 0.66s 0:27.53 96.9%
> >
> >2) if i do a select with sequential lookup it takes 1min to get 9818210 rows
> >
> >> time sqlite3 db 'select n2 from data where a+0 <= 18234721' > /dev/null
> >49.54u 14.65s 1:04.98 98.7%
> >
> >> time sqlite3 db 'select n2 from data where a+0 <= 18234721' > /dev/null
> >49.80u 14.42s 1:05.03 98.7%
> >
> >- so how come it takes only twice as much time to get 30 times more rows?
> 
> When doing an index scan, you will be accessing the index as well as the
> table pages. For a sequential scan, only the table pages are touched,
> reducing thrashing of the cache.

i guess i can believe this. however it's pretty disappointing to get
only a 50% improvement on a 30-times-smaller dataset :(

but how do you explain this?

sqlite> .schema data
CREATE TABLE data
(
   a INTEGER,
   b INTEGER,
   c CHAR,
   d INTEGER,
   e INTEGER,
   n1 FLOAT,
   n2 FLOAT
);
CREATE INDEX data_by_a ON data (a);

> time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
26.15u 0.59s 0:27.00 99.0%
> time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
26.04u 0.61s 0:26.91 99.0%

and

> time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
12.22u 0.41s 0:12.67 99.6%
> time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
12.01u 0.61s 0:12.64 99.8%

so just by selecting two different columns (both next to each other). i
get another 50% time difference? sure the column types are different,
and i can even understand that FLOAT might be 8 bytes and INTEGER 4
bytes, but 50% time difference? it just can't be that linear can it?

do you think it is possible to get better results by issuing multiple
queries, each of which returns an even smaller dataset? and, no, i'm not
a database guy.

> Also, if the data is not in index order in the table, and/or dispersed
> across the database file, you may have to visit each page more than once
> when traversing in index order. In the full table scan, you'll read the
> rows in table order, hence only touching each page once before moving on
> to later pages, thus reducing cache thrashing even more.

all rows were inserted into the table in the order that matches the index.
i cannot say whether the records on disk actually have the same order.
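
(i suppose i could force the on-disk order by rebuilding the table sorted
on "a" -- just a thought, untested, and the new names are made up:)

sqlite> create table data2 as select * from data order by a;
sqlite> create index data2_by_a on data2 (a);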

> >- and why is it taking 27 seconds to get 92 rows anyway?
> 
> You think 12347 rows/s is bad?

because i have the same data set in a perl multilevel hash stored on
disk (Storable format), and it takes about 1 minute to perform the
same query on the same hardware. this time includes reading the data
from the disk, traversing every single key on each hash level, etc. i
was hoping that sqlite would do something like that in just a few (< 10)
seconds.

> >- is there any way to create an index in separate file?
> 
> No, SQLite is a single file embedded database. Keeps administration
> almost non-existent (by design.)

- so, if i have a large table (> 9 million records) with an index,
then sqlite would have to constantly seek back and forth between the index
and the data (within the same file), probably wasting lots of disk
bandwidth?

- if the above is correct, then can i force sqlite to get all (or as much
as possible) of the index into memory?
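
for example, would bumping the page cache be enough for that? (just a
guess on my part -- i understand cache_size is counted in pages, 1K each
by default, so this would be roughly 256M:)

sqlite> pragma cache_size = 262144;
sqlite> select n1 from data where a <= 18234721;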

> >the hardware is sun netra t1 running solaris 5.7. the db file size is
> >about 800 mbytes.
> >
> >just for the record i'd like to have at least 15 times more records in
> >the 'data' table.
> 
> If you can match SQLite for simple read-only throughput with another
> database, you'll be doing well.

i do not know about other databases, but compared to plain perl it's not
that impressive. i must be doing something wrong here.

thanks,
max


RE: [sqlite] 50MB Size Limit?

2005-04-11 Thread Brad DerManouelian
Mail system likely has a quota.

Check this link:
http://www.webservertalk.com/archive280-2004-6-280358.html 


-Original Message-
From: Jonathan Zdziarski [mailto:[EMAIL PROTECTED] 
Sent: Monday, April 11, 2005 12:27 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] 50MB Size Limit?


G. Roderick Singleton wrote:
> quotas?

That crossed my mind, but all of these databases are being stored in
system space (/usr/local/var/dspam) and owned by the mail system.


Re: [sqlite] 50MB Size Limit?

2005-04-11 Thread Jonathan Zdziarski
G. Roderick Singleton wrote:
> quotas?

That crossed my mind, but all of these databases are being stored in
system space (/usr/local/var/dspam) and owned by the mail system.


Re: [sqlite] 50MB Size Limit?

2005-04-11 Thread G. Roderick Singleton
On Mon, 2005-04-11 at 12:05 -0400, Jonathan Zdziarski wrote:
> D. Richard Hipp wrote:
> > Are you sure your users are not, in fact, filling up their disk
> > drives?
> 
> nope, plenty of free space on the drives. The 50MB limit seems to be 
> very exact as well...exactly 51,200,000 bytes. I'm stumped too.

quotas?
-- 
G. Roderick Singleton <[EMAIL PROTECTED]>
PATH tech



Re: [sqlite] 50MB Size Limit?

2005-04-11 Thread D. Richard Hipp
On Mon, 2005-04-11 at 11:28 -0400, Jonathan Zdziarski wrote:
> Greetings!
> 
> I couldn't find any information on this via google or sqlite.org, so I'm 
> hoping someone can answer this for me.
> 
> We support SQLite v2.x and v3.x as storage backends in DSPAM. I've had a 
> lot of users complain that they get 'Database Full' errors once their 
> file hits 50MB in size, and so I'm wondering if there's a size 
> limitation I should know about, and if there's any possible workaround. 
> The most recent report of this is running 2.8.16. Will upgrading to 3.0 
> solve their problem?
> 

The size limits on SQLite v2 and v3 are like 4TiB (2**42 bytes).
There are no 50MB limits.  Indeed, I create databases larger than
50MB on a regular basis.

Are you sure your users are not, in fact, filling up their disk
drives?
-- 
D. Richard Hipp <[EMAIL PROTECTED]>



[sqlite] 50MB Size Limit?

2005-04-11 Thread Jonathan Zdziarski
Greetings!
I couldn't find any information on this via google or sqlite.org, so I'm 
hoping someone can answer this for me.

We support SQLite v2.x and v3.x as storage backends in DSPAM. I've had a 
lot of users complain that they get 'Database Full' errors once their 
file hits 50MB in size, and so I'm wondering if there's a size 
limitation I should know about, and if there's any possible workaround. 
The most recent report of this is running 2.8.16. Will upgrading to 3.0 
solve their problem?

Thanks for the help!
Jonathan


Re: [sqlite] High throughput and durability

2005-04-11 Thread Andrew Piskorski
On Mon, Apr 11, 2005 at 03:59:56PM +0200, Thomas Steffen wrote:
> I have a problem where I need both a high throughput (10%
> write/delete, 90% read) and durability. My transactions are really
> simple, usually just a single write, delete or read, but it is
> essential that I know when a transaction is committed to disk, so that
> it would be durable after a crash.

Why do you want to do this with SQLite, rather than something like
PostgreSQL?  Sounds like you have both concurrent writers AND
concurrent readers, all at the same time, which is going to totally
hose your performance on SQLite.  Do you have some hard constraint
that requires an embedded in-process database library like SQLite,
rather than a client-server RDBMS?

Even if you MUST have an embedded db, I would still test against
PostgreSQL, as that should tell you whether MVCC can solve your
problems.  Embedded databases that support MVCC and/or other
techniques for much better concurrency do exist, you just might have
to pay for them.

You didn't mention your transaction rate, nor what your application
even is, but general purpose RDBMSs are specifically designed to
support transaction processing adequately, so unless your transaction
rates are truly huge, an RDBMS with MVCC (PostgreSQL, Oracle) would
probably work fine for you.

I suspect it's not your total transaction load that's a problem, it's
simply that SQLite doesn't support the concurrency you need.  Of
course, if that's the case, one solution would be to add MVCC support
to SQLite, as has been discussed on the list in the past.  That would
be cool.  :)

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/


Re: [sqlite] beat 120,000 inserts/sec

2005-04-11 Thread Thomas Steffen
On Apr 11, 2005 4:06 PM, Christian Smith <[EMAIL PROTECTED]> wrote:

> The test given is clearly CPU bound. All the big numbers are from people
> with big CPUs, with equally big RAM performance as well, probably.

I have done a few database tests recently, and I often found them to be
CPU bound, at least as long as the data fits into memory.

If you are looking for a fast system, I think there is nothing
affordable that currently beats an AMD64 (Athlon64 or Opteron). The
built-in memory controller leads to very low memory latency, which is
quite important for databases, and the 64-bit CPU gives you lots of raw
power.

Thomas


Re: [sqlite] High throughput and durability

2005-04-11 Thread Thomas Steffen
On Apr 11, 2005 4:17 PM, Christian Smith <[EMAIL PROTECTED]> wrote:
> On Mon, 11 Apr 2005, Thomas Steffen wrote:

> >Is it possible to delay the fsync(), so that it
> >only occurs after 10 or 100 transactions?
>
> No.

Thought so, because the transaction log seems to happen at a low
level, close to the fsync().

> How about batch operations, so that if you get an error, you roll back the
> batch update, redo only the updates that succeeded up to that point, then
> handle the failed update in its own transaction.

Yes, that is an option, but I either have to encapsulate the
transaction in a function, or expose the complexity to the rest of the
application. In both cases the use becomes more difficult than if I
could just "set" single rows.

> No. You would have to implement replication yourself using triggers maybe,
> or perhaps update the pager layer to synchronise database contents to a
> second file.

I would like to go with a solution on a higher level. That reduces the
risk of copying any corruption. Yes, I can give it a try on the
application level.

Thanks for your help,
Thomas


Re: [sqlite] High throughput and durability

2005-04-11 Thread Christian Smith
On Mon, 11 Apr 2005, Witold Czarnecki wrote:

>rsync could be better.


Neither would do a good job if the database contents change while you're
copying it. There be pain and corruption.

The safest way to take a snapshot is to use the sqlite shell .dump
command, and feed the output of that to another sqlite shell to recreate
the database. Otherwise, you must ensure that the database file is NOT in
use, and no rollback journal exists.
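
For example (a sketch -- substitute your own file names):

  echo '.dump' | sqlite3 live.db | sqlite3 snapshot.db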


>
>Best Regards,
>Witold
>
>>>
>>> And is there a way to automatically replicate the database to a second
>>> system?
>>
>> Copying the database file should give you an exact replica.
>>
>

-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] High throughput and durability

2005-04-11 Thread Christian Smith
On Mon, 11 Apr 2005, Thomas Steffen wrote:

>I have a problem where I need both a high throughput (10%
>write/delete, 90% read) and durability. My transactions are really
>simple, usually just a single write, delete or read, but it is
>essential that I know when a transaction is committed to disk, so that
>it would be durable after a crash.
>
>I can see that sqlite does an fsync() after each COMMIT, so a naive
>implementation gives *very* bad performance. I could batch several
>operations into one transaction, reducing the amount of time waiting
>for fsync() to finish, but I am not sure whether that is the most
>efficient solution. Is it possible to delay the fsync(), so that it
>only occurs after 10 or 100 transactions?


No.


>
>The reason I ask is that I certainly don't want to roll back, if one
>operation fails, because the operations are basically independent of
>each other. And it may be more efficient if the transaction size stays
>small.
>
>Ideas?


How about batch operations, so that if you get an error, you roll back the
batch update, redo only the updates that succeeded up to that point, then
handle the failed update in its own transaction. So long as you apply the
updates in order, you should have a consistent view at all times.
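
Something like this, roughly (just a sketch -- the table and values are
made up, and N is whatever batch size amortises the fsync() well enough):

  BEGIN;
  INSERT INTO events VALUES (1, 'first write');
  INSERT INTO events VALUES (2, 'second write');
  DELETE FROM events WHERE id = 17;
  -- ... up to N independent operations ...
  COMMIT;   -- one fsync() covers the whole batch

If operation k fails, ROLLBACK, replay operations 1 .. k-1 in a fresh
transaction, and then run operation k on its own so only that one fails.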


>
>And is there a way to automatically replicate the database to a second system?


No. You would have to implement replication yourself using triggers maybe,
or perhaps update the pager layer to synchronise database contents to a
second file. But you'll be on your own.

What would the replica be used for? Does it need to be up to date at all
times?


>
>Thomas
>

Christian

-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] High throughput and durability

2005-04-11 Thread Witold Czarnecki
rsync could be better.

Best Regards,
Witold

> > And is there a way to automatically replicate the database to a second
> > system?
>
> Copying the database file should give you an exact replica.



Re: [sqlite] High throughput and durability

2005-04-11 Thread Cory Nelson
On Apr 11, 2005 6:59 AM, Thomas Steffen <[EMAIL PROTECTED]> wrote:
> I have a problem where I need both a high throughput (10%
> write/delete, 90% read) and durability. My transactions are really
> simple, usually just a single write, delete or read, but it is
> essential that I know when a transaction is committed to disk, so that
> it would be durable after a crash.

I actually have the same problem.  The data isn't important, so my
solution was to buffer the data and write it every 30sec.  I couldn't
find a better way :/
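
Roughly the shape of it, if that helps (just a sketch -- the file and
database names are made up, and the real thing lives inside my app rather
than a shell script):

  while true; do
      sleep 30
      [ -f spool.sql ] || continue
      mv spool.sql batch.sql
      ( echo 'BEGIN;'; cat batch.sql; echo 'COMMIT;' ) | sqlite3 app.db
      rm batch.sql
  done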

> I can see that sqlite does an fsync() after each COMMIT, so a naive
> implementation gives *very* bad performance. I could batch several
> operations into one transaction, reducing the amount of time waiting
> for fsync() to finish, but I am not sure whether that is the most
> efficient solution. Is it possible to delay the fsync(), so that it
> only occurs after 10 or 100 transactions?
> 
> The reason I ask is that I certainly don't want to roll back, if one
> operation fails, because the operations are basically independent of
> each other. And it may be more efficient if the transaction size stays
> small.
> 
> Ideas?
> 
> And is there a way to automatically replicate the database to a second system?

Copying the database file should give you an exact replica.

> Thomas
> 


-- 
Cory Nelson
http://www.int64.org


Re: [sqlite] beat 120,000 inserts/sec

2005-04-11 Thread Christian Smith
On Sat, 9 Apr 2005, Al Danial wrote:

>On Apr 9, 2005 12:43 AM, Andy Lutomirski <[EMAIL PROTECTED]> wrote:
>> Al Danial wrote:
>> > The attached C program measures insert performance for populating
>> > a table with an integer and three random floating point values with
>> > user defined transaction size.  Usage is:
>> >
>> > ./sqlite_insert 
>>
>> All of these are on Gentoo, Athlon 64 3200+, running 64 bit.
>>
>> Writing to /tmp, which is ext3 (acl,user_xattr,usrquota) over RAID0 (two
>> slave drives on different channels):
>>
>> $ ./sqlitetest 10 5
>>   100000 inserts to /tmp/a.db in 0.531 s = 188446.34 inserts/s
>> $ ./sqlitetest 200 5
>>   2000000 inserts to /tmp/a.db in 11.546 s = 173223.61 inserts/s
>
>That's impressive.  Clearly a well-implemented RAID0 configuration
>and fast disks make a huge difference.  This will be the direction I'll
>take.


My machine, 2x Athlon XP 1700+, 512MB RAM, U160 SCSI (3x10K disks):
[EMAIL PROTECTED] tmp]$ ./sqlite_insert 10 5
 100000 inserts to ./a.db in 2.495 s = 40081.96 inserts/s
[EMAIL PROTECTED] tmp]$ ./sqlite_insert 200 5
 2000000 inserts to ./a.db in 47.068 s = 42491.71 inserts/s

While I have a big fat SCSI IO system, vmstat during the run only shows
about 2MB/s ever going to the HD, well within the capabilities of
any SCSI or IDE HD (Laptop even!)
[EMAIL PROTECTED] csmith]$ vmstat 1
procs ---memory-- ---swap-- -io --system-- cpu
 r  b   swpd   free   buff  cache   si   sobibo   incs us sy id wa
 3  0  93296 116156  30424 22960400 914   6231 96  3  1  0
 3  0  93296 114308  30484 23154800 0  2167  333  1194 99  1  0  0
 7  0  93296 112488  30512 23348800 0  2064  320  1092 99  1  0  0
 4  0  93296 110716  30540 23543200 0  2072  314  1060 99  1  0  0
 4  0  93296 108844  30568 23750400   128  2068  319  1069 99  1  0  0
 4  0  93296 107380  30596 23944400 0  2076  383  1252 98  2  0  0
 2  2  93296 103352  30640 24139200 0  2101  321   96  4  0  0



>
>Thanks to everyone who posted performance numbers and machine
>setup info.  Some results were counterintuitive (I'd have guessed
>SCSI drives would come out on top) but many variables are at work
>so I won't try to draw too many conclusions.


The test given is clearly CPU bound. All the big numbers are from people
with big CPUs, with equally big RAM performance as well, probably.


>Also thanks to the suggestions to study the pragma's.  I did try
>SYNCHRONOUS=off but that didn't seem to have an effect; I'll
>study the docs to make sure I've got it right. -- Al


For CPU bound work, it won't. You appear to be maxing out what the CPU and
memory subsystem can handle. Tweaking IO will not help. Using ":memory:"
gave about a 10% speedup, so that looks like the maximum amount of speedup
you could possibly hope for.


Christian


-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


[sqlite] High throughput and durability

2005-04-11 Thread Thomas Steffen
I have a problem where I need both a high throughput (10%
write/delete, 90% read) and durability. My transactions are really
simple, usually just a single write, delete or read, but it is
essential that I know when a transaction is committed to disk, so that
it would be durable after a crash.

I can see that sqlite does an fsync() after each COMMIT, so a naive
implementation gives *very* bad performance. I could batch several
operations into one transaction, reducing the amount of time waiting
for fsync() to finish, but I am not sure whether that is the most
efficient solution. Is it possible to delay the fsync(), so that it
only occurs after 10 or 100 transactions?

The reason I ask is that I certainly don't want to roll back, if one
operation fails, because the operations are basically independent of
each other. And it may be more efficient if the transaction size stays
small.

Ideas? 

And is there a way to automatically replicate the database to a second system? 

Thomas