Re: [sqlite] High throughput and durability

2005-04-11 Thread Andrew Piskorski
On Mon, Apr 11, 2005 at 03:59:56PM +0200, Thomas Steffen wrote:
> I have a problem where I need both a high throughput (10%
> write/delete, 90% read) and durability. My transactions are really
> simple, usually just a single write, delete or read, but it is
> essential that I know when a transaction is committed to disk, so that
> it would be durable after a crash.

Why do you want to do this with SQLite, rather than something like
PostgreSQL?  Sounds like you have both concurrent writers AND
concurrent readers, all at the same time, which is going to totally
hose your performance on SQLite.  Do you have some hard constraint
that requires an embedded in-process database library like SQLite,
rather than a client-server RDBMS?

Even if you MUST have an embedded db, I would still test against
PostgreSQL, as that should tell you whether MVCC can solve your
problems.  Embedded databases that support MVCC and/or other
techniques for much better concurrency do exist; you just might have
to pay for them.

You didn't mention your transaction rate, nor what your application
even is, but general purpose RDBMSs are specifically designed to
support transaction processing adequately, so unless your transaction
rates are truly huge, an RDBMS with MVCC (PostgreSQL, Oracle) would
probably work fine for you.

I suspect it's not your total transaction load that's a problem, it's
simply that SQLite doesn't support the concurrency you need.  Of
course, if that's the case, one solution would be to add MVCC support
to SQLite, as has been discussed on the list in the past.  That would
be cool.  :)

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/


Re: [sqlite] High throughput and durability

2005-04-11 Thread Thomas Steffen
On Apr 11, 2005 4:17 PM, Christian Smith <[EMAIL PROTECTED]> wrote:
> On Mon, 11 Apr 2005, Thomas Steffen wrote:

> >Is it possible to delay the fsync(), so that it
> >only occurs after 10 or 100 transactions?
>
> No.

Thought so, because the transaction log seems to happen at a low
level, close to the fsync().

> How about batch operations, so that if you get an error, you roll back the
> batch update, then redo only the updates that succeeded up to that point, and
> handle the failed update in its own transaction.

Yes, that is an option, but I would either have to encapsulate the
transaction in a function or expose the complexity to the rest of the
application. In both cases usage becomes more difficult than if I
could just "set" single rows.
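
For illustration, a minimal sketch of such a wrapper, using the Python
sqlite3 (pysqlite) module; the table, column names and batch size are
made up, not part of any real schema:

  import sqlite3

  class RowStore:
      """Callers just set() single rows; commits happen once per batch."""
      def __init__(self, path, batch_size=100):
          self.con = sqlite3.connect(path)
          self.con.execute(
              "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")
          self.con.commit()
          self.batch_size = batch_size
          self.pending = 0

      def set(self, key, value):
          self.con.execute(
              "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))
          self.pending += 1
          if self.pending >= self.batch_size:
              self.flush()

      def flush(self):
          self.con.commit()   # the fsync() happens here, once per batch
          self.pending = 0

The catch is exactly the one above: a row is not durable until flush()
has run, so the caller still has to know (or be told) when that happens.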

> No. You would have to implement replication yourself, perhaps using
> triggers, or update the pager layer to synchronise database contents to a
> second file.

I would prefer a solution at a higher level; that reduces the
risk of copying any corruption. Yes, I can give it a try at the
application level.

Thanks for your help,
Thomas


Re: [sqlite] High throughput and durability

2005-04-11 Thread Christian Smith
On Mon, 11 Apr 2005, Witold Czarnecki wrote:

>rsync could be better.


Neither would do a good job if the database contents change while you're
copying it. There be pain and corruption.

The safest way to take a snapshot is to use the sqlite shell .dump
command, and feed the output of that to another sqlite shell to recreate
the database. Otherwise, you must ensure that the database file is NOT in
use, and no rollback journal exists.
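
For what it's worth, the same dump-and-reload idea can be scripted
through the Python sqlite3 (pysqlite) module, assuming a version that
provides Connection.iterdump(); the file names here are just
placeholders:

  import sqlite3

  src = sqlite3.connect("live.db")       # database being snapshotted
  dst = sqlite3.connect("replica.db")    # fresh file for the copy

  # iterdump() yields the schema and data as SQL text, much like the
  # shell's .dump command; executescript() replays it into the new file.
  dst.executescript("\n".join(src.iterdump()))
  dst.close()
  src.close()

Because this reads through SQLite rather than copying the raw file, it
at least cannot pick up a half-written page; for a snapshot that is
consistent across tables you would still want the source quiet while it
runs.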


>
>Best Regards,
>Witold
>
>>>
>>> And is there a way to automatically replicate the database to a second
>>> system?
>>
>> Copying the database file should give you an exact replica.
>>
>

-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] High throughput and durability

2005-04-11 Thread Christian Smith
On Mon, 11 Apr 2005, Thomas Steffen wrote:

>I have a problem where I need both a high throughput (10%
>write/delete, 90% read) and durability. My transactions are really
>simple, usually just a single write, delete or read, but it is
>essential that I know when a transaction is committed to disk, so that
>it would be durable after a crash.
>
>I can see that sqlite does an fsync() after each COMMIT, so a naive
>implementation gives *very* bad performance. I could combine several
>operations into one transaction, reducing the amount of time waiting
>for fsync() to finish, but I am not sure whether that is the most
>efficient solution. Is it possible to delay the fsync(), so that it
>only occurs after 10 or 100 transactions?


No.


>
>The reason I ask is that I certainly don't want to roll back, if one
>operation fails, because the operations are basically independent of
>each other. And it may be more efficient if the transaction size stays
>small.
>
>Ideas?


How about batch operations, so that if you get an error, you roll back the
batch update, then redo only the updates that succeeded up to that point, and
handle the failed update in its own transaction. So long as you do the
updates in order, you should have a consistent view at all times.
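
A rough sketch of that pattern, again with the Python sqlite3
(pysqlite) module; apply() stands in for whatever a single update
actually does and is not a real API:

  import sqlite3

  def run_batch(con, updates, apply):
      """Apply 'updates' in order.  Try them as one transaction; if one
      fails, roll back, redo the ones that had succeeded, give the bad
      update its own transaction, then carry on with the rest."""
      while updates:
          try:
              for i, u in enumerate(updates):
                  apply(con, u)
              con.commit()                 # one fsync() for the whole batch
              return
          except sqlite3.Error:
              con.rollback()
              good, bad, rest = updates[:i], updates[i], updates[i + 1:]

          for u in good:                   # redo what had already worked
              apply(con, u)
          con.commit()

          try:                             # the troublemaker, on its own
              apply(con, bad)
              con.commit()
          except sqlite3.Error:
              con.rollback()               # drop just this one update

          updates = rest                   # continue with the remainder

This keeps the updates in order, as noted above, at the cost of doing
the work before the failure point twice.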


>
>And is there a way to automatically replicate the database to a second system?


No. You would have to implement replication yourself, perhaps using
triggers, or update the pager layer to synchronise database contents to a
second file. But you'll be on your own.

What would the replica be used for? Does it need to be up to date at all
times?


>
>Thomas
>

Christian

-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] High throughput and durability

2005-04-11 Thread Witold Czarnecki
rsync could be better.

Best Regards,
Witold

>> And is there a way to automatically replicate the database to a second
>> system?
>
> Copying the database file should give you an exact replica.



Re: [sqlite] High throughput and durability

2005-04-11 Thread Cory Nelson
On Apr 11, 2005 6:59 AM, Thomas Steffen <[EMAIL PROTECTED]> wrote:
> I have a problem where I need both a high throughput (10%
> write/delete, 90% read) and durability. My transactions are really
> simple, usually just a single write, delete or read, but it is
> essential that I know when a transaction is committed to disk, so that
> it would be durable after a crash.

I actually have the same problem.  The data isn't important, so my
solution was to buffer the data and write it out every 30 seconds.  I
couldn't find a better way :/
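
In case it is useful, a bare-bones version of that buffering approach
with the Python sqlite3 (pysqlite) module; the table, schema and the
30-second interval are only illustrative, and anything still sitting in
the buffer is of course lost in a crash:

  import sqlite3, threading

  class BufferedWriter:
      """Queue rows in memory, flush them as one transaction every 30s."""
      def __init__(self, path, interval=30.0):
          # check_same_thread=False because the Timer callback below
          # runs in a different thread than the one that writes.
          self.con = sqlite3.connect(path, check_same_thread=False)
          self.con.execute("CREATE TABLE IF NOT EXISTS log (ts REAL, msg TEXT)")
          self.con.commit()
          self.rows = []
          self.lock = threading.Lock()
          self.interval = interval
          threading.Timer(self.interval, self._flush).start()

      def write(self, ts, msg):
          with self.lock:
              self.rows.append((ts, msg))     # buffered, NOT yet durable

      def _flush(self):
          with self.lock:
              rows, self.rows = self.rows, []
          self.con.executemany("INSERT INTO log VALUES (?, ?)", rows)
          self.con.commit()                   # a single fsync() for the batch
          threading.Timer(self.interval, self._flush).start()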

> I can see that sqlite does an fsync() after each COMMIT, so a naive
> implementation gives *very* bad performance. I could combine several
> operations into one transaction, reducing the amount of time waiting
> for fsync() to finish, but I am not sure whether that is the most
> efficient solution. Is it possible to delay the fsync(), so that it
> only occurs after 10 or 100 transactions?
> 
> The reason I ask is that I certainly don't want to roll back, if one
> operation fails, because the operations are basically independent of
> each other. And it may be more efficient if the transaction size stays
> small.
> 
> Ideas?
> 
> And is there a way to automatically replicate the database to a second system?

Copying the database file should give you an exact replica.

> Thomas
> 


-- 
Cory Nelson
http://www.int64.org