[sqlite] Distributed transaction best practices

David Barrett Sun, 11 May 2008 23:26:23 -0700

What are the recommended "best practices" around using SQLite in a 
distributed scenario?


(I saw Shailesh's email in March; I'm considering building a two-phase 
commit layer atop SQLite nodes and looking for advice.)

Clearly, this is a big topic including failover, recovery, 
load-balancing, and so on.  But right now I'm most curious about the 
transaction layer.

The most obvious approach is to just process all transactions with a 
basic serialized two-phase commit protocol:

1) Master begins transaction in the local database
2) Master sends query to all slaves
3) Slaves begin transaction in their local database
4) Slaves send approval to the master
5) Master commits to the local database
6) Master sends commit command to all slaves
7) Slaves commit the transaction
8) Goto 1

This works great and is what I have now.

But this can only process one transaction at a time, and the commit 
speed is no faster than the slowest slave latency.  I'm curious if you 
can recommend a better way?  (I'm sure this is well documented 
somewhere; any suggestions?)

For example, it seems that transactions could be "pipelined" to decouple 
throughput from latency: after all, barring edge conditions, every slave 
should approve every transaction.  (And any failed transaction would 
probably result in all future transactions failing as well -- eg disk 
fail/full/corrupt.)

So the master could send a constant stream of transactions, and the 
slave could send a constant stream of approvals, and everybody is happy. 
  The only tricky case is in the rare case the slave actually does need 
to abort the transaction.

Assuming there are 10 transactions that haven't yet been committed, the 
slave could commit the first 3 and then abort the remaining 7.  This 
would look like:

1) Master begins 10 transaction in the local database
2) Master sends 10 queries to slave
3) Slave begins 3 transactions successfully, but the next 7 fail.
4) So the slave send 3 approval messages, and then 7 abort messages
5) Master commits the first 3, and then rolls back the remaining 7
6) Master sends commit messages for the first 3, and rollbacks for 7
7) Slave commits the first 3, and rolls back the last 7

Seems pretty straightforward in theory.  But my question is how to 
actually accomplish step 5 with SQLite.

I read a big discussion on nested transactions in the list archives from 
a year ago, and I understand they're not currently supported.  But I'm 
not sure what I want to do is actually nested transactions (else I'd 
nest infinitely deep before committing anything).

Rather, I'm thinking of some kind of... serial commits, or savepoints or 
something.  Basically, maintain a set of "savepoints" that I can revert 
the entire database to at any time.  Then I just commit transactions in 
order, and create a new "savepoint" after each.  Later, when I get the 
approval from the slave, I delete the associated savepoint (to save 
resources, presumably).  Alternatively, if I get a rollback from a 
slave, then I just revert to the savepoint before that transaction and 
-- effect -- rollback every transaction made after.

Can you think of any good way to accomplish this?

One horrifically bad way would be to copy the database after each 
transaction.

One less horrible way would be to manually maintain an undo log (just 
create SQL statements that -- when applied in reverse -- gradually roll 
back the database to any previous state).

        http://www.sqlite.org/cvstrac/wiki?p=UndoRedo

But I'm curious if there are any better ways to go about it.  I'm 
guessing this is a pretty well-explored area -- anybody who's gone ahead 
care to share their wisdom?

Thanks!

-david

_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Distributed transaction best practices

Reply via email to