Hello,

"D. Richard Hipp" <[EMAIL PROTECTED]> wrote on 23/11/2003 12:09 AM,
subject "[sqlite] Improving concurrency in SQLite":

> Feedback is requested on the paper found at:
>     http://www.sqlite.org/concurrency.html
> Your analysis and ideas will help to make SQLite a
> better database engine.  Thanks.

I like the ideas in this document. I particularly like the deferred write
locks by default of section 4.3, and the blocking locks of section 4.5. I
don't much like the idea of deferred writes from section 4.4, nor am I
intuitively keen on a client-server model or a multi-file model for SQLite.

Some thoughts:

1)
For section 4.3, I wonder whether you've also considered the possibility
of deferred read locks. Consider the following sequence:
BEGIN;
select * from table1;
# go away and do processing on results for a while
update table2 ...;
END;
If fine-grained locking were introduced, a lock would have to be placed on
table1 immediately, but the lock on table2 might not be necessary until
later. A writer could still modify table2 while the reader is processing
the results from table1.
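To make the idea concrete, here's a rough C sketch of what I mean by
deferring the locks. This is not real SQLite code; Table, lockTable() and
tableFirstTouched() are just names I've made up for illustration:

/* Sketch only: take a table's lock the first time a statement touches
** it, rather than at BEGIN.  In the example above, table2 would not be
** locked until the UPDATE actually runs. */
typedef struct Table {
  const char *zName;      /* table name */
  int eLock;              /* 0 = no lock, 1 = read lock, 2 = write lock */
} Table;

static void lockTable(Table *p, int eLock){
  /* ... block until the requested lock on p is granted ... */
  p->eLock = eLock;
}

/* Called by the execution layer on the first read or write of a table
** inside a transaction.  BEGIN itself would take no locks at all. */
static void tableFirstTouched(Table *p, int eLock){
  if( p->eLock < eLock ){
    lockTable(p, eLock);
  }
}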

2)
Fine-grained locking also has problems outside of the issues mentioned in 
your document. Consider the following transaction running at the same time
as the one above (the same statements, with the tables swapped):
BEGIN;
select * from table2;
# go away and do processing on results for a while
update table1 ...;
END;
If both executed simultaneously, each would hold a read lock, either on the
whole database or on the table that the other is about to update. If
blocking locks are used this leads to deadlock: the first thread to attempt
its update blocks waiting for the other's read lock, the second then blocks
in the same way, and neither can ever wake up. It's still a problem with
non-blocking locks, though. At the moment, once you've begun a transaction
you know you're not going to get locked out, but with these changes
programmers must be much more defensive and prepared to deal with locking
failures at potentially any SQL statement.
I think you should at least canvass these issues alongside your
fine-grained locking discussions, and possibly suggest solutions. If
table-based locking is used, one solution in this case would be to always
lock table1 before table2 whenever a lock on table2 is required. Because
everyone then locks tables in the same order, deadlock is impossible. Of
course, this can drastically reduce the concurrency gains that might
otherwise have been obtained. The only other approaches I can think of are
to parse the SQL of the entire transaction to find every table that might
be affected and lock them in some standard order, or to require the
programmer to "forward-declare" the tables that are to be modified. This
gets even stickier for row-level locking.
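To illustrate the fixed-global-order idea, here's a rough C sketch (again,
not SQLite code; TableLock, lockTable() and lockAll() are made up). If
every transaction sorts the tables it needs by some canonical key, say the
table name, and takes the locks in that order, a cycle of waiters cannot
form:

#include <stdlib.h>
#include <string.h>

typedef struct TableLock {
  const char *zName;      /* canonical ordering key: the table name */
  int exclusive;          /* non-zero for a write lock */
} TableLock;

static int compareByName(const void *a, const void *b){
  return strcmp(((const TableLock*)a)->zName,
                ((const TableLock*)b)->zName);
}

static void lockTable(const TableLock *p){
  /* ... block until the requested lock on p->zName is granted ... */
}

/* Take every lock the transaction will need, in canonical order.  Note
** that this requires knowing the full table set up front, which is
** exactly the "parse the whole transaction" or "forward-declare"
** problem mentioned above. */
static void lockAll(TableLock *aLock, int nLock){
  int i;
  qsort(aLock, nLock, sizeof(TableLock), compareByName);
  for(i=0; i<nLock; i++){
    lockTable(&aLock[i]);
  }
}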

3)
I mentioned earlier that I don't like the idea of a client-server model
for the database. I say this mainly because of what it would mean for the
security model: introducing such a system would make for a completely
different database than SQLite currently provides. You don't talk about
this in your document, so I'll leave my comment on the subject at that ;)

4)
I mentioned earlier that I don't like the idea of multiple files for the
database. In particular, one of the touted advantages of SQLite over, say,
embedded MySQL is that it is a single file that can be copied around,
emailed, and backed up. Well... that's not strictly true. There's the
journal file.
This gets me thinking... do we really need a journal that's separate from
the main database, or could we create one using pages of the main database
itself? I'm only thinking this through as I write it, so please excuse any
vagueness... :) but if this could be done, I think you might be able to
think of the transaction in a slightly different way. Say you had three
pages in your database:
(a),(b),(c)
Maybe you want to do a transaction that affects page b. You might be able 
to manage the log inside the same file:
(a),(b),(c),(b')
The original b page is still there... and still available for readers to
query. Even though the new b page, b', is in the database and being
modified by the transaction, concurrency is not lost.
Now it comes time to commit. The thread managing the transaction locks all
other processes out of the database for this step, then either redirects
all b references to b', or moves the b' page over the top of b. The
redirection is probably the better way. Under this model you would probably
virtualise the page numbering, so that the page "1" shown to the btree
layer might be page 3 on disk. This would have to be done in a way that
protects against process or machine crashes, but it might be the only point
in the transaction where that level of care is really necessary. I suppose
you'd still need your dirty list, and may also require a pending-transaction
page list, which you'd need to manage carefully.
Anyway.
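In case it helps, here's a rough C sketch of the "virtualised page
numbers" part of this (not SQLite code; PageMap and the three functions
are invented for the example). The btree layer would see logical page
numbers, and a small map would say which physical page in the same file
currently backs each one:

typedef unsigned int Pgno;

typedef struct PageMap {
  Pgno *aPhys;       /* aPhys[logical] == physical page backing it */
  Pgno nPage;        /* number of logical pages in use */
  Pgno nAlloc;       /* next free physical page at the end of the file */
} PageMap;

/* Readers: translate a logical page number into the physical page that
** holds the committed data. */
static Pgno mapLookup(const PageMap *pMap, Pgno logical){
  return pMap->aPhys[logical];
}

/* Writers: allocate a shadow page b' for logical page b and copy the
** old contents into it.  The original physical page is untouched, so
** readers carry on unaffected. */
static Pgno mapShadow(PageMap *pMap, Pgno logical){
  Pgno shadow = pMap->nAlloc++;
  /* ... copy the page at mapLookup(pMap, logical) into 'shadow' ... */
  return shadow;
}

/* Commit: with everyone else briefly locked out, flip the map entry so
** the logical page now points at its shadow copy.  This is the one
** step that has to be made crash-safe. */
static void mapCommit(PageMap *pMap, Pgno logical, Pgno shadow){
  pMap->aPhys[logical] = shadow;
}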
If you can get something like that working, perhaps you can implement a
page-level locking system where multiple transactions could be in play.
Readers would be able to access all "original" pages while the writing is
going on, until each transaction is committed. Locks on the individual
pages would not prevent reader access while the transactions are in
progress, but would prevent writers from modifying the same page at the
same time. As with other fine-grained locking systems, you'd need a way to
prevent deadlock if multiple writers are permitted.
I guess if this kind of system could work, I'd like to see it implemented.
I would guess it would only really affect pager.c and wouldn't have a huge
impact elsewhere. I'd also see it as an extensible model that wouldn't
require more and more files as the database becomes more complicated.

Anyway :) My thoughts. Feel free to do what you like with them.

Benjamin.


