Hi! SQLite 3.2.1 on win32.
I'm having trouble wrapping my mind around where I have to be
paranoid about locking issues. In particular, the behavior when
readers and writers overlap, or two writers overlap, is not 100%
clear to me from an obsessive reading and rereading of the locking
documentation, google, the wiki and the mailing list archives.
So here is what I believe; if anything I say is inaccurate, I
would have undying appreciation for a correction.
Consider the schema:
create table employee (
id integer primary key,
name text,
phone text
);
create table department (
id integer primary key,
name text
);
create table emp_dep ( -- many to many relation
id integer primary key,
emp_id integer,
dep_id integer
);
create unique index emp_index on employee(name) on conflict replace;
create unique index dep_index on department(name) on conflict replace;
create unique index emp_dep_index on emp_dep(emp_id, dep_id)
on conflict replace;
The operations that will hit this schema in pseudocode:
1. the 'reader' operation (happens asynchronously, often)
open db
prepare the following sql statement
select e.name, d.name from employee e, department d, emp_dep ed
where e.id = ed.emp_id
and d.id = ed.dep_id
and e.name = '?'
for each employee I'm interested in (1..many)
execute
scroll through results
close the query
end for
close sql prepare
close db
2. the 'writer' operation (happens asynchronously, more rarely)
open db
prepare the following sql statements
A: insert into employee(name, phone) values (?, ?)
B: insert into department(name) values (?, ?)
C: insert into emp_dep(emp_id, dep_id) values (?, ?)
begin deferred transaction
for each employee I'm interested in (1..many)
execute A
save last autoincrement
execute B
save last autoincrement
execute C (using the two autoincrements)
end for
commit
close sql prepares
close db
Here is the reader operation again, this time marked up. ALL CAPS
means that something else could have a lock preventing the operation
from successfully completing.
OPEN DB -- a writer could be in the middle of writing
prepare the following sql statement -- only unsafe if schema changes
for each employee I'm interested in (1..many)
EXECUTE -- a writer could have jumped in
scroll (slowly) through the results -- uninterruptible
close the query
end for
close sql prepare
close db
Here is the writer operation, marked up with ALL CAPS for ops
that could fail due to some other writer or reader.
OPEN DB -- a writer could be operating
prepare the following sql statements -- only unsafe if schema changes
BEGIN DEFERRED TRANSACTION -- a writer or reader could cause BUSY
for each employee I'm interested in (1..many)
EXECUTE A -- the transaction could spill into EXCLUSIVE & BUSY
-- thus conflicting with an extant SHARED
SAVE LAST AUTOINCREMENT -- as above
EXECUTE B -- as above
SAVE LAST AUTOINCREMENT -- as above
EXECUTE C -- as above
end for
COMMIT -- writer or reader could cause busy
close sql prepares
close db
In the comment for 'execute A' above, it seems to me that the pager could
conceivably try to force the transaction into EXCLUSIVE mode at any time
(dependent on memory settings etc. outside the control of the API).
So every call to execute a SQL statement could get the BUSY treatment,
and thus must be explicitly wrapped in retry handling code.
This further suggests that a naive approach to executing sql -- putting
multiple sql statements in one string and passing it to exec -- is
unsafe at any speed; consider
begin deferred transaction;
update...;
update...;
update...;
commit;
if one were to pass that valid SQL to exec, and then get a BUSY back,
there is no recovery possible. The pager could have decided to force
into exclusive (and thus BUSY) for memory cache spill reasons at any
time (e.g., after the 1st, or 3rd update), OR it could be because of
the commit. Unless somehow transactions always stay in the memory
cache, which doesn't seem from the documentation to be the case.
In the unlikely event that Dr. Hipp has read this far, it would be
a welcome addition to the documentation to indicate which API functions
may throw exceptions, and which exceptions, in a multi-user scenario.
Additionally, orthogonal to the question of row level locking, it would
be nice to have a set of synchronous APIs that return BUSY only after
a per-connection configurable delay. And I would also like a pony.
Felix