[sqlite] Bitten by lack of isolation between SELECT and UPDATE on the same connection

Jens Alfke Mon, 30 Jan 2017 19:30:07 -0800

I’ve just run headlong into the issues described in “No Isolation Between 
Operations On The Same Database Connection” [1]. Specifically, I’ve discovered 
(after some debugging) that if I iterate over the rows in a table using 
sqlite3_step, and update each row after it’s returned, Bad Stuff happens: 
my query just gets the first row over and over and over 
again, and the iteration runs forever. :(
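
For concreteness, here is a minimal sketch of the pattern, written against Python's built-in sqlite3 module rather than the C API (iterating a cursor steps the underlying statement with sqlite3_step). The table t, the index t_val and the CAP limit are made up for illustration and are not my real schema. Whether rows actually come back more than once depends on the query plan, since the result of this interleaving is undefined.

```python
import sqlite3

# Hypothetical schema: the SELECT walks an index on the very column
# that the loop body modifies.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
conn.execute("CREATE INDEX t_val ON t (val)")
conn.executemany("INSERT INTO t (val) VALUES (?)", [(1,), (2,), (3,)])

CAP = 20  # safety limit so the sketch always terminates
steps = 0
cursor = conn.execute("SELECT id, val FROM t ORDER BY val")
for row_id, val in cursor:  # the cursor steps the statement row by row
    steps += 1
    if steps >= CAP:
        break
    # Same connection: this change may be visible to the running SELECT.
    conn.execute("UPDATE t SET val = val + 100 WHERE id = ?", (row_id,))
cursor.close()

print(f"stepped {steps} times over a 3-row table")
```

If the count printed is larger than 3, the running query has seen its own connection's updates and revisited rows.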


I had been under the impression that, since I’m using the WAL, queries operate 
on a snapshot of the database as of the time they begin, and are unaffected by 
subsequent changes. I got this from reading about “snapshot isolation” in a 
previous section of that document. (Also, another key/value database engine 
I’ve used recently _does_ behave this way, so it’s what I was expecting.) I now 
see that the “read transaction” described in that section has to be occurring 
in a different connection than the write transaction. (Right?)
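
A small sketch of the two-connection case, as I understand it. It assumes WAL mode on an ordinary local file, uses Python's sqlite3 module in autocommit mode (isolation_level=None), and the file name demo.db and table kv are hypothetical.

```python
import os
import sqlite3
import tempfile

# WAL mode needs a real file; an in-memory database cannot use it.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

def read_value(conn):
    # fetchall() runs the statement to completion, leaving nothing pending.
    return conn.execute("SELECT value FROM kv WHERE key = 'a'").fetchall()[0][0]

writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value INTEGER)")
writer.execute("INSERT INTO kv VALUES ('a', 1)")

reader = sqlite3.connect(path, isolation_level=None)
reader.execute("BEGIN")  # deferred: the snapshot is taken at the first read
before = read_value(reader)

# Commit a change on the other connection while the read transaction is open.
writer.execute("UPDATE kv SET value = 2 WHERE key = 'a'")

during = read_value(reader)  # still inside the reader's transaction
reader.execute("COMMIT")     # end the read transaction, releasing the snapshot
after = read_value(reader)

print(before, during, after)
reader.close()
writer.close()
```

With WAL active this should print 1 1 2: the reader keeps its snapshot until its own transaction ends, and only then sees the writer's commit.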

I’m unsure what to do now. I am working on a library whose API exposes iterator 
objects that run queries; the iterator’s “next()” method internally calls 
sqlite3_step. Thus the interleaving of the query and updating the database is 
not under my control; it’s up to the developer using our library, and I do 
_not_ want to expose inconvenient undefined behavior like this, or tell 
developers that “you can’t modify the database while you’re iterating it”.

I can’t be the first person to run into this. Is there a best practice for 
enabling concurrent iteration and mutation? I can think of two solutions:

A. Batch up all of the query results in memory at the start of the iteration, 
and have the iterator just read them out of the in-memory list.
        I’d like to avoid this because of the obvious memory overhead and 
latency imposed on large queries. Version 1 of our library worked this way, 
which is probably why I hadn’t noticed the problem until now.

B. Create a separate SQLite connection for the query; then it’ll be isolated 
from any changes being made in the main connection.
        This seems elegant, but it will of course use more memory for the extra 
connection (with its own cache). Moreover, it seems like I’ll need to open an 
indefinite number of extra connections: if the caller starts a query, makes 
some changes, and then starts another query (before reading the final row of 
the first query), I need to open another connection for the second query 
because it has to see the changes, which aren’t yet visible in the first 
query's connection … right?
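
To make the two options concrete, here is a rough sketch of both using Python's sqlite3 module. The helper names buffered_rows and isolated_rows, the table t and the file demo.db are all hypothetical, and option B assumes the database is in WAL mode so the reader does not block the writer.

```python
import os
import sqlite3
import tempfile

def buffered_rows(conn, sql, params=()):
    """Option A: read every row up front, then iterate over the list."""
    # fetchall() runs the statement to completion before any row is returned.
    return iter(conn.execute(sql, params).fetchall())

def isolated_rows(path, sql, params=()):
    """Option B: stream rows from a dedicated connection (WAL mode assumed)."""
    reader = sqlite3.connect(path, isolation_level=None)
    try:
        # The running SELECT holds a read transaction on `reader`, so it keeps
        # seeing the snapshot from when the statement was first stepped.
        for row in reader.execute(sql, params):
            yield row
    finally:
        reader.close()

path = os.path.join(tempfile.mkdtemp(), "demo.db")
main = sqlite3.connect(path, isolation_level=None)  # autocommit mode
main.execute("PRAGMA journal_mode=WAL")
main.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
main.execute("CREATE INDEX t_val ON t (val)")
main.executemany("INSERT INTO t (val) VALUES (?)", [(1,), (2,), (3,)])

QUERY = "SELECT id, val FROM t ORDER BY val"
UPDATE = "UPDATE t SET val = val + 100 WHERE id = ?"

seen_a = []
for row_id, val in buffered_rows(main, QUERY):
    seen_a.append(val)
    main.execute(UPDATE, (row_id,))  # safe: the SELECT already finished

seen_b = []
for row_id, val in isolated_rows(path, QUERY):
    seen_b.append(val)
    main.execute(UPDATE, (row_id,))  # invisible to the reader's snapshot

final = [v for (v,) in main.execute("SELECT val FROM t ORDER BY id")]
print(seen_a, seen_b, final)
main.close()
```

Each iteration visits every row exactly once even though the main connection updates the indexed column inside the loop. Note that isolated_rows opens one connection per query, which is exactly the cost I'm worried about in B.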

—Jens

[1]: https://www.sqlite.org/isolation.html
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
