I’ve just run headlong in to the issues described in "No Isolation Between
Operations On The Same Database Connection”. Specifically, I’ve discovered
(after some debugging) that if I iterate over the the rows in a table using
sqlite3_step, and update each row after it’s returned, Bad Stuff happens.
Specifically, my query is just getting the first row over and over and over
again, and the iteration runs forever. :(
I had been under the impression that, since I’m using the WAL, queries operate
on a snapshot of the database as of the time they begin, and are unaffected by
subsequent changes. I got this from reading about "snapshot isolation” in a
previous section of that document. (Also, another key/value database engine
I’ve used recently _does_ behave this way, so it’s what I was expecting.) I now
see that the “read transaction” described in that section has to be occurring
in a different connection than the write transaction. (Right?)
I’m unsure what to do now. I am working on a library whose API exposes iterator
objects that run queries; the iterator’s “next()” method internally calls
sqlite3_step. Thus the interleaving of the query and updating the database is
not under my control; it’s up to the developer using our library, and I do
_not_ want to expose inconvenient undefined behavior like this, or tell
developers that “you can’t modify the database while you’re iterating it”.
I can’t be the first person to run into this. Is there a best practice for
enabling concurrent iteration and mutation? I can think of two solutions:
A. Batch up all of the query results in memory at the start of the iteration,
and have the iterator just read them out of the in-memory list.
I’d like to avoid this because of the obvious memory overhead and
latency imposed on large queries. Version 1 of our library worked this way,
which is why I probably hadn’t noticed the problem until now.
B. Create a separate SQLite connection for the query; then it’ll be isolated
from any changes being made in the main connection.
This seems elegant, but it will of course use more memory for the extra
connection (with its own cache.) Moreover, it seems like I’ll need to open an
indefinite number of extra connections: if the caller starts a query, makes
some changes, and then starts another query (before reading the final row of
the first query), I need to open another connection for the second query
because it has to see the changes, which aren’t yet visible in the first
query's connection … right?
—Jens
[1]: https://www.sqlite.org/isolation.html
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users