"Deferred means that no locks are acquired on the database until the database is first accessed. Thus with a deferred transaction, the BEGIN statement itself does nothing to the filesystem. Locks are not acquired until the first read or write operation."
On Jul 18, 2017 6:10 AM, "Gwendal Roué" <gwendal.r...@gmail.com> wrote: > Hello all, > > The following sentence in https://www.sqlite.org/isolation.html does not > exactly describe the behavior of SQLite (since many versions): > > > In WAL mode, SQLite exhibits "snapshot isolation". When a read > transaction starts, that reader continues to see an unchanging "snapshot" > of the database file as it existed at the moment in time when the read > transaction started. Any write transactions that commit while the read > transaction is active are still invisible to the read transaction, because > the reader is seeing a snapshot of database file from a prior moment in > time. > > > I'll exhibit the inaccuracy below. > > Meanwhile, "snapshot isolation" is a tremendous feature of SQLite. Not > only does it grant any read-only connection that opens a deferred > transaction an immutable and consistent view of the database. But when you > can guarantee that there is a single writer connection, snapshot isolation > allows *precise scheduling*, such as blocking the writer connection until a > reader has established snapshot isolation. With such precision, one can > exactly control what's available to a reader, while not blocking the writer > longer than necessary. > > And this is where the documentation paragraph starts becoming inaccurate. > For the simplicity of the argument, I'll execute statements sequentially > from two connections W and R. To reproduce, just open two shells, and > execute statements in the following order: > > $ sqlite3 /tmp/snapshotisolation.sqlite > SQLite version 3.16.0 2016-11-04 19:09:39 > W> PRAGMA journal_mode=wal; > W> CREATE TABLE t(a); > R> BEGIN DEFERRED TRANSACTION; > W> INSERT INTO t DEFAULT VALUES; > R> SELECT COUNT(*) FROM t; > 1 > > This is unexpected. After connection R has started a deferred transaction, > it should continue to see an "unchanging snapshot of the database file as > it existed at the moment in time when the read transaction started". > Obviously, this is not the case, since the insertion performed by W is > visible from R even though it has been performed *after* R has started its > deferred transaction. The "Any write transactions that commit while the > read transaction is active are still invisible to the read transaction" is > also flat wrong here. > > If we continue, things behave as expected: > > W> INSERT INTO t DEFAULT VALUES; > R> SELECT COUNT(*) FROM t; > 1 > > R does not see the new insertion, which means that it indeed lives in an > unchanging snapshot. It just happens that the snapshot was not established > when the transaction has started, as documented, but *later*. But when? > > After a few experiments, it looks like the snapshot is established on the > first select: > > W> PRAGMA journal_mode=wal; > W> CREATE TABLE t(a); > R> BEGIN DEFERRED TRANSACTION; > R> SELECT * FROM sqlite_master LIMIT 1; -- good enough to start the > snapshot > W> INSERT INTO t DEFAULT VALUES; > R> SELECT COUNT(*) FROM t; > 0 -- as expected > > If now we stop entering commands by hand in the CLI, and start working > with threads, the *precise scheduling* I mention at the beginning of the > mail requires to block the writer connection W until the reader connection > R has started a deferred transaction, and sqlite3_step() has been any > executed once from any select statement. Only then can the writer > connection be released, with a absolute control of the content of the > reader's snapshot. > > I'm not sure the behavior I've just described can be called a bug. The > snapshot is unchanging indeed. When there are several processes connected > to the database, a reader can't know whether a change has been performed > before its deferred transaction has started, or between the start of the > deferred transaction and its first select statement, and therefore should > not care at all. With this interpretation, there is nothing wrong in the > current behavior of SQLite. > > However, when an application developper is sure that the database has a > single writer connection, the question of the content of the snapshot > suddenly becomes relevant. And the documentation becomes inaccurate. > > What do SQLite concurrency masters of this mailing list think? Besides > that threads are evil, I mean ;-) Shouldn't the documentation be updated? > > Cheers to all, > Gwendal > PS: For the record, I've been talking about "precise scheduling" and > "absolute control of the content of the snapshot" because they are the > necessary conditions for some database observation features such as > reloading fresh values from a request as soon as a transaction has modified > its content. See https://github.com/RxSwiftCommunity/RxGRDB for some > high-level implementations of such feature. > > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users