Re: [sqlite] Conflict between snapshots and checkpoints
On Mon, Sep 30, 2019 at 2:07 PM Keith Medcalf wrote: > On Monday, 30 September, 2019 02:06, Dominique Devienne < > ddevie...@gmail.com> wrote: > >On Sun, Sep 29, 2019 at 2:13 PM Keith Medcalf > wrote: > >> On Sunday, 29 September, 2019 01:28, Gwendal Roué < > gwendal.r...@gmail.com> wrote: > > >> >Those N reader connections allow concurrent database reads. Those > >> "reads" are generally wrapped in a deferred transaction which provides > >> snapshot isolation. > > >> No, it provides REPEATABLE-READ isolation. There is no actual > >> "snapshot" taken and no snapshot exists. > > > You are merely not seeing data written to the WAL transaction log at a > > point-in-time subsequent to the point in time at which you commenced > > the "repeatable-read". > > >I don't see where you are going with this Keith. > >Repeatable-reads "in the past" *are* snapshots IMHO. > > Generally no. "Snapshot Isolation" does not really exist, though some > RDBMS have created it to permit more opportunistic updates. Reads are at > the Repeatable-Read isolation level (both for WAL and DELETE journal > modes), and writes are Serialized. Snapshot Isolation is an invention of > the MVCC folks to theoretically permit greater update concurrency at the > expense of serializable isolation and introduces anomalies into the > database read and write processing that cannot occur when the updates are > serialized, and which generally requires the application programmer to take > extra steps to ensure database consistency. > > >The WAL file *does* contain enough information combined with the main > >database file pages, to logically *and* physically represent a "snapshot" > >of the DB at that point-in-time. > > For the purposes of reading only yes, I suppose you could call it a > "snapshot", except that it isn't. It is just a point-in-time > repeatable-read. You can only upgrade a transaction from read to write if > you are holding the "top" snapshot (that is, you must be seeing the entire > database, not a point-in-time version of it). > > >So not calling it a "snapshot" is a stretch at the very least. What is a > "snapshot" > >according to you, if that's not it? > > Snapshot Isolation is implemented by a bunch of different databases that > do not conform to the SQL Standard and it introduces anomalies into the > update process that cannot be introduced when using Serializable > Isolation. > > https://en.wikipedia.org/wiki/Isolation_(database_systems) > https://en.wikipedia.org/wiki/Snapshot_isolation > > >And also why do you think Richard, who knows a thing or two about > >databases, called these API *snaphot* then? > > Because it is a convenient descriptor, perhaps? They may in fact be > considered to be a snapshot of the database as it existed at some point in > the past (without full view of all committed transactions) however only the > "top" snapshot, the one that has a view of all committed transactions is > permitted to update/write to the database. > > >I'm genuinely curious here. I think I disagree with you, but most time I > >do, I'm wrong, so I'd like to understand, really. --DD > > The isolation is either repeatable-read for read transactions, or > serializable for writes. It is not Snapshot Isolation. > So although one may consider that what you are looking at is a "snapshot" > of the database that existed at a particular point-in-time, it should not > be confused with "snapshot isolation" which is an entirely different beast > altogether. > So I guess our main difference here, is that I have no qualms at all with point-in-time *read-only* repeatable-read transaction being called a "Snapshot". MVCC (which to me is synonymous with snapshots) is more about read-consistency across statements (i.e. a read transaction) that does *not* prevent writes. Using snapshots, I can parallelize access to several tables across connections (and threads), ensuring read-consistency in several separate transactions and connections. I used to do that in Oracle, and I'm glad that I can try to do it in SQLite too now. There are caveats of course, like controlling checkpointing, but I can live with that. --DD PS: Note that I never used "Snapshot *Isolation*" myself. For me, Snapshot = point-in-time read-consistency. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
On Monday, 30 September, 2019 02:06, Dominique Devienne wrote: >On Sun, Sep 29, 2019 at 2:13 PM Keith Medcalf wrote: >> On Sunday, 29 September, 2019 01:28, Gwendal Roué >> wrote: >> >Those N reader connections allow concurrent database reads. Those >> "reads" are generally wrapped in a deferred transaction which provides >> snapshot isolation. >> No, it provides REPEATABLE-READ isolation. There is no actual >> "snapshot" taken and no snapshot exists. > You are merely not seeing data written to the WAL transaction log at a > point-in-time subsequent to the point in time at which you commenced > the "repeatable-read". >I don't see where you are going with this Keith. >Repeatable-reads "in the past" *are* snapshots IMHO. Generally no. "Snapshot Isolation" does not really exist, though some RDBMS have created it to permit more opportunistic updates. Reads are at the Repeatable-Read isolation level (both for WAL and DELETE journal modes), and writes are Serialized. Snapshot Isolation is an invention of the MVCC folks to theoretically permit greater update concurrency at the expense of serializable isolation and introduces anomalies into the database read and write processing that cannot occur when the updates are serialized, and which generally requires the application programmer to take extra steps to ensure database consistency. >The WAL file *does* contain enough information combined with the main >database file pages, to logically *and* physically represent a "snapshot" >of the DB at that point-in-time. For the purposes of reading only yes, I suppose you could call it a "snapshot", except that it isn't. It is just a point-in-time repeatable-read. You can only upgrade a transaction from read to write if you are holding the "top" snapshot (that is, you must be seeing the entire database, not a point-in-time version of it). >So not calling it a "snapshot" is a stretch at the very least. What is a >"snapshot" >according to you, if that's not it? Snapshot Isolation is implemented by a bunch of different databases that do not conform to the SQL Standard and it introduces anomalies into the update process that cannot be introduced when using Serializable Isolation. https://en.wikipedia.org/wiki/Isolation_(database_systems) https://en.wikipedia.org/wiki/Snapshot_isolation >And also why do you think Richard, who knows a thing or two about >databases, called these API *snaphot* then? Because it is a convenient descriptor, perhaps? They may in fact be considered to be a snapshot of the database as it existed at some point in the past (without full view of all committed transactions) however only the "top" snapshot, the one that has a view of all committed transactions is permitted to update/write to the database. >I'm genuinely curious here. I think I disagree with you, but most time I >do, I'm wrong, so I'd like to understand, really. --DD The isolation is either repeatable-read for read transactions, or serializable for writes. It is not Snapshot Isolation. So although one may consider that what you are looking at is a "snapshot" of the database that existed at a particular point-in-time, it should not be confused with "snapshot isolation" which is an entirely different beast altogether. -- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
On 9/30/19 4:28 AM, Gwendal Roué wrote: > According to > http://dbmsmusings.blogspot.com/2019/06/correctness-anomalies-under.html, > SNAPSHOT ISOLATION is stronger than REPEATABLE READ, in that it prevents > "phantom reads" ( > http://dbmsmusings.blogspot.com/2019/05/introduction-to-transaction-isolation.html). > I think SQLite prevents phantom reads, and so we actually get actual > SNAPSHOT ISOLATION (as written in https://www.sqlite.org/isolation.html). Based on a bit of research with things like https://en.wikipedia.org/wiki/Snapshot_isolation Snapshot Isolation appears to be a Term of Art, not a Standardized Term so it can be forgiven if it isn't used in the documentation. Also, it seems to imply non-serialized writes, which SQLite does NOT provide, so isn't even really applicable. -- Richard Damon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
According to http://dbmsmusings.blogspot.com/2019/06/correctness-anomalies-under.html, SNAPSHOT ISOLATION is stronger than REPEATABLE READ, in that it prevents "phantom reads" ( http://dbmsmusings.blogspot.com/2019/05/introduction-to-transaction-isolation.html). I think SQLite prevents phantom reads, and so we actually get actual SNAPSHOT ISOLATION (as written in https://www.sqlite.org/isolation.html). On Mon, Sep 30, 2019 at 10:06 AM Dominique Devienne wrote: > On Sun, Sep 29, 2019 at 2:13 PM Keith Medcalf wrote: > > > On Sunday, 29 September, 2019 01:28, Gwendal Roué < > gwendal.r...@gmail.com> > > wrote: > > >Those N reader connections allow concurrent database reads. Those > "reads" > > are > > >generally wrapped in a deferred transaction which provides snapshot > > >isolation. > > > > No, it provides REPEATABLE-READ isolation. There is no actual "snapshot" > > taken and no snapshot exists. > > You are merely not seeing data written to the WAL transaction log at a > > point-in-time subsequent to the point in time at which you commenced the > > "repeatable-read". > > > > I don't see where you are going with this Keith. > Repeatable-reads "in the past" *are* snapshots IMHO. > > The WAL file *does* contain enough information combined with the main > database file pages, > to logically *and* physically represent a "snapshot" of the DB at that > point-in-time. So not calling > it a "snapshot" is a stretch at the very least. What is a "snapshot" > according to you, if that's not it? > > And also why do you think Richard, who knows a thing or two about > databases, called these API *snaphot* then? > > I'm genuinely curious here. I think I disagree with you, but most time I > do, I'm wrong, so I'd like to understand, really. --DD > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
On Sun, Sep 29, 2019 at 2:13 PM Keith Medcalf wrote: > On Sunday, 29 September, 2019 01:28, Gwendal Roué > wrote: > >Those N reader connections allow concurrent database reads. Those "reads" > are > >generally wrapped in a deferred transaction which provides snapshot > >isolation. > > No, it provides REPEATABLE-READ isolation. There is no actual "snapshot" > taken and no snapshot exists. You are merely not seeing data written to the WAL transaction log at a > point-in-time subsequent to the point in time at which you commenced the > "repeatable-read". > I don't see where you are going with this Keith. Repeatable-reads "in the past" *are* snapshots IMHO. The WAL file *does* contain enough information combined with the main database file pages, to logically *and* physically represent a "snapshot" of the DB at that point-in-time. So not calling it a "snapshot" is a stretch at the very least. What is a "snapshot" according to you, if that's not it? And also why do you think Richard, who knows a thing or two about databases, called these API *snaphot* then? I'm genuinely curious here. I think I disagree with you, but most time I do, I'm wrong, so I'd like to understand, really. --DD ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
Yes, this is exactly as you say. Isolation is very important, and I carefully make sure that it is guaranteed to users who need it. This has been working well for a few years now, GRDB.swift is very robust. My recent focus was on SQLite snapshots, and you and other members of this mailing list have been very helpful by completing the existing documentation :-) Gwendal On Sun, Sep 29, 2019 at 7:41 PM Richard Damon wrote: > On 9/29/19 11:40 AM, Gwendal Roué wrote: > > Thank you very much Keith. > > > > Apologies for my imprecise vocabulary, and the use of the same "snapshot" > > word with different meanings. > > > > I have used the term "snapshot isolation" as used in > > https://www.sqlite.org/isolation.html; But I'll remember about > > REPEATABLE-READ isolation. > > > > I also thank you very much for confirming that preventing checkpoints > > allows the WAL markers to remain valid. > > > > My goal is to provide a set of concurrency primitives that are useful for > > GUI app developers. I'm learning a lot about SQLite in the process, of > > course. > > And since you reused the connection (otherwise you just need to keep the > transaction open and the snapshot was preserved) you don't have isolation. > > Quote: > > > No Isolation Between Operations On The Same Database Connection > > SQLite provides isolation between operations in separate database > connections. However, there is no isolation between operations that > occur within the same database connection. > > > > -- > Richard Damon > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
On 9/29/19 11:40 AM, Gwendal Roué wrote: > Thank you very much Keith. > > Apologies for my imprecise vocabulary, and the use of the same "snapshot" > word with different meanings. > > I have used the term "snapshot isolation" as used in > https://www.sqlite.org/isolation.html; But I'll remember about > REPEATABLE-READ isolation. > > I also thank you very much for confirming that preventing checkpoints > allows the WAL markers to remain valid. > > My goal is to provide a set of concurrency primitives that are useful for > GUI app developers. I'm learning a lot about SQLite in the process, of > course. And since you reused the connection (otherwise you just need to keep the transaction open and the snapshot was preserved) you don't have isolation. Quote: No Isolation Between Operations On The Same Database Connection SQLite provides isolation between operations in separate database connections. However, there is no isolation between operations that occur within the same database connection. -- Richard Damon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
Thank you very much Keith. Apologies for my imprecise vocabulary, and the use of the same "snapshot" word with different meanings. I have used the term "snapshot isolation" as used in https://www.sqlite.org/isolation.html; But I'll remember about REPEATABLE-READ isolation. I also thank you very much for confirming that preventing checkpoints allows the WAL markers to remain valid. My goal is to provide a set of concurrency primitives that are useful for GUI app developers. I'm learning a lot about SQLite in the process, of course. On Sun, Sep 29, 2019 at 2:13 PM Keith Medcalf wrote: > > On Sunday, 29 September, 2019 01:28, Gwendal Roué > wrote: > > >But now I fail to understand the indented use case of sqlite3 snapshots.. > >Why allow to reuse snapshots with several calls to open()? Why do they > >exist at all, since we can already profit from snapshot isolation with > >one transaction (at the cost of keeping it open)? > > The sqlite3_snapshot_* interfaces allow you to record information about > transactions maintained by the WAL transaction manager. The interface DOES > NOT manage the WAL transaction manager nor does it manage your so-called > snapshots (which are not snapshots). It allows you to "get" information > about a particular point-in-time location in the WAL transaction log and > "goto" a specific point-in-time in the WAL transaction log, to "free" the > memory used to record this information, and to "cmp" which of two > structures point to a later point-in-time in the WAL transaction log. > > They do NOT modify or control the WAL transaction system or the WAL file. > You are responsible for doing this. > > https://sqlite.org/wal.html > > So, *IF* you want to maintain the availability of a specific point in time > in the WAL transaction history, they YOU must make sure that YOU do not > erase that particular point-in-time location by checkpointing the WAL > transaction log. You have the capability of doing this by disabling the > autocheckpoint mechanism and controlling when you do checkpoints. The WAL > transaction system neither knows nor cares that you happen to have recorded > some information about a particular point-in-time offset in the WAL file. > > It is sort of like "No Smoking" signs. The sign does not mean that you > cannot smoke. It means that someone posted a sign that says "No Smoking". > The fact that there is a sign and what it says is an entirely separate fact > that has no bearing at all on whether or not one may smoke, nor does the > sign somehow magically prevent smoking, nor will it prevent you from > catching on fire and smoking as a result thereof. It is merely a sign that > has something written on it which someone stuck up on the wall (or whatever > support structure it is stuck to). And what is written on the sign is > entirely independent of any other state of affairs. > > >For information, my quest for snapshot protection has met some success ( > >https://github.com/groue/GRDB.swift/pull/625) > > This appears to be updating a wrapper interface to SQLite3 so that if you > happen to have a point-in-time location recorded, then the automatic WAL > checkpoint system is disabled, plus doing some prevention to protect users > of the wrapper from modifying (checkpointing) the WAL transaction log if > they happen to have some recorded "points-in-time" that are dependent on > not modifying the WAL transaction log. (ie, things that you should be > doing yourself so as to not shoot yourself in the foot). > > >Given I control the connections to a given database file, I am able to > >open one writer connection, and, on demand, up to N read-only connections. > > >Those N reader connections allow concurrent database reads. Those "reads" > are > >generally wrapped in a deferred transaction which provides snapshot > >isolation. > > No, it provides REPEATABLE-READ isolation. There is no actual "snapshot" > taken and no snapshot exists. You are merely not seeing data written to > the WAL transaction log at a point-in-time subsequent to the point in time > at which you commenced the "repeatable-read". > > The sqlite3_snapshot_* interface merely provides a way to record > information about this repeatable-read point-in-time position in the WAL > file so that you can go back to that "repeatable-read" point-in-time > location sometime in the future IF IT STILL EXISTS in the WAL transaction > log. Since YOU are in control of whether or not the WAL file is > checkpointed, then YOU are in control of whether this point-in-time still > exists in the WAL transaction log. If you remove data from the WAL > transaction log by performing a checkpoint (which moves transactions into > the main database file) then this recorded "point-in-time" may no longer > exist in the WAL transaction log. > > >At the end of this transaction, the reader connection becomes > >available for another read. > > Yes. And by default when the transaction starts it gets a repeatable-read > isolation
Re: [sqlite] Conflict between snapshots and checkpoints
On Sunday, 29 September, 2019 01:28, Gwendal Roué wrote: >But now I fail to understand the indented use case of sqlite3 snapshots.. >Why allow to reuse snapshots with several calls to open()? Why do they >exist at all, since we can already profit from snapshot isolation with >one transaction (at the cost of keeping it open)? The sqlite3_snapshot_* interfaces allow you to record information about transactions maintained by the WAL transaction manager. The interface DOES NOT manage the WAL transaction manager nor does it manage your so-called snapshots (which are not snapshots). It allows you to "get" information about a particular point-in-time location in the WAL transaction log and "goto" a specific point-in-time in the WAL transaction log, to "free" the memory used to record this information, and to "cmp" which of two structures point to a later point-in-time in the WAL transaction log. They do NOT modify or control the WAL transaction system or the WAL file. You are responsible for doing this. https://sqlite.org/wal.html So, *IF* you want to maintain the availability of a specific point in time in the WAL transaction history, they YOU must make sure that YOU do not erase that particular point-in-time location by checkpointing the WAL transaction log. You have the capability of doing this by disabling the autocheckpoint mechanism and controlling when you do checkpoints. The WAL transaction system neither knows nor cares that you happen to have recorded some information about a particular point-in-time offset in the WAL file. It is sort of like "No Smoking" signs. The sign does not mean that you cannot smoke. It means that someone posted a sign that says "No Smoking". The fact that there is a sign and what it says is an entirely separate fact that has no bearing at all on whether or not one may smoke, nor does the sign somehow magically prevent smoking, nor will it prevent you from catching on fire and smoking as a result thereof. It is merely a sign that has something written on it which someone stuck up on the wall (or whatever support structure it is stuck to). And what is written on the sign is entirely independent of any other state of affairs. >For information, my quest for snapshot protection has met some success ( >https://github.com/groue/GRDB.swift/pull/625) This appears to be updating a wrapper interface to SQLite3 so that if you happen to have a point-in-time location recorded, then the automatic WAL checkpoint system is disabled, plus doing some prevention to protect users of the wrapper from modifying (checkpointing) the WAL transaction log if they happen to have some recorded "points-in-time" that are dependent on not modifying the WAL transaction log. (ie, things that you should be doing yourself so as to not shoot yourself in the foot). >Given I control the connections to a given database file, I am able to >open one writer connection, and, on demand, up to N read-only connections. >Those N reader connections allow concurrent database reads. Those "reads" are >generally wrapped in a deferred transaction which provides snapshot >isolation. No, it provides REPEATABLE-READ isolation. There is no actual "snapshot" taken and no snapshot exists. You are merely not seeing data written to the WAL transaction log at a point-in-time subsequent to the point in time at which you commenced the "repeatable-read". The sqlite3_snapshot_* interface merely provides a way to record information about this repeatable-read point-in-time position in the WAL file so that you can go back to that "repeatable-read" point-in-time location sometime in the future IF IT STILL EXISTS in the WAL transaction log. Since YOU are in control of whether or not the WAL file is checkpointed, then YOU are in control of whether this point-in-time still exists in the WAL transaction log. If you remove data from the WAL transaction log by performing a checkpoint (which moves transactions into the main database file) then this recorded "point-in-time" may no longer exist in the WAL transaction log. >At the end of this transaction, the reader connection becomes >available for another read. Yes. And by default when the transaction starts it gets a repeatable-read isolation "point-in-time" stamp corresponding to the last transaction written to the WAL transaction log which has been committed, or gets the current point-in-time representing the "current database" if there are no committed transactions in the WAL transaction log. >One can now create a "snapshot". Those snapshots use the same pool of N >readers: snapshot reads are then wrapped in a deferred transaction and >sqlite3_snapshot_open. At the end of this transaction, the reader >connection becomes available for another regular read or for another >snapshot read. There is the error. There is no "snapshot". You do not "create a snapshot". You record a point-in-time marker location within the WAL
Re: [sqlite] Conflict between snapshots and checkpoints
On 9/29/19 3:27 AM, Gwendal Roué wrote: > Thank you, Richard > > But now I fail to understand the indented use case of sqlite3 snapshots.. > Why allow to reuse snapshots with several calls to open()? Why do they > exist at all, since we can already profit from snapshot isolation with one > transaction (at the cost of keeping it open)? > > For information, my quest for snapshot protection has met some success ( > https://github.com/groue/GRDB.swift/pull/625) > > Given I control the connections to a given database file, I am able to open > one writer connection, and, on demand, up to N read-only connections. Those > N reader connections allow concurrent database reads. Those "reads" are > generally wrapped in a deferred transaction which provides snapshot > isolation. At the end of this transaction, the reader connection becomes > available for another read. > > One can now create a "snapshot". Those snapshots use the same pool of N > readers: snapshot reads are then wrapped in a deferred transaction and > sqlite3_snapshot_open. At the end of this transaction, the reader > connection becomes available for another regular read or for another > snapshot read. > > In order to protect unused snapshots, checkpoints are prevented as long as > there exists snapshots, with sqlite3_wal_hook(). > > I **really** hope this protects snapshots for good. Of course, a > confirmation from knowledgeable people would be appreciated :-) > > Gwendal I won;t say I know why they were put in, but here is one possibility that I can think of. First, your model assumes that you have total control and confidence over the entire application (and if you do, then you can protect the snapshots yourself). By passing the snapshot and not the connection with transaction with the snapshot, you can be sure that the receiver can't disturb the session and invalidate the snapshot. By opening the snapshot, the receiver is able to recreate the database at that point, to read its state, but can't accidentally invalidate it. -- Richard Damon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
Thank you, Richard But now I fail to understand the indented use case of sqlite3 snapshots.. Why allow to reuse snapshots with several calls to open()? Why do they exist at all, since we can already profit from snapshot isolation with one transaction (at the cost of keeping it open)? For information, my quest for snapshot protection has met some success ( https://github.com/groue/GRDB.swift/pull/625) Given I control the connections to a given database file, I am able to open one writer connection, and, on demand, up to N read-only connections. Those N reader connections allow concurrent database reads. Those "reads" are generally wrapped in a deferred transaction which provides snapshot isolation. At the end of this transaction, the reader connection becomes available for another read. One can now create a "snapshot". Those snapshots use the same pool of N readers: snapshot reads are then wrapped in a deferred transaction and sqlite3_snapshot_open. At the end of this transaction, the reader connection becomes available for another regular read or for another snapshot read. In order to protect unused snapshots, checkpoints are prevented as long as there exists snapshots, with sqlite3_wal_hook(). I **really** hope this protects snapshots for good. Of course, a confirmation from knowledgeable people would be appreciated :-) Gwendal On Sat, Sep 28, 2019 at 4:06 PM Richard Damon wrote: > On 9/26/19 12:13 PM, Gwendal Roué wrote: > > Hello, > > > > The documentation for sqlite3_snapshot_open() [1] says: > > > >> A call to sqlite3_snapshot_open() will fail to open if the specified > > snapshot has been overwritten by a checkpoint. > > > > And indeed I am able to create a snapshot with sqlite3_snapshot_get() > [2], > > then run checkpoints with sqlite3_wal_checkpoint_v2() [3], and finally > have > > sqlite3_snapshot_open() fail. > > > > I have a concern, because this turns automatic checkpointing into a > mortal > > danger for all snapshots. As soon as a checkpoint is performed, snapshots > > are invalidated, and whoever is relying on them has lost the database > state > > he's interested into. > > > > My question is: is it possible to prevent checkpoints from completing > > successfully when a snapshot is alive? > > > > I know this is possible with a "hand-made checkpoint", made of a distinct > > connection which has started a deferred transaction and has performed an > > initial read. Such hand-made checkpoint has SQLITE_CHECKPOINT_RESTART and > > SQLITE_CHECKPOINT_TRUNCATE fail, and are still able to access their > > precious database state after a SQLITE_CHECKPOINT_PASSIVE or a > > SQLITE_CHECKPOINT_FULL. > > > > The behavior of those "hand-made checkpoint" matches well the needs of > > users who want to use WAL without thinking too much about it: they do not > > disable automatic checkpointing, and are guaranteed with a stable access > to > > a given database state as long as they need it. > > > > I was wondering if such a behavior is possible with snapshots returned > from > > sqlite3_snapshot_get(). > > > > If it is not, then I humbly suggest that this feature would be added, and > > am willing to listen to the opinion of SQLite experts on this subject. > > > > Regards, > > Gwendal Roué > > > > [1] https://www.sqlite.org/c3ref/snapshot_open.html > > [2] https://www.sqlite.org/c3ref/snapshot_get.html > > [3] https://www.sqlite.org/c3ref/wal_checkpoint_v2.html > > I think you have a fundamental issue here. Snapshots are tied to a > Transaction, and as long as that transaction doesn't invalidate that > snapshot, nothing else can. (This is what you hand-mand checkpoint does > right). The issue is that if you are reusing the connection, you are > sharing the Transaction and thus not protecting your snapshot. Once > someone else using that connection breaks the transaction, then the > snapshot is no longer protected. > > The issue trying to increase the protection on a snapshot is that > currently sqlite_snapshot_free() isn't defined as protecting the > snapshot, but is merely a memory management tool, so many applications > likely don't free all their snapshots before they allow for the > possibility of them being invalidated by ending the Transaction, thus > this change would break many existing programs. I suppose it could be > implemented with a pragma or the like that doesn't allow that > transaction to be committed or invalidate the snapshot until the > snapshot is freed. > > What that would ultimately do is cause your program to get an error from > SQLite3 farther down the road when you did some action on the connection > that would potentially invalidate your precious snapshot, and you now > have one more source of 'unexplained' errors returns. > > -- > Richard Damon > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >
Re: [sqlite] Conflict between snapshots and checkpoints
On 9/26/19 12:13 PM, Gwendal Roué wrote: > Hello, > > The documentation for sqlite3_snapshot_open() [1] says: > >> A call to sqlite3_snapshot_open() will fail to open if the specified > snapshot has been overwritten by a checkpoint. > > And indeed I am able to create a snapshot with sqlite3_snapshot_get() [2], > then run checkpoints with sqlite3_wal_checkpoint_v2() [3], and finally have > sqlite3_snapshot_open() fail. > > I have a concern, because this turns automatic checkpointing into a mortal > danger for all snapshots. As soon as a checkpoint is performed, snapshots > are invalidated, and whoever is relying on them has lost the database state > he's interested into. > > My question is: is it possible to prevent checkpoints from completing > successfully when a snapshot is alive? > > I know this is possible with a "hand-made checkpoint", made of a distinct > connection which has started a deferred transaction and has performed an > initial read. Such hand-made checkpoint has SQLITE_CHECKPOINT_RESTART and > SQLITE_CHECKPOINT_TRUNCATE fail, and are still able to access their > precious database state after a SQLITE_CHECKPOINT_PASSIVE or a > SQLITE_CHECKPOINT_FULL. > > The behavior of those "hand-made checkpoint" matches well the needs of > users who want to use WAL without thinking too much about it: they do not > disable automatic checkpointing, and are guaranteed with a stable access to > a given database state as long as they need it. > > I was wondering if such a behavior is possible with snapshots returned from > sqlite3_snapshot_get(). > > If it is not, then I humbly suggest that this feature would be added, and > am willing to listen to the opinion of SQLite experts on this subject. > > Regards, > Gwendal Roué > > [1] https://www.sqlite.org/c3ref/snapshot_open.html > [2] https://www.sqlite.org/c3ref/snapshot_get.html > [3] https://www.sqlite.org/c3ref/wal_checkpoint_v2.html I think you have a fundamental issue here. Snapshots are tied to a Transaction, and as long as that transaction doesn't invalidate that snapshot, nothing else can. (This is what you hand-mand checkpoint does right). The issue is that if you are reusing the connection, you are sharing the Transaction and thus not protecting your snapshot. Once someone else using that connection breaks the transaction, then the snapshot is no longer protected. The issue trying to increase the protection on a snapshot is that currently sqlite_snapshot_free() isn't defined as protecting the snapshot, but is merely a memory management tool, so many applications likely don't free all their snapshots before they allow for the possibility of them being invalidated by ending the Transaction, thus this change would break many existing programs. I suppose it could be implemented with a pragma or the like that doesn't allow that transaction to be committed or invalidate the snapshot until the snapshot is freed. What that would ultimately do is cause your program to get an error from SQLite3 farther down the road when you did some action on the connection that would potentially invalidate your precious snapshot, and you now have one more source of 'unexplained' errors returns. -- Richard Damon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
Richard, Further experiments with snapshots reveal a behavior I did not expect. My high-level goal is to *protect snapshots*: if a user of my SQLite wrapper GRDB.swift takes a snapshot, I want this snapshot to remain valid (usable without error) for the whole duration of its lifetime. During its lifetime, a snapshot will open transactions when it is used. But there are moments when a snapshot exists (has not been freed by sqlite3_snapshot_free()), but is not currently used by a transaction. I thus have to carefully manage checkpoints, so that they do not invalidate existing snapshots, even if they are not currently used in a transaction. I did attempt at protecting existing snapshots by opening a transaction and calling sqlite3_snapshot_open with the oldest open snapshot before calling sqlite3_wal_checkpoint_v2(). But this does not prevent snapshots from invalidation: 1. Create a snapshot with sqlite3_snapshot_get() 2. In connection A, open a deferred transaction, open the snapshot 3. In connection B, run the checkpoint (this fails with SQLITE_BUSY when checkpoint is RESTART OR TRUNCATE, as expected) 4. Close the transaction in connection A 5. In connection A, open a deferred transaction, open the snapshot, read, commit: success 6. In connection B, perform a write, commit: success 7. In connection A, open a deferred transaction, open the snapshot: SQLITE_ERROR (1), "not an error" Is the error at step 7 expected? On Fri, Sep 27, 2019 at 12:35 PM Richard Hipp wrote: > On 9/26/19, Gwendal Roué wrote: > > > > My question is: is it possible to prevent checkpoints from completing > > successfully when a snapshot is alive? > > > > That depends on what you mean by "alive"? > > An sqlite3_snapshot_get() simply records some numbers in private > memory of the database connection that called sqlite3_snapshot_get(). > There is no way for another process to know that those numbers have > been recorded, and hence no way to know that the snapshot exists, and > no way to prevent a checkpoint from happening. > > On the other hand, if you have run sqlite3_snapshot_open() so that > there is a transaction open on the snapshot, that takes locks on the > database which prevent checkpoints from running to completion and > erasing the snapshot. > > -- > D. Richard Hipp > d...@sqlite.org > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
>> My question is: is it possible to prevent checkpoints from completing >> successfully when a snapshot is alive? > > That depends on what you mean by "alive"? Sorry, I meant "not finalized by sqlite3_snapshot_free". Your precise answer lifts any doubt: only transactions can prevent checkpoints from running to completion. Thank you very much Richard! Gwendal On Fri, Sep 27, 2019 at 12:35 PM Richard Hipp wrote: > On 9/26/19, Gwendal Roué wrote: > > > > My question is: is it possible to prevent checkpoints from completing > > successfully when a snapshot is alive? > > > > That depends on what you mean by "alive"? > > An sqlite3_snapshot_get() simply records some numbers in private > memory of the database connection that called sqlite3_snapshot_get(). > There is no way for another process to know that those numbers have > been recorded, and hence no way to know that the snapshot exists, and > no way to prevent a checkpoint from happening. > > On the other hand, if you have run sqlite3_snapshot_open() so that > there is a transaction open on the snapshot, that takes locks on the > database which prevent checkpoints from running to completion and > erasing the snapshot. > > -- > D. Richard Hipp > d...@sqlite.org > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
On 9/26/19, Gwendal Roué wrote: > > My question is: is it possible to prevent checkpoints from completing > successfully when a snapshot is alive? > That depends on what you mean by "alive"? An sqlite3_snapshot_get() simply records some numbers in private memory of the database connection that called sqlite3_snapshot_get(). There is no way for another process to know that those numbers have been recorded, and hence no way to know that the snapshot exists, and no way to prevent a checkpoint from happening. On the other hand, if you have run sqlite3_snapshot_open() so that there is a transaction open on the snapshot, that takes locks on the database which prevent checkpoints from running to completion and erasing the snapshot. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Conflict between snapshots and checkpoints
> I have a concern, because this turns automatic checkpointing into a mortal danger for all snapshots. This sentence may be false. Automatic snapshots are PASSIVE, and I'm not sure passive checkpoints invalidate snapshots returned by sqlite3_snapshot_get(). But i'm not sure, and I would appreciate a confirmation! Thanks is advance, Gwendal Roué On Thu, Sep 26, 2019 at 6:13 PM Gwendal Roué wrote: > Hello, > > The documentation for sqlite3_snapshot_open() [1] says: > > > A call to sqlite3_snapshot_open() will fail to open if the specified > snapshot has been overwritten by a checkpoint. > > And indeed I am able to create a snapshot with sqlite3_snapshot_get() [2], > then run checkpoints with sqlite3_wal_checkpoint_v2() [3], and finally have > sqlite3_snapshot_open() fail. > > I have a concern, because this turns automatic checkpointing into a mortal > danger for all snapshots. As soon as a checkpoint is performed, snapshots > are invalidated, and whoever is relying on them has lost the database state > he's interested into. > > My question is: is it possible to prevent checkpoints from completing > successfully when a snapshot is alive? > > I know this is possible with a "hand-made checkpoint", made of a distinct > connection which has started a deferred transaction and has performed an > initial read. Such hand-made checkpoint has SQLITE_CHECKPOINT_RESTART and > SQLITE_CHECKPOINT_TRUNCATE fail, and are still able to access their > precious database state after a SQLITE_CHECKPOINT_PASSIVE or a > SQLITE_CHECKPOINT_FULL. > > The behavior of those "hand-made checkpoint" matches well the needs of > users who want to use WAL without thinking too much about it: they do not > disable automatic checkpointing, and are guaranteed with a stable access to > a given database state as long as they need it. > > I was wondering if such a behavior is possible with snapshots returned > from sqlite3_snapshot_get(). > > If it is not, then I humbly suggest that this feature would be added, and > am willing to listen to the opinion of SQLite experts on this subject. > > Regards, > Gwendal Roué > > [1] https://www.sqlite.org/c3ref/snapshot_open.html > [2] https://www.sqlite.org/c3ref/snapshot_get.html > [3] https://www.sqlite.org/c3ref/wal_checkpoint_v2.html > > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Conflict between snapshots and checkpoints
Hello, The documentation for sqlite3_snapshot_open() [1] says: > A call to sqlite3_snapshot_open() will fail to open if the specified snapshot has been overwritten by a checkpoint. And indeed I am able to create a snapshot with sqlite3_snapshot_get() [2], then run checkpoints with sqlite3_wal_checkpoint_v2() [3], and finally have sqlite3_snapshot_open() fail. I have a concern, because this turns automatic checkpointing into a mortal danger for all snapshots. As soon as a checkpoint is performed, snapshots are invalidated, and whoever is relying on them has lost the database state he's interested into. My question is: is it possible to prevent checkpoints from completing successfully when a snapshot is alive? I know this is possible with a "hand-made checkpoint", made of a distinct connection which has started a deferred transaction and has performed an initial read. Such hand-made checkpoint has SQLITE_CHECKPOINT_RESTART and SQLITE_CHECKPOINT_TRUNCATE fail, and are still able to access their precious database state after a SQLITE_CHECKPOINT_PASSIVE or a SQLITE_CHECKPOINT_FULL. The behavior of those "hand-made checkpoint" matches well the needs of users who want to use WAL without thinking too much about it: they do not disable automatic checkpointing, and are guaranteed with a stable access to a given database state as long as they need it. I was wondering if such a behavior is possible with snapshots returned from sqlite3_snapshot_get(). If it is not, then I humbly suggest that this feature would be added, and am willing to listen to the opinion of SQLite experts on this subject. Regards, Gwendal Roué [1] https://www.sqlite.org/c3ref/snapshot_open.html [2] https://www.sqlite.org/c3ref/snapshot_get.html [3] https://www.sqlite.org/c3ref/wal_checkpoint_v2.html ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users