[sqlite] Why is a separate journal file needed ?
On 29 February 2016 at 12:13, Igor Tandetnik wrote: > On 2/28/2016 9:19 PM, Rowan Worth wrote: > >> On 27 February 2016 at 00:02, Igor Tandetnik wrote: > > > > I simplified to make the main point stand out: it is not true that the >> >>> database file remains "pristine" while the transaction is in progress, >>> and >>> changes are written only to the journal file. Instead, to the first >>> approximation, the opposite is true. >>> >> >> I agree it is not true in general, but it is mostly accurate for small >> transactions >> > > No, it's never true (at least, not with the rollback journal - WAL is > different still). Changes are not written to the journal; original > unmodified pages are. > You are of course entirely correct that changes are not written to the journal. I only meant the statement "the database file remains 'pristine' while the transaction in progress" is true for a small transaction, right up until commit time. Apologies if this discussion seems pointless/obvious from your perspective. My intent was to clarify the operation of rollback mode, as the original simplification made it sound like the main database file and journal are *immediately* modified during an INSERT/UPDATE/etc, rather than this step being deferred until COMMIT/cache-spill. Not sure if I've succeded :) -Rowan
[sqlite] Why is a separate journal file needed ?
On 27 February 2016 at 00:02, Igor Tandetnik wrote: > On 2/26/2016 4:01 AM, Rowan Worth wrote: > >> In principle this is correct, but actually the database *file* is not >> immediately modified in rollback mode. Instead when a page is modified the >> original contents are saved to the rollback journal, and the page is >> updated *in memory*. >> > > ... until such time as the cache needs to be spilled - then it's updated > in the database file. In fact, I'm pretty sure the rollback journal is not > created as long as all the changes are entirely in RAM. > The rollback journal is always created before the main database file is modified. This must happen for durability's sake - if the changes were only held in RAM then sqlite would have no way to recover a partially modified database caused by a power failure mid-transaction. I'm not sure whether modified pages are written to the journal file immediately or whether that also happens at COMMIT/cache-spill time. I simplified to make the main point stand out: it is not true that the > database file remains "pristine" while the transaction is in progress, and > changes are written only to the journal file. Instead, to the first > approximation, the opposite is true. > I agree it is not true in general, but it is mostly accurate for small transactions (where "small" means less than sqlite's memory cache). The default cache size is 2,000 pages, I'd imagine it depends a lot on the workload involved as to what percentage of transactions fit in the cache. -Rowan
[sqlite] Why is a separate journal file needed ?
On 2/28/2016 9:19 PM, Rowan Worth wrote: > On 27 February 2016 at 00:02, Igor Tandetnik wrote: > >> On 2/26/2016 4:01 AM, Rowan Worth wrote: >> >>> In principle this is correct, but actually the database *file* is not >>> immediately modified in rollback mode. Instead when a page is modified the >>> original contents are saved to the rollback journal, and the page is >>> updated *in memory*. >>> >> >> ... until such time as the cache needs to be spilled - then it's updated >> in the database file. In fact, I'm pretty sure the rollback journal is not >> created as long as all the changes are entirely in RAM. >> > > The rollback journal is always created before the main database file is > modified. This must happen for durability's sake - if the changes were only > held in RAM then sqlite would have no way to recover a partially modified > database caused by a power failure mid-transaction. By definition, as long as "all the changes are entirely in RAM", the main database file has not yet been modified. After all, if it has been modified, then some changes are already outside of RAM. In light of this rather obvious observation, I'm not quite sure what point you were trying to make with this paragraph. > > I simplified to make the main point stand out: it is not true that the >> database file remains "pristine" while the transaction is in progress, and >> changes are written only to the journal file. Instead, to the first >> approximation, the opposite is true. > > I agree it is not true in general, but it is mostly accurate for small > transactions No, it's never true (at least, not with the rollback journal - WAL is different still). Changes are not written to the journal; original unmodified pages are. -- Igor Tandetnik
[sqlite] Why is a separate journal file needed ?
On 24 February 2016 at 23:46, Igor Tandetnik wrote: > On 2/24/2016 10:08 AM, Stephen Chrzanowski wrote: > >> IMO, all that, plus the fact that you have an easy roll back mechanism. >> Anything that needs to be put in the database is external to the pristine >> database. Lock the database with a transaction, fill up the journal, the >> power goes out, your pristine database isn't touched. >> > > Rollback journal works the other way round. New data is written to > database file; previous content of overwritten pages is copied to journal > file. In principle this is correct, but actually the database *file* is not immediately modified in rollback mode. Instead when a page is modified the original contents are saved to the rollback journal, and the page is updated *in memory*. > Committing a transaction is fast - simply delete the journal. So this is not quite true. It's only[1] at commit time that the pages modified in memory are paged back to disk, and this concentration of I/O makes commit relatively expensive. But there is a reason for this - it improves concurrency. As you point out, the database file is not "pristine" after the first modified page is written to disk, until the journal file is deleted (which marks the end of the commit phase and finalises the transaction). Clearly other processes must be locked out of the database for this phase. By deferring modification on disk as long as possible, sqlite maximises the amount of time other processes can continue to read the database. [1] or if its memory cache is exceeded, sqlite will obtain the EXCLUSIVE lock and modify the database before commit time. See pragmas cache_size and cache_spill. -Rowan
[sqlite] Why is a separate journal file needed ?
On 24 February 2016 at 21:49, Richard Hipp wrote: > On 2/24/16, Simon Slavin wrote: > > Why can't the information which SQLite > > stores in a journal file be put in the database file ? > > Doing so would double the size of the database file. Every database > file would contain extra space (normally unused) set aside for the > journal. > Surely this extra space would be proportional to the size of the largest [recent] transaction, rather than the size of the database itself? To be specific I'm thinking of rollback journals, I don't have experience with WAL mode. -Rowan
[sqlite] Why is a separate journal file needed ?
On 2/26/2016 4:01 AM, Rowan Worth wrote: > On 24 February 2016 at 23:46, Igor Tandetnik wrote: > >> On 2/24/2016 10:08 AM, Stephen Chrzanowski wrote: >> >>> IMO, all that, plus the fact that you have an easy roll back mechanism. >>> Anything that needs to be put in the database is external to the pristine >>> database. Lock the database with a transaction, fill up the journal, the >>> power goes out, your pristine database isn't touched. >>> >> >> Rollback journal works the other way round. New data is written to >> database file; previous content of overwritten pages is copied to journal >> file. > > > In principle this is correct, but actually the database *file* is not > immediately modified in rollback mode. Instead when a page is modified the > original contents are saved to the rollback journal, and the page is > updated *in memory*. ... until such time as the cache needs to be spilled - then it's updated in the database file. In fact, I'm pretty sure the rollback journal is not created as long as all the changes are entirely in RAM. I simplified to make the main point stand out: it is not true that the database file remains "pristine" while the transaction is in progress, and changes are written only to the journal file. Instead, to the first approximation, the opposite is true. >> Committing a transaction is fast - simply delete the journal. > > > So this is not quite true. It's only[1] at commit time that the pages > modified in memory are paged back to disk, and this concentration of I/O > makes commit relatively expensive. Yes, I again simplified so as not to distract from the main point. Essentially, commit forces spilling the cache. -- Igor Tandetnik
[sqlite] Why is a separate journal file needed ?
I have just checked a twitter database from a library of test DBs - the DB is 88Kb and the associated WAL is 4012Kb similarly I have a Safari history.DB that is 294Kb and associated WAL that is 3974Kb. these are the bigger ones in my test library but they are real world databases Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite email from a work address for a fully functional demo licence On 26 February 2016 at 09:32, Paul Sanderson wrote: > WAL files can be many times bigger than a database - the default WAL > checkpoint size is when the WAL grows to > 1000 pages. You can get a > DB (for example) with 100 pages and a WAL of 1000 (with multiple > different copies of the same page). > Paul > www.sandersonforensics.com > skype: r3scue193 > twitter: @sandersonforens > Tel +44 (0)1326 572786 > http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit > -Forensic Toolkit for SQLite > email from a work address for a fully functional demo licence > > > On 26 February 2016 at 08:46, Rowan Worth wrote: >> On 24 February 2016 at 21:49, Richard Hipp wrote: >> >>> On 2/24/16, Simon Slavin wrote: >>> > Why can't the information which SQLite >>> > stores in a journal file be put in the database file ? >>> >>> Doing so would double the size of the database file. Every database >>> file would contain extra space (normally unused) set aside for the >>> journal. >>> >> >> Surely this extra space would be proportional to the size of the largest >> [recent] transaction, rather than the size of the database itself? To be >> specific I'm thinking of rollback journals, I don't have experience with >> WAL mode. >> >> -Rowan >> ___ >> sqlite-users mailing list >> sqlite-users at mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Why is a separate journal file needed ?
WAL files can be many times bigger than a database - the default WAL checkpoint size is when the WAL grows to > 1000 pages. You can get a DB (for example) with 100 pages and a WAL of 1000 (with multiple different copies of the same page). Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite email from a work address for a fully functional demo licence On 26 February 2016 at 08:46, Rowan Worth wrote: > On 24 February 2016 at 21:49, Richard Hipp wrote: > >> On 2/24/16, Simon Slavin wrote: >> > Why can't the information which SQLite >> > stores in a journal file be put in the database file ? >> >> Doing so would double the size of the database file. Every database >> file would contain extra space (normally unused) set aside for the >> journal. >> > > Surely this extra space would be proportional to the size of the largest > [recent] transaction, rather than the size of the database itself? To be > specific I'm thinking of rollback journals, I don't have experience with > WAL mode. > > -Rowan > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Why is a separate journal file needed ?
On Wed, 24 Feb 2016 12:53:48 + Simon Slavin wrote: > In case you're new to this group, please understand that I'm > experienced with SQLite and I understand how it works and how it uses > journal files. > > Can someone explain to me -- or point at a web page which does -- why > a separate journal file is needed. Why can't the information which > SQLite stores in a journal file be put in the database file ? Given > how SQLite structures its database files this would presumably mean > that the main database would have journal pages as well as table and > index pages. Or that there would be 'stable' table and index pages > and 'journal' table and index pages. There are two too bad side effects putting the journal file inside the database (at end, in the middle, at begining, no matter). a) If you add the journal at the end of the db file, and insert new data, it probably will be after the journal. When checkpointing or commiting the journal pages converts to empty pages, quickly filled with new data on next commit, but making internal data fragmentation worse. b) Modify fs i-nodes and directory db file information. These are delicate structures and, on a crash or a file system problem, there are bigger chances to corrupt them if you are continuosly modifying them. Where is the journal data if you or the fs can't find the end of the db file?. Better is modify i-nodes and directory metadata only when needed. Of course, b) is unlikely, but a) is a real pita. > Simon. --- --- Eduardo Morras
[sqlite] Why is a separate journal file needed ?
You would need some sort of allocation table for the journal (wal or rollback) and on a rollback commit or a wal checkpoint the allocation would need to be cleared (or maybe some "in use" bit cleared) to show that the pages were now free to be re-used. The additional data stored with rollback journals (page number before and checksum after) and wals (wal frame before) would also need to be implemented else where and thereofre add extra logic. It sounds doable but more complex. Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite email from a work address for a fully functional demo licence On 24 February 2016 at 15:46, Igor Tandetnik wrote: > On 2/24/2016 10:08 AM, Stephen Chrzanowski wrote: >> >> IMO, all that, plus the fact that you have an easy roll back mechanism. >> Anything that needs to be put in the database is external to the pristine >> database. Lock the database with a transaction, fill up the journal, the >> power goes out, your pristine database isn't touched. > > > Rollback journal works the other way round. New data is written to database > file; previous content of overwritten pages is copied to journal file. > Committing a transaction is fast - simply delete the journal. Rolling back > (e.g. after power failure) means copying original pages from the journal > back to the database file. > > In the world you describe - how would you implement committing a > transaction? At some point, something must write to the database file; at > that point, the file is no longer "pristine". What happens if power goes out > in the middle of this? > > Note also that a typical application commits much more often than it rolls > back, so the system should be designed to make the former fast; the latter > can be slow. > -- > Igor Tandetnik > > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Why is a separate journal file needed ?
On 2016/02/24 3:49 PM, Richard Hipp wrote: > On 2/24/16, Simon Slavin wrote: >> Why can't the information which SQLite >> stores in a journal file be put in the database file ? > Doing so would double the size of the database file. Every database > file would contain extra space (normally unused) set aside for the > journal. I suppose there is also the problem of recursion. Kinda like Java had where every string is an object, and every object has a name, which is a string, which is an object, which has a name, which is a string, which is an object... ad-infinitum so that they had put in a mechanism for not recursing objects for strings of object-names. If the journal file to a DB is a DB, which has a journal, which is a DB, which has a journal... etc. etc. Q: What does the "B" in "Benoit B. Mandelbrot" stand for? A: "Benoit B. Mandelbrot". :) Ryan
[sqlite] Why is a separate journal file needed ?
Good point! I can assume that the problem would be a sparsity of database file. If you mix normal pages and journal then database will be fragmented. You can't avoid it. Even if you start writing journal at the end of file, hoping that you can later truncate it at commit, there is no way to know how far, from the current end of file, should journal file start. At some point, if transaction is big enough, new allocated pages in database will hit the spot in file where journal pages are. 24 February 2016, 14:53:55, by "Simon Slavin" : > In case you're new to this group, please understand that I'm experienced > with SQLite and I understand how it works and how it uses journal files. > > Can someone explain to me -- or point at a web page which does -- why a > separate journal file is needed. Why can't the information which SQLite > stores in a journal file be put in the database file ? Given how SQLite > structures its database files this would presumably mean that the main > database would have journal pages as well as table and index pages. Or that > there would be 'stable' table and index pages and 'journal' table and index > pages. > > Simon. > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Why is a separate journal file needed ?
In case you're new to this group, please understand that I'm experienced with SQLite and I understand how it works and how it uses journal files. Can someone explain to me -- or point at a web page which does -- why a separate journal file is needed. Why can't the information which SQLite stores in a journal file be put in the database file ? Given how SQLite structures its database files this would presumably mean that the main database would have journal pages as well as table and index pages. Or that there would be 'stable' table and index pages and 'journal' table and index pages. Simon.
[sqlite] Why is a separate journal file needed ?
On 2/24/2016 10:08 AM, Stephen Chrzanowski wrote: > IMO, all that, plus the fact that you have an easy roll back mechanism. > Anything that needs to be put in the database is external to the pristine > database. Lock the database with a transaction, fill up the journal, the > power goes out, your pristine database isn't touched. Rollback journal works the other way round. New data is written to database file; previous content of overwritten pages is copied to journal file. Committing a transaction is fast - simply delete the journal. Rolling back (e.g. after power failure) means copying original pages from the journal back to the database file. In the world you describe - how would you implement committing a transaction? At some point, something must write to the database file; at that point, the file is no longer "pristine". What happens if power goes out in the middle of this? Note also that a typical application commits much more often than it rolls back, so the system should be designed to make the former fast; the latter can be slow. -- Igor Tandetnik
[sqlite] Why is a separate journal file needed ?
IMO, all that, plus the fact that you have an easy roll back mechanism. Anything that needs to be put in the database is external to the pristine database. Lock the database with a transaction, fill up the journal, the power goes out, your pristine database isn't touched. Just whatever happens in the journal might get damaged due to caching at the memory or HDD level. On Wed, Feb 24, 2016 at 8:50 AM, Paul wrote: > Good point! > > I can assume that the problem would be a sparsity of database file. If you > mix normal pages and journal then database will be fragmented. > > You can't avoid it. Even if you start writing journal at the end of file, > hoping that you can later truncate it at commit, there is no way to know > how far, from the current end of file, should journal file start. At some > point, if transaction is big enough, new allocated pages in database will > hit the spot in file where journal pages are. > > > 24 February 2016, 14:53:55, by "Simon Slavin" : > > > In case you're new to this group, please understand that I'm > experienced with SQLite and I understand how it works and how it uses > journal files. > > > > Can someone explain to me -- or point at a web page which does -- why a > separate journal file is needed. Why can't the information which SQLite > stores in a journal file be put in the database file ? Given how SQLite > structures its database files this would presumably mean that the main > database would have journal pages as well as table and index pages. Or > that there would be 'stable' table and index pages and 'journal' table and > index pages. > > > > Simon. > > ___ > > sqlite-users mailing list > > sqlite-users at mailinglists.sqlite.org > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >
[sqlite] Why is a separate journal file needed ?
On 2/24/16, Simon Slavin wrote: > Why can't the information which SQLite > stores in a journal file be put in the database file ? Doing so would double the size of the database file. Every database file would contain extra space (normally unused) set aside for the journal. -- D. Richard Hipp drh at sqlite.org