[sqlite] Understanding WALs

Paul Sanderson Tue, 10 Nov 2015 23:19:03 +0000

I am trying to get my head around the operation of WAL files and an
oddity that I can't explain.


As I understand it when a transaction is committed when a DB is in WAL
mode each changed page is appended to the WAL. Multiple transactions
work in the same way until a checkpoint occurs.

The WAL has a header which contains 2 salts. For each page (preceeded
by a header and collectively known as a frame) there is a copy of the
two salts from the WAL file header and two checksums which use the two
salts as seeds. For a page/frame in the WAL to be valid the salts in
the frame must match the salts in the file header and when the
checksum is recalulated on the content of the frame/page the checksums
must match/compute.

When a checkpoint occurs all the valid pages (using the rule above)
are written to the DB and salt1 in the WAL file header is incremented
and salt2 is randomised. Writes to the WAL then re-commence from the
begining. As the salts have been updated any pages in the WAL (which
is not truncated) will be invalid as neither of the salts and
therefore the checksums will match.

OK that seems sort of straight forward. So I created a new DB using
the command prompt and issued the following instructions:

pragma journal_mode = wal;
create table test (x int);
pragma wal_checkpoint;
insert into test values(1);
pragma wal_checkpoint

I then had a look at the WAL

The screenshot below shows the WAL file header and you can see the two
salts displayed

http://www.sandersonforensics.com/pics/wal_header.jpg

The screenshot at this link shows the first page in the WAL and you
can see that the two salts in this frame match the salts in the file
header (and that this is a commit frame - DB size in pages > 0).
Please take my word for it that the checksums also compute.

http://www.sandersonforensics.com/pics/wal_page2.jpg

As the last thing I did was a wal-checkpoint then according to the
documentation the page should have been written to the DB (it was see
pic below of page two of the DB) AND the salts incremented and
randomised (section 4.3 of the file format page).

http://www.sandersonforensics.com/pics/db_page2.jpg

For completness there is another page in the WAL, i.e. page 2 again
which in this case is blank and the salts to not match those in the
header and the checksums dont comput - this is what I expected above.

http://www.sandersonforensics.com/pics/db_invalid.jpg

As I said this is an academic question as the integrity of the
database has been maintained and the only (non)issue I see is that
instead of reading the latest page from the DB SQLite will read it
from the WAL - but as they are the same there is no issue.

I am trying to explain how WALs work I can speak with a bit more
authority and what I am seing does conflict with the explanation of
the working of WALs on sqlite.org. From a usability angle I can't see
an issue with this but as it came up while I was working through WAL
usage with a colleague I would like to be able to explain to him what
is going on.

Cheers
Paul


Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
Request a demo here :
http://sandersonforensics.com/forum/content.php?212-RequestDemo

[sqlite] Understanding WALs

Reply via email to