What do you think about enabling WAL (Write Ahead Logging)?

I see it was disabled in 2010 for good reasons: the WAL file ballooned to 
multi-GB size, and s3qlcp was slow (not surprising if it was generating 
that much write traffic).  But the WAL ballooning behavior was apparently 
changed/fixed in sqlite3 3.11.x (about 6 years ago).  I applied the 
attached patch, which just switches database.py from the current settings 
of "PRAGMA journal_mode = OFF" and "PRAGMA synchronous = OFF" to the 
commented-out "PRAGMA journal_mode = WAL" and "PRAGMA synchronous = NORMAL".
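
For anyone who wants to try the two settings outside s3ql first, here is a 
minimal sketch.  It uses Python's standard sqlite3 module rather than the 
connection wrapper in database.py, and open_with_wal is a made-up helper 
name, not something from s3ql.

```python
import sqlite3


def open_with_wal(path):
    """Open the database at `path` with WAL journaling and synchronous=NORMAL.

    Hypothetical helper for experimentation; not part of s3ql.
    """
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns one row naming the mode now in effect,
    # so we can confirm that the switch to WAL actually happened.
    mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous = NORMAL")
    # Reading the pragma back yields an integer (0 = OFF, 1 = NORMAL, 2 = FULL).
    sync = conn.execute("PRAGMA synchronous").fetchone()[0]
    return conn, mode, sync
```

Checking the returned mode matters because SQLite reports the mode it 
ended up with instead of raising an error when it cannot switch.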

Could of course do a version check and only enable if you're on 3.11 or 
higher, if there is concern about older sqlite3 versions floating around.
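
A rough sketch of what such a version check could look like, assuming 
3.11.0 is the right cutoff (that number is the one mentioned above, not 
something tested against older releases).  choose_pragmas and 
MIN_WAL_VERSION are hypothetical names, and the version is read from the 
standard sqlite3 module, whereas s3ql would have to ask whatever SQLite 
binding it actually uses.

```python
import sqlite3

# Assumed cutoff: the release series where WAL ballooning was reportedly fixed.
MIN_WAL_VERSION = (3, 11, 0)


def choose_pragmas(version_info=sqlite3.sqlite_version_info):
    """Return the journaling pragmas to use for the given SQLite version tuple."""
    if tuple(version_info) >= MIN_WAL_VERSION:
        # Newer library: use the settings the patch enables.
        return ("PRAGMA synchronous = NORMAL", "PRAGMA journal_mode = WAL")
    # Older library: keep the settings s3ql currently uses.
    return ("PRAGMA synchronous = OFF", "PRAGMA journal_mode = OFF")
```

The default argument is the version of the SQLite library that Python is 
linked against at runtime, which is the version that matters here.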

I'm running the local:// backend on 3 USB hard disks with ext4 (these are 
3 separate s3ql file systems).  I'd guess that with local storage the 
database speed influences total speed more than it does when some remote 
S3 storage is backing things up.

I didn't benchmark anything, but rsync'ing in small files is visibly 
faster, and the file system behaves better under load (i.e. I can copy 
stuff in, and any simultaneous directory lookups, copying stuff out, etc. 
are noticeably faster and more responsive).  I think the writeback tasks 
are finishing faster too (not really for 10MB blocks, which are probably 
dominated by compression time, but for smaller files and duplicates).

The WAL file doesn't balloon too much: I see it grow to 99MB and stop 
there.  I did put one FS under enough stress that Linux's write cache 
started building up because the disk was not keeping up, and at some point 
the WAL increased to 160MB.  So I guess it grows a little over 99MB under 
heavy load, but not by some crazy amount.  The 99MB seems fairly constant 
(as opposed to being based on DB size): one of my s3ql systems has a 1.8GB 
DB, one is about 750MB, and both have 99MB WAL files after they've been 
mounted for a while.

s3qlcp on a directory with about 3000 files was a bit slow (but I don't 
have anything to compare it to).  s3qlcp on a directory with about 300GB 
of VMs in it still took some seconds, but was a lot faster than before. 
The WAL didn't go past 99MB in either s3qlcp test.

As umount.s3ql does its thing, the WAL momentarily grows to about the same 
size as the DB.  Just after the WAL reaches peak size I assume the DB is 
closed, and the WAL is gone a second or two later.  (It did this on all 3 
disks I have with s3ql local:// over ext4 on them.)
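
To watch the WAL size over time rather than eyeballing it, something like 
the following would do.  SQLite keeps the WAL next to the database file 
under the same name with a "-wal" suffix; wal_size_mb is a hypothetical 
helper, not part of s3ql.

```python
import os


def wal_size_mb(db_path):
    """Return the size in MB of the WAL file for db_path (0.0 if there is none)."""
    wal_path = db_path + "-wal"  # SQLite names the WAL file <database>-wal
    try:
        return os.path.getsize(wal_path) / (1024 * 1024)
    except FileNotFoundError:
        # No WAL file: the database is closed or not in WAL mode.
        return 0.0
```

Calling this periodically on the cached metadata database while the file 
system is mounted would show whether the size plateaus the way described 
above.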

Thanks!
--Henry

Description: <short summary of the patch>
 TODO: Put a short summary on the line above and replace this paragraph
 with a longer explanation of this change. Complete the meta-information
 with other relevant fields (see below for details). To make it easier, the
 information below has been extracted from the changelog. Adjust it or drop
 it.
 .
 s3ql (3.7.0+dfsg-2build1) hirsute; urgency=medium
 .
   * No change rebuild with fixed ownership.
Author: Dimitri John Ledkov <[email protected]>

---
The information above should follow the Patch Tagging Guidelines, please
checkout http://dep.debian.net/deps/dep3/ to learn about the format. Here
are templates for supplementary fields that you might want to add:

Origin: <vendor|upstream|other>, <url of original patch>
Bug: <url in upstream bugtracker>
Bug-Debian: https://bugs.debian.org/<bugnumber>
Bug-Ubuntu: https://launchpad.net/bugs/<bugnumber>
Forwarded: <no|not-needed|url proving that it has been forwarded>
Reviewed-By: <name and email of someone who approved the patch>
Last-Update: 2021-05-07

--- s3ql-3.7.2.orig/src/s3ql/database.py
+++ s3ql-3.7.2/src/s3ql/database.py
@@ -30,10 +30,13 @@ initsql = (
            # However, if we start using it we must initiaze it *before* setting
            # locking_mode to EXCLUSIVE, otherwise we can't switch the locking
            # mode without first disabling WAL.
-           'PRAGMA synchronous = OFF',
-           'PRAGMA journal_mode = OFF',
-           #'PRAGMA synchronous = NORMAL',
-           #'PRAGMA journal_mode = WAL',
+           #
+           #This works fine, as of sqlite 3.11.x which was quite a while ago
+           #(it's up to 3.31.x at the moment).  -- HTW
+           #'PRAGMA synchronous = OFF',
+           #'PRAGMA journal_mode = OFF',
+           'PRAGMA synchronous = NORMAL',
+           'PRAGMA journal_mode = WAL',
 
            'PRAGMA foreign_keys = OFF',
            'PRAGMA locking_mode = EXCLUSIVE',
