On Sunday, September 12, 2021 at 7:18:14 AM UTC-5 [email protected] wrote:

> Hi Henry, 
>
> A: Because it confuses the reader. 
> Q: Why? 
> A: No. 
> Q: Should I write my response above the quoted reply? 
>
> ..so please quote properly, as I'm doing in the rest of this mail: 
>
>
> On Sep 11 2021, Henry Wertz <[email protected]> wrote: 
> >> > What do you think about enabling WAL (Write Ahead Logging)? 
> >> > 
> >> [...] 
> >> > 
> >> > I didn't benchmark anything, but rsync'ing in small files is visibly 
> >> > faster, and the file system is better under load (i.e. I can copy 
> stuff 
> >> in 
> >> > and any simultaneous directory lookups, copying stuff out, etc. is 
> >> > noticeably faster and more responsive.) 
> >> 
> >> As I understand, WAL should not result in any speed-ups, it just 
> >> improves reliability in case of a crash. So I'd be very interested to 
> >> see actual benchmark data here rather than subjective impressions :-). 
> > 
> > I would think you're right -- you're writing your data into a log, then 
> > writing into the DB, that seems like it'd be slower. But... 
> > 
> > WAL docs (https://sqlite.org/wal.html) say "WAL is significantly faster 
> in 
> > most scenarios." (... to be fair they're comparing it to the regular 
> > journal mode, though, not journal_mode=off.) They say WAL provides more 
> > concurrency (readers and writers don't usually block each other... I'm 
> > [...] 
>
>
> That's the critical point. S3QL currently does not use a journal at 
> all. So enabling WAL just means that the data is written to the journal 
> before (just like now) the database is updated. So it's not clear to me 
> that this will result in a speedup. Note also that S3QL currently 
> disables fsync() calls on the journal. 
>
>
> Best, 
> -Nikolaus 
>
> -- 
> GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F 
>
> »Time flies like an arrow, fruit flies like a Banana.« 
>
Apologies!  I used to use Pine back in the day but got out of practice 8-).
Anyway, I ran some benchmarks -- I made an empty filesystem and copied a 
portion of my .cache/bazel into it, a mix of somewhat larger files and a 
few directories with loads of small files.  Then I deleted it, re-copied 
it, and re-deleted it.
journal_mode / synchronous:

OFF / OFF
  run 1: create 4:15  delete 0:57
  run 2: create 4:15  delete 0:58

WAL / OFF
  run 1: create 4:28  delete 1:03
  run 2: create 4:30  delete 0:59

WAL / NORMAL
  run 1: create 4:29  delete 0:59
  run 2: create 4:30  delete (skipped)

(I skipped the last delete since the WAL/synchronous=NORMAL and 
WAL/synchronous=OFF speeds seemed to be identical anyway.)
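For anyone wanting to reproduce these combinations, the pragmas can be set on an SQLite connection like this (a minimal Python sketch; the database path is just an example, not S3QL's actual metadata file):

```python
import sqlite3

def open_db(path, journal_mode, synchronous):
    """Open an SQLite database with a given journal_mode/synchronous
    combination (the three settings benchmarked above were OFF/OFF,
    WAL/OFF and WAL/NORMAL)."""
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns the mode actually in effect,
    # which may differ from what was requested (e.g. for :memory: DBs)
    mode = conn.execute(f'PRAGMA journal_mode={journal_mode}').fetchone()[0]
    conn.execute(f'PRAGMA synchronous={synchronous}')
    return conn, mode

conn, mode = open_db('metadata.db', 'WAL', 'NORMAL')
```

Note that journal_mode=WAL is persistent (stored in the database file), while synchronous must be set again on every new connection.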

I was surprised to see identical speeds with WAL/OFF and WAL/NORMAL.  
Interestingly, WAL with synchronous=NORMAL is slower than OFF/OFF, but only 
by about 5%, for write-intensive loads.  WAL+NORMAL is supposed to provide 
stronger guarantees of a consistent DB: if the system gets interrupted, on 
the next open SQLite can either complete or roll back the transactions in 
the WAL, making sure the DB is in a consistent state.  That said, I've used 
s3ql on a few USB drives (one with a habit of having the cable fall out now 
and then, plus a USB3 interface that had a habit of dropping off with older 
kernels).  The worst I had was running a sqlite .repair and fsck, and losing 
(unsurprisingly) the last several seconds of whatever I was copying in.  In 
other words, I've already found sqlite robust enough with the OFF/OFF 
settings, plus of course s3ql has the failsafe of all those metadata 
backups just in case.

I did notice that if I ran "find" on the test set, it took 0.9 seconds in 
WAL mode but 0.3 seconds in OFF mode -- the docs do note that WAL can be 
slower for read-intensive loads; for walking through a directory tree it's 
getting 1/3rd the speed!
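That read-side slowdown is easy to measure in isolation. Something like this Python walk (the mountpoint path is a placeholder) approximates what find does over the mounted filesystem:

```python
import os
import time

def time_walk(root):
    """Walk a directory tree the way `find` does and return
    (elapsed seconds, number of entries seen)."""
    t0 = time.monotonic()
    entries = 0
    for dirpath, dirnames, filenames in os.walk(root):
        entries += len(dirnames) + len(filenames)
    return time.monotonic() - t0, entries

# e.g.: elapsed, n = time_walk('/mnt/s3ql/testset')
```

Running this against the same test set under each journal_mode should show the same ~3x gap without rsync in the picture.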

I suspect that explains the "faster" rsync performance I observed -- I had 
one rsync copying stuff in and one walking through a directory tree copying 
stuff out.  If the latter was walking through the tree at 1/3rd the speed, 
I suppose the write-intensive rsync would proceed faster, despite total 
filesystem IOPS being lower.

I'll post back in a bit, as I've got another patch cooked up.  I decided to 
look into why, when running fsck, searching for temporary files was taking 
over 2 hours.  I found that on my s3ql-data I had 810,000 directories (the 
"100" through "999" directories, 2 layers deep), but only about 30,000 with 
data in them.  I went into the s3ql-data directory and (with s3ql 
unmounted) ran a "find -type d -exec rmdir {} \+" and let that run 
overnight; I imagine it took a while.  This cut the time for find to walk 
through from over 2 hours to about 10 minutes (and under a minute on a 
re-run -- I apparently got the directory count low enough that it fits in 
the directory entry cache).  s3ql currently creates these directories as 
needed, but does not remove them when empty; this patch adds that.
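The cleanup the patch does can be sketched roughly like this (a hypothetical helper, not the actual patch; `basedir` stands for the local backend's storage root):

```python
import os

def prune_empty_dirs(path, basedir):
    """After deleting an object file, remove the now-empty
    intermediate directories ("100".."999" style, 2 layers deep)
    up to, but not including, the storage root."""
    path = os.path.dirname(path)
    while path != basedir:
        try:
            os.rmdir(path)   # only succeeds if the directory is empty
        except OSError:
            break            # still holds other objects, stop here
        path = os.path.dirname(path)
```

Relying on rmdir failing for non-empty directories keeps this safe to call after every object deletion, since it never touches a directory that still holds data.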
