Walter, this is really beautiful, at least from a performance point of view. My benchmarks show:

rsync: 3m27s -> 1m20s
rm:    2m13s -> 9s

so the speed increase reported by the various WAPBL papers is really there, which is very nice.

Anyway, I'd also like to use this together with my implementation of checksumming for SR RAID1. I'm not as lucky as you: under some conditions my code performs really badly on writes. This comes from the design/layout of the checksumming, and even when I experiment with different layouts (or caching) writes still suffer. The only case where they don't is when the data are written in 32k multiples, aligned on a 32k boundary. On that path my implementation runs optimally: a write only has to overwrite the checksum block and write the data, and it does so without colliding with other I/O, so it is fast and quite acceptable compared to plain SR RAID1.

So here is my question: is there any way to convince the current WAPBL code to write transactions into the log in 32k blocks with 32k alignment? I can of course hack the code myself if you point me at the right place; I just haven't been able to find the magic constant for the commit size (or whatever it is) so far.

Thanks!

Karel

PS: for the curious, my benchmarking shows the numbers below. Benchmarked on 2x 500GB WD Re drives, 512-byte physical sector size, using my patch that speeds up the CRC32 calculation with the "by-four" version of the algorithm (360 -> 970 MB/s). This compares plain SR RAID1 with my checksumming SR RAID1 implementation on a 500GB fs (the whole SR RAID drive). rsync copies /usr/src into /raid (1.3GB of data; it's a git repo, so bigger than the usual CVS checkout), rm is just rm /raid/*, and find is:

find /raid -type f -exec cat {} >/dev/null \;

rsync:        3m27s -> 12m15s
rm:           2m13s -> 5m35s
find+cat:     1m58s -> 2m10s  (read benchmark, does not suck IMHO)
dd 2GB write: 16s -> 23s      (shows the potential of 32k-aligned writes; a very acceptable result for me)
dd 2GB read:  16s -> 21s