** Description changed:

+ [Impact]
+ 
+ With mmap()ed files on ext4 in data journaling mode, it is possible to
+ change the contents of a mapped page's buffers during their jbd2
+ transaction commit, as nothing currently blocks write access at that
+ time.
+ 
+ This might happen between the buffers' checksum calculation and the
+ actual write to the journal, so the (old) checksum is invalid for the
+ (new) data.
+ 
+ If the system crashes after that, but before the journal entry makes
+ it to the filesystem, journal replay on the next mount fails, and the
+ filesystem then requires fsck. (Apparently curtin might set up
+ /etc/fstab with passno=0, in which case manual intervention is needed.)
+ 
+     [39751.096455] EXT4-fs: Warning: mounting with data=journal disables delayed allocation and O_DIRECT support!
+     [39751.114435] JBD2: Invalid checksum recovering block 87305 in log
+     [39751.146133] JBD2: Invalid checksum recovering block 88039 in log
+     [39751.195950] JBD2: Invalid checksum recovering block 49633 in log
+     [39751.265158] JBD2: recovery failed
+     [39751.265163] EXT4-fs (vdc): error loading journal
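
Since the impact is worse when /etc/fstab carries passno=0, it may help to check which ext4 entries are in that state. The following is a minimal sketch, not part of the fix; the function name list_passno0_ext4 is hypothetical. It relies only on the fstab(5) layout, where the third field is the filesystem type and the sixth field is the fsck pass number (treated as 0 when absent).

```shell
# Hypothetical helper: list ext4 entries whose fsck pass number is 0.
# Usage: list_passno0_ext4 [FILE]   (FILE defaults to /etc/fstab)
list_passno0_ext4() {
    awk '
        $1 ~ /^#/ { next }   # skip comment lines
        NF < 3    { next }   # skip blank or incomplete lines
        # A missing sixth field counts as pass number 0.
        $3 == "ext4" && ($6 == "" || $6 == 0) { print $1, $2 }
    ' "${1:-/etc/fstab}"
}
```

Each output line gives the device and mount point of an ext4 filesystem that will not be checked automatically at boot.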
+ 
+ [Fix]
+ 
+ The fix is to write-protect the pages during journal transaction commit,
+ so that writes to mapped pages hit a page fault, then ext4's page_mkwrite
+ hook can block until the commit finishes and the buffers can be modified.
+ 
+ To do that, add jbd2 journal callbacks that filesystems can customize,
+ called before and after the critical region in transaction commit, and
+ have ext4 in data journaling mode use them to write-protect the pages
+ whose buffers are being committed (and handle cases that need pages
+ redirtied).
+ 
+ The changes are restricted to data journaling mode and the page_mkwrite
+ hook; other modes/paths keep the same code and behavior in the callbacks.
+ 
+ [Test Case]
+ 
+ Set up an ext4 filesystem in data journaling mode and run stress-ng's
+ mmap file test on it, then crash the system after a while. Check whether
+ the filesystem mounts again or fails with jbd2 checksum errors.
+ 
+     # mkfs.ext4 $DEV
+     # mount -o data=journal $DEV $DIR
+     # cd $DIR
+     # stress-ng --mmap $((4*$(nproc))) --mmap-file &
+     # sleep 60
+     # echo c >/proc/sysrq-trigger
+     ...
+     # mount -o data=journal $DEV $DIR   # PASS/FAIL.
+     # dmesg | tail
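
The PASS/FAIL decision in the last two steps could be scripted along these lines. This is only a sketch: the function name journal_replay_verdict is hypothetical, and the patterns are taken from the log excerpt in [Impact], so other kernel versions may word the messages differently. A failing mount command remains the primary indicator.

```shell
# Hypothetical helper: read kernel log lines on stdin and print FAIL if
# they contain the jbd2 recovery errors shown in [Impact], else PASS.
journal_replay_verdict() {
    if grep -Eq 'JBD2: (Invalid checksum recovering block|recovery failed)|EXT4-fs \([^)]*\): error loading journal'; then
        echo FAIL
    else
        echo PASS
    fi
}

# Example (run after the mount attempt):
#   dmesg | tail -n 50 | journal_replay_verdict
```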
+ 
+ [Regression Potential]
+ 
+ Regressions would most likely manifest in ext4 data journaling mode
+ (which is not the default mode, 'ordered') with memory-mapped access,
+ as the other modes/paths keep the same behavior and are largely
+ unaffected by the changes.
+ 
+ This has been tested with (x)fstests, which showed no regressions in
+ data=ordered and data=journal on both Bionic and Focal (kernel
+ versions 4.15.0-156-generic and 5.4.0-84-generic) within 10 runs each,
+ and with the stress-ng test case as well. (Numbers/details are in the
+ LP bug.)
+ 
+ [Other info]
+ 
+ The patchset is applied in 5.10, so Hirsute (5.11) is already fixed;
+ only Focal and Bionic need it.
+ 
+ There are few changes in the patches between Focal and Bionic
+ (mostly minor backport adjustments, mainly because vm_fault_t is
+ missing), but unfortunately most patches still need separate versions.
+ 
+ ...
+ 
+ 
+ [Original Bug Description]
+ 
  [Impact]
  In the event of a loss of power, ext4 filesystems mounted w/
  data=journal,journal_checksum are subject to a corruption issue that
  requires a fsck to recover. This is exacerbated by installations by
  curtin that set passno=0 in /etc/fstab, preventing fsck from running
  automatically and thus requiring a manual recovery. And *that* is
  further exacerbated because initramfs-tools is smart enough to not
  include fsck.ext4 when passno=0 is detected in /etc/fstab, requiring
  the user to boot from recovery media.
  
  [Test Case]
  Forcibly power cycle a system running 'stress-ng --dir 0'. I've created
  a package to automate the reproduction:
  https://git.launchpad.net/~dannf/+git/dgx2-ext4-csum-repro?h=master
  
  [Fix]
  [Regression Risk]

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1847340

Title:
  ext4 journal recovery fails w/ data=journal + journal_checksum + mmap
