I have created a 100% reliable reproducer test case and also determined that the Ubuntu-specific patch 4701-enable-ARC-FILL-LOCKED-flag.patch, added to fix Bug #1900889, is likely the cause.
[Test Case]

The important parts are:
 - Use encryption
 - rsync the zfs git tree
 - Use parallel I/O from silversearcher-ag to access it after a reboot. A simple "find ." or "find . -exec cat {} > /dev/null \;" does not reproduce the issue.

Reproduction was done using a libvirt VM installed from the Ubuntu Impish daily livecd, using a normal ext4 root but with a second 4GB /dev/vdb disk for zfs later.

= Preparation
apt install silversearcher-ag git zfs-dkms zfsutils-linux
echo -n testkey2 > /root/testkey
git clone https://github.com/openzfs/zfs /root/zfs

= Test Execution
zpool create test /dev/vdb
zfs create test/test -o encryption=on -o keyformat=passphrase -o keylocation=file:///root/testkey
rsync -va --progress -HAX /root/zfs/ /test/test/zfs/
# If you access the data now it works fine.
reboot
zfs load-key test/test
zfs mount -a
cd /test/test/zfs/
ag DISKS=

= Test Result
ag hangs, "sudo dmesg" shows an exception

[Analysis]

I rebuilt the zfs-linux 2.0.6-1ubuntu1 package from ppa:colin-king/zfs-impish without the Ubuntu-specific patch ubuntu/4701-enable-ARC-FILL-LOCKED-flag.patch, which fixed Bug #1900889. With this patch disabled the issue does not reproduce. With the patch re-enabled it reproduces reliably every time again.

It seems this bug was never sent upstream. As far as I can see, no upstream code changes setting the flag ARC_FILL_IN_PLACE have been added since. Interestingly, however, the code for this ARC_FILL_IN_PLACE handling was added to fix a similar-sounding issue, "Raw receive fix and encrypted objset security fix", in https://github.com/openzfs/zfs/commit/69830602de2d836013a91bd42cc8d36bbebb3aae . This first shipped in zfs 0.8.0 and the original bug was filed against 0.8.3.
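For anyone wanting to repeat the rebuild described in the analysis above, here is a minimal sketch of one way to disable a single patch before building. It assumes the package uses a quilt-style debian/patches/series file and that the patch is listed there exactly as ubuntu/4701-enable-ARC-FILL-LOCKED-flag.patch; I have not confirmed either against the actual source package. The helper name disable_patch is hypothetical.

```shell
#!/bin/sh
# Sketch only: comment out one entry in a quilt-style series file so the
# patch is skipped when the source package is next built.
# Assumption: the series file lists one patch path per line, and lines
# starting with "#" are treated as comments.
# Usage: disable_patch <series-file> <patch-name>
disable_patch() {
    series=$1
    patch=$2
    tmp=$(mktemp) || return 1
    # Prefix only the exactly matching line with "# "; copy every other
    # line through unchanged.
    awk -v p="$patch" '$0 == p { print "# " $0; next } { print }' \
        "$series" > "$tmp" && cat "$tmp" > "$series"
    status=$?
    rm -f "$tmp"
    return $status
}
```

From the unpacked source tree this would be called as: disable_patch debian/patches/series ubuntu/4701-enable-ARC-FILL-LOCKED-flag.patch, followed by a normal package build (for example dpkg-buildpackage -us -uc -b). Commenting the entry out rather than deleting it makes it easy to re-enable the patch and confirm the issue comes back.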
I have also found the same issue as the original Launchpad bug reported upstream, without any fixes and with a lot of discussion (and quite a few duplicates linking back to 11679):
https://github.com/openzfs/zfs/issues/11679
https://github.com/openzfs/zfs/issues/12014

Without fully understanding the ZFS code in relation to this flag, the code at https://github.com/openzfs/zfs/blob/ce2bdcedf549b2d83ae9df23a3fa0188b33327b7/module/zfs/arc.c#L2026 talks about how this flag relates to decrypting blocks in the ARC and doing so 'inplace'. It therefore makes some sense that I need encryption to reproduce it, that it reproduces best after a reboot (which flushes the ARC), and that I can still read the data in the test case before the reboot but not after it.

This patch was added in 0.8.4-1ubuntu15 and I first experienced the issue somewhere between 0.8.4-1ubuntu11 and 0.8.4-1ubuntu16. So it all adds up, and I suggest that this patch be reverted.

--
https://bugs.launchpad.net/bugs/1906476

Title:
  PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed