[Kernel-packages] [Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
(Or Ubuntu systems post-fix but with pools created while the bug was active - and is there a fix possible, or is it "make a new pool"? Is there a diagnostic possible to be sure either way?)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1906476

Title:
  PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed

Status in Native ZFS for Linux:
  New
Status in linux package in Ubuntu:
  Invalid
Status in ubuntu-release-upgrader package in Ubuntu:
  Confirmed
Status in zfs-linux package in Ubuntu:
  Fix Released
Status in linux source package in Impish:
  Fix Released
Status in ubuntu-release-upgrader source package in Impish:
  Confirmed
Status in zfs-linux source package in Impish:
  Fix Released

Bug description:
  Since today, while running Ubuntu 21.04 Hirsute, I started getting a ZFS panic in the kernel log, which was also hanging disk I/O for all Chrome/Electron apps.

  I have narrowed down a few important notes:
  - It does not happen with module version 0.8.4-1ubuntu11 built and included with 5.8.0-29-generic.

  - It was happening when using zfs-dkms 0.8.4-1ubuntu16 built with DKMS on the same kernel and also on 5.8.18-acso (a custom kernel).

  - For whatever reason multiple Chrome/Electron apps were affected, specifically Discord, Chrome and Mattermost. In all cases they seem hung trying to open files in their 'Cache' directory, e.g. ~/.cache/google-chrome/Default/Cache and ~/.config/Mattermost/Cache. (I was unable to strace the processes, so it was hard to confirm 100%; this is a deduction from /proc/PID/fd and the hanging ls.) While the issue was going on I could not list that directory either; "ls" would just hang.

  - Once I removed zfs-dkms to revert to the kernel built-in version it immediately worked without changing anything, removing files, etc.

  - It happened over multiple reboots and kernels every time; all my Chrome apps weren't working but for whatever reason nothing else seemed affected.

  - It would log a series of spl_panic dumps into kern.log that look like this:
  Dec 2 12:36:42 optane kernel: [ 72.857033] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
  Dec 2 12:36:42 optane kernel: [ 72.857036] PANIC at zfs_znode.c:335:zfs_znode_sa_init()

  I could only find one other Google reference to this issue, with 2 other users reporting the same error but on 20.04, here:
  https://github.com/openzfs/zfs/issues/10971

  - I was not experiencing the issue on 0.8.4-1ubuntu14 and am fairly sure it was working on 0.8.4-1ubuntu15 but broken after upgrade to 0.8.4-1ubuntu16. I will reinstall those zfs-dkms versions to verify that.

  There were a few originating call stacks but the first one I hit was

  Call Trace:
  dump_stack+0x74/0x95
  spl_dumpstack+0x29/0x2b [spl]
  spl_panic+0xd4/0xfc [spl]
  ? sa_cache_constructor+0x27/0x50 [zfs]
  ? _cond_resched+0x19/0x40
  ? mutex_lock+0x12/0x40
  ? dmu_buf_set_user_ie+0x54/0x80 [zfs]
  zfs_znode_sa_init+0xe0/0xf0 [zfs]
  zfs_znode_alloc+0x101/0x700 [zfs]
  ? arc_buf_fill+0x270/0xd30 [zfs]
  ? __cv_init+0x42/0x60 [spl]
  ? dnode_cons+0x28f/0x2a0 [zfs]
  ? _cond_resched+0x19/0x40
  ? _cond_resched+0x19/0x40
  ? mutex_lock+0x12/0x40
  ? aggsum_add+0x153/0x170 [zfs]
  ? spl_kmem_alloc_impl+0xd8/0x110 [spl]
  ? arc_space_consume+0x54/0xe0 [zfs]
  ? dbuf_read+0x4a0/0xb50 [zfs]
  ? _cond_resched+0x19/0x40
  ? mutex_lock+0x12/0x40
  ? dnode_rele_and_unlock+0x5a/0xc0 [zfs]
  ? _cond_resched+0x19/0x40
  ? mutex_lock+0x12/0x40
  ? dmu_object_info_from_dnode+0x84/0xb0 [zfs]
  zfs_zget+0x1c3/0x270 [zfs]
  ? dmu_buf_rele+0x3a/0x40 [zfs]
  zfs_dirent_lock+0x349/0x680 [zfs]
  zfs_dirlook+0x90/0x2a0 [zfs]
  ? zfs_zaccess+0x10c/0x480 [zfs]
  zfs_lookup+0x202/0x3b0 [zfs]
  zpl_lookup+0xca/0x1e0 [zfs]
  path_openat+0x6a2/0xfe0
  do_filp_open+0x9b/0x110
  ? __check_object_size+0xdb/0x1b0
  ? __alloc_fd+0x46/0x170
  do_sys_openat2+0x217/0x2d0
  ? do_sys_openat2+0x217/0x2d0
  do_sys_open+0x59/0x80
  __x64_sys_openat+0x20/0x30

To manage notifications about this bug go to:
https://bugs.launchpad.net/zfs/+bug/1906476/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
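The kern.log lines quoted in the bug description above follow a fixed "PANIC at file:line:function()" shape, so a log can be scanned for them mechanically. The following is only a minimal sketch under that assumption: the function name find_zfs_panics is hypothetical, and the check only tells you whether the panic was logged. It is not a diagnostic for whether a pool is affected.

```python
import re

# Matches the PANIC line format quoted in the bug description, e.g.
# "PANIC at zfs_znode.c:335:zfs_znode_sa_init()".
PANIC_RE = re.compile(
    r"PANIC at (?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)"
)


def find_zfs_panics(log_text):
    """Return a (file, line, function) tuple for each PANIC line found.

    Hypothetical helper: it only pattern-matches log text and does not
    inspect any pool or dataset.
    """
    return [
        (m.group("file"), int(m.group("line")), m.group("func"))
        for m in PANIC_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    # Sample lines copied from the bug description.
    sample = (
        "Dec 2 12:36:42 optane kernel: [ 72.857033] VERIFY(0 == "
        "sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, "
        "&zp->z_sa_hdl)) failed\n"
        "Dec 2 12:36:42 optane kernel: [ 72.857036] PANIC at "
        "zfs_znode.c:335:zfs_znode_sa_init()\n"
    )
    for hit in find_zfs_panics(sample):
        print(hit)
```

To use it on a real system, the text of kern.log (or the output of dmesg) would be passed to find_zfs_panics in place of the sample string.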
[Kernel-packages] [Bug 1861359] Re: swap storms kills interactive use
Reporter hasn't confirmed that it's corrected yet... "Fix committed" seems premature.

** Changed in: linux (Ubuntu Focal)
       Status: Fix Committed => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1861359

Title:
  swap storms kills interactive use

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Focal:
  Confirmed

Bug description:
  [Impact]

  High watermark boosting can cause large swap activity under certain memory-intensive workloads, making the system very unresponsive (screen does not refresh, keyboard not responding, etc.).

  This large swap activity seems to be prevented by disabling high watermark boosting.

  [Test case]

  Opening this web page in Chrome seems to be a good reproducer of the problem:
  https://platform.leolabs.space/visualizations/conjunction?type=conjunction&reportId=2004981040

  When this page is opened we can clearly see from 'top' (for example) that the used swap is going up very quickly.

  With the fix applied swap is not used at all and the system is always responsive.

  [Fix]

  Set vm.watermark_boost_factor to 0, disabling watermark boosting by default.

  [Regression potential]

  Regression potential is minimal: setting vm.watermark_boost_factor to 0 by default restores the old kernel behavior before watermark boosting was introduced. In case of unexpected regressions we can always fix this in user-space via sysctl.

  [Original report]

  Hello, several times since upgrading to focal from 19.04 I've found my computer entirely unresponsive for periods of twenty or thirty seconds. No mouse movement, no keyboard input, the screen output does not change.

  My computer was using swap space and despite very slow writeout speeds well below what the NVME drive can handle, the computer was unusable.

  I've captured some vmstat 1 output and top output that I started collecting during the event.
  (Normally one very long painful period is followed by several shorter periods of uselessness.)

  Thanks

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.4.0-12-generic 5.4.0-12.15
  ProcVersionSignature: Ubuntu 5.4.0-12.15-generic 5.4.8
  Uname: Linux 5.4.0-12-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu15
  Architecture: amd64
  Date: Wed Jan 29 23:44:05 2020
  ProcEnviron:
   TERM=rxvt-unicode-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-signed-5.4
  UpgradeStatus: Upgraded to focal on 2020-01-24 (5 days ago)
  ---
  ProblemType: Bug
  AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-12-generic.
  ApportVersion: 2.20.11-0ubuntu16
  Architecture: amd64
  AudioDevicesInUse:
   USER PID ACCESS COMMAND
   /dev/snd/controlC0: sarnold 2734 F pulseaudio
   /dev/snd/controlC1: sarnold 2734 F pulseaudio
  Card0.Amixer.info:
   Card hw:0 'PCH'/'HDA Intel PCH at 0x2fe1028000 irq 145'
     Mixer name : 'Realtek ALC285'
     Components : 'HDA:10ec0285,17aa225c,0012 HDA:8086280b,80860101,0010'
     Controls : 53
     Simple ctrls : 15
  Card1.Amixer.info:
   Card hw:1 'Audio'/'Generic ThinkPad Dock USB Audio at usb-:00:14.0-4.2.4, high speed'
     Mixer name : 'USB Mixer'
     Components : 'USB17ef:306f'
     Controls : 9
     Simple ctrls : 4
  DistroRelease: Ubuntu 20.04
  HibernationDevice: RESUME=none
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: LENOVO 20KHCTO1WW
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  ProcEnviron:
   TERM=rxvt-unicode-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 i915drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/BOOT/ubuntu@/vmlinuz-5.4.0-12-generic root=ZFS=rpool/ROOT/ubuntu ro root=ZFS=rpool/ROOT/ubuntu quiet splash acpi_osi=! "acpi_osi=Windows 2015" vt.handoff=1
  ProcVersionSignature: Ubuntu 5.4.0-12.15-generic 5.4.8
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-12-generic N/A
   linux-backports-modules-5.4.0-12-generic N/A
   linux-firmware 1.185
  Tags: focal
  Uname: Linux 5.4.0-12-generic x86_64
  UpgradeStatus: Upgraded to focal on 2020-01-24 (5 days ago)
  UserGroups: adm cdrom libvirt lpadmin plugdev sambashare sbuild sudo
  _MarkForUpload: True
  dmi.bios.date: 11/25/2019
  dmi.bios.vendor: LENOVO
  dmi.bios.version: N23ET69W (1.44 )
  dmi.board.asset.tag: Not Available
  dmi.board.name: 20KHCTO1WW
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40709 WIN
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: None
  dmi.modalias: dmi:bvnLENOVO:bvrN23ET69W(1.44):bd11/25/2019
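The [Fix] section of bug 1861359 above sets vm.watermark_boost_factor to 0. A reporter who wants to confirm which value a running kernel is using could read it back from procfs. This is a minimal sketch, assuming the sysctl is exposed at /proc/sys/vm/watermark_boost_factor; the helper names boosting_disabled and read_boost_factor are hypothetical and not part of any package mentioned in the bug.

```python
from pathlib import Path

# Assumed procfs location of the vm.watermark_boost_factor sysctl.
SYSCTL_PATH = Path("/proc/sys/vm/watermark_boost_factor")


def boosting_disabled(raw_value):
    """Return True when the sysctl text (as read from /proc) is 0."""
    return int(raw_value.strip()) == 0


def read_boost_factor(path=SYSCTL_PATH):
    """Return the raw sysctl text; raises OSError if the file is missing."""
    return path.read_text()


if __name__ == "__main__":
    try:
        raw = read_boost_factor()
    except OSError as exc:
        # Kernels without watermark boosting do not expose this file.
        print(f"could not read {SYSCTL_PATH}: {exc}")
    else:
        state = "disabled" if boosting_disabled(raw) else "enabled"
        print(f"vm.watermark_boost_factor = {raw.strip()} ({state})")
```

The script only reads the value. Changing it is the user-space sysctl route the [Regression potential] section mentions.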
[Kernel-packages] [Bug 1779736] Re: umask ignored on NFSv4.2 mounts
I can confirm that "zfs set acltype=posixacl foo/bar/" is an effective workaround. It appears to be unset by default.

  root@box /root# zfs set acltype=posixacl pool/srv/thing
  root@box /root# zfs get acltype pool/srv
  NAME      PROPERTY  VALUE  SOURCE
  pool/srv  acltype   off    default
  root@box /root# zfs get acltype pool/srv/thing
  NAME            PROPERTY  VALUE     SOURCE
  pool/srv/thing  acltype   posixacl  local

Thanks,
Quentin.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779736

Title:
  umask ignored on NFSv4.2 mounts

Status in linux package in Ubuntu:
  Confirmed
Status in nfs-utils package in Ubuntu:
  Confirmed

Bug description:
  After upgrading to kernel 4.15.0-24-generic (on Ubuntu 18.04 LTS), NFSv4.2 mounts ignore the umask when creating files and directories. Files get permissions 666 and directories get 777. Therefore, a umask of 000 is seemingly being forced when creating files/directories in NFS mounts. Mounting with noacl does not resolve the issue.

  How to replicate:

  1. Mount an NFS share (defaults to NFSv4.2)
  2. Ensure restrictive umask: umask 022
  3. Create directory: mkdir test_dir
  4. Create file: touch test_file
  5. List: ls -l

  The result will be:
  drwxrwxrwx 2 user user 2 Jul 2 12:16 test_dir
  -rw-rw-rw- 1 user user 0 Jul 2 12:16 test_file

  while the expected result would be
  drwxr-xr-x 2 user user 2 Jul 2 12:16 test_dir
  -rw-r--r-- 1 user user 0 Jul 2 12:16 test_file

  Bug does not occur when mounting with any of:
  vers=3
  vers=4.0
  vers=4.1

  I have a suspicion this is related to: https://tools.ietf.org/id/draft-ietf-nfsv4-umask-03.html
  But since the server does not have ACLs enabled, and mounting with noacl does not resolve the issue, this is unexpected behavior.

  Both server and client are running kernel 4.15.0-24-generic on Ubuntu 18.04 LTS.
  NFS package versions are:
  nfs-kernel-server 1:1.3.4-2.1ubuntu5
  nfs-common 1:1.3.4-2.1ubuntu5

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779736/+subscriptions
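The expected listing in bug 1779736 above follows from ordinary umask arithmetic: the mode requested by mkdir (0777) or touch (0666) is ANDed with the complement of the umask. The sketch below only works through that arithmetic for umask 022. It does not reproduce the NFS behaviour, and the helper name expected_mode is hypothetical.

```python
import stat


def expected_mode(requested, umask):
    """Permission bits left after the umask is applied (no default ACL)."""
    return requested & ~umask


if __name__ == "__main__":
    umask = 0o022

    dir_mode = expected_mode(0o777, umask)   # mkdir requests 0777
    file_mode = expected_mode(0o666, umask)  # touch requests 0666

    # stat.filemode renders a mode the way "ls -l" does.
    print(oct(dir_mode), stat.filemode(stat.S_IFDIR | dir_mode))
    print(oct(file_mode), stat.filemode(stat.S_IFREG | file_mode))

    # With the umask effectively forced to 000, as observed in the report:
    print(stat.filemode(stat.S_IFDIR | expected_mode(0o777, 0o000)))
    print(stat.filemode(stat.S_IFREG | expected_mode(0o666, 0o000)))
```

The first two lines printed correspond to the "expected result" listing in the report, and the last two to the listing actually observed.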