[Kernel-packages] [Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed

2021-10-27 Thread Mason Loring Bliss
(Or Ubuntu systems post-fix but with pools created while the bug was
active - and is there a fix possible, or is it "make a new pool"? Is
there a diagnostic possible to be sure either way?)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1906476

Title:
  PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 ==
  sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED,
  &zp->z_sa_hdl)) failed

Status in Native ZFS for Linux:
  New
Status in linux package in Ubuntu:
  Invalid
Status in ubuntu-release-upgrader package in Ubuntu:
  Confirmed
Status in zfs-linux package in Ubuntu:
  Fix Released
Status in linux source package in Impish:
  Fix Released
Status in ubuntu-release-upgrader source package in Impish:
  Confirmed
Status in zfs-linux source package in Impish:
  Fix Released

Bug description:
  Since today while running Ubuntu 21.04 Hirsute I started getting a ZFS
  panic in the kernel log which was also hanging Disk I/O for all
  Chrome/Electron Apps.

  I have narrowed down a few important notes:
  - It does not happen with module version 0.8.4-1ubuntu11 built and
  included with 5.8.0-29-generic

  - It was happening when using zfs-dkms 0.8.4-1ubuntu16 built with DKMS
  on the same kernel and also on 5.8.18-acso (a custom kernel).

  - For whatever reason multiple Chrome/Electron apps were affected,
  specifically Discord, Chrome and Mattermost. In all cases they seem
  to be hung trying to open files in their 'Cache' directory, e.g.
  ~/.cache/google-chrome/Default/Cache and ~/.config/Mattermost/Cache.
  I was unable to strace the processes, so this is hard to confirm
  100%; it is a deduction from /proc/PID/fd and the hanging ls. While
  the issue was going on I could not list that directory either; "ls"
  would just hang.

  - Once I removed zfs-dkms only to revert to the kernel built-in
  version it immediately worked without changing anything, removing
  files, etc.

  - It happened every time, across multiple reboots and kernels. None
  of my Chrome apps were working, but for whatever reason nothing else
  seemed affected.

  - It would log a series of spl_panic dumps into kern.log that look like this:
  Dec  2 12:36:42 optane kernel: [   72.857033] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
  Dec  2 12:36:42 optane kernel: [   72.857036] PANIC at zfs_znode.c:335:zfs_znode_sa_init()

  I could only find one other Google reference to this issue, with 2 other
  users reporting the same error but on 20.04, here:
  https://github.com/openzfs/zfs/issues/10971

  - I was not experiencing the issue on 0.8.4-1ubuntu14 and am fairly
  sure it was working on 0.8.4-1ubuntu15 but broke after the upgrade to
  0.8.4-1ubuntu16. I will reinstall those zfs-dkms versions to verify
  that.
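
To check quickly whether a given kern.log contains the spl_panic lines
quoted above, something like the following could be used. This is only a
sketch: the regular expression assumes the syslog prefix format shown in
this report, the helper name find_spl_panics is made up for
illustration, and the third sample line is invented filler.

```python
import re

# Matches the kernel-log prefix shown in the report, e.g.
# "Dec  2 12:36:42 optane kernel: [   72.857033] <message>"
KERN_LINE = re.compile(r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<msg>.*)$")


def find_spl_panics(lines):
    """Return (uptime_seconds, message) for each VERIFY/PANIC line."""
    hits = []
    for line in lines:
        m = KERN_LINE.search(line.rstrip("\n"))
        if not m:
            continue
        msg = m.group("msg")
        if msg.startswith("VERIFY(") or msg.startswith("PANIC at "):
            hits.append((float(m.group("uptime")), msg))
    return hits


# The first two lines are from the report; the third is invented filler.
SAMPLE = [
    "Dec  2 12:36:42 optane kernel: [   72.857033] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed",
    "Dec  2 12:36:42 optane kernel: [   72.857036] PANIC at zfs_znode.c:335:zfs_znode_sa_init()",
    "Dec  2 12:36:42 optane kernel: [   72.857040] some unrelated message",
]

for uptime, msg in find_spl_panics(SAMPLE):
    print(uptime, msg)
```

On a real system the same function could be fed an open file, e.g.
find_spl_panics(open("/var/log/kern.log")), assuming the log lives at
that path.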

  There were a few originating call stacks but the first one I hit was

  Call Trace:
   dump_stack+0x74/0x95
   spl_dumpstack+0x29/0x2b [spl]
   spl_panic+0xd4/0xfc [spl]
   ? sa_cache_constructor+0x27/0x50 [zfs]
   ? _cond_resched+0x19/0x40
   ? mutex_lock+0x12/0x40
   ? dmu_buf_set_user_ie+0x54/0x80 [zfs]
   zfs_znode_sa_init+0xe0/0xf0 [zfs]
   zfs_znode_alloc+0x101/0x700 [zfs]
   ? arc_buf_fill+0x270/0xd30 [zfs]
   ? __cv_init+0x42/0x60 [spl]
   ? dnode_cons+0x28f/0x2a0 [zfs]
   ? _cond_resched+0x19/0x40
   ? _cond_resched+0x19/0x40
   ? mutex_lock+0x12/0x40
   ? aggsum_add+0x153/0x170 [zfs]
   ? spl_kmem_alloc_impl+0xd8/0x110 [spl]
   ? arc_space_consume+0x54/0xe0 [zfs]
   ? dbuf_read+0x4a0/0xb50 [zfs]
   ? _cond_resched+0x19/0x40
   ? mutex_lock+0x12/0x40
   ? dnode_rele_and_unlock+0x5a/0xc0 [zfs]
   ? _cond_resched+0x19/0x40
   ? mutex_lock+0x12/0x40
   ? dmu_object_info_from_dnode+0x84/0xb0 [zfs]
   zfs_zget+0x1c3/0x270 [zfs]
   ? dmu_buf_rele+0x3a/0x40 [zfs]
   zfs_dirent_lock+0x349/0x680 [zfs]
   zfs_dirlook+0x90/0x2a0 [zfs]
   ? zfs_zaccess+0x10c/0x480 [zfs]
   zfs_lookup+0x202/0x3b0 [zfs]
   zpl_lookup+0xca/0x1e0 [zfs]
   path_openat+0x6a2/0xfe0
   do_filp_open+0x9b/0x110
   ? __check_object_size+0xdb/0x1b0
   ? __alloc_fd+0x46/0x170
   do_sys_openat2+0x217/0x2d0
   ? do_sys_openat2+0x217/0x2d0
   do_sys_open+0x59/0x80
   __x64_sys_openat+0x20/0x30
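
Frames prefixed with "?" in a trace like the one above are addresses
found on the stack that the unwinder could not confirm as real callers,
so the actual call chain is easier to read with them filtered out. A
minimal sketch follows; the regular expression and the helper name
reliable_frames are assumptions for illustration, and the sample is an
excerpt of the trace above.

```python
import re

# One frame looks like "? mutex_lock+0x12/0x40" or
# "zfs_zget+0x1c3/0x270 [zfs]". The optional "[module]" suffix names the
# kernel module; a leading "?" marks an unconfirmed frame.
FRAME = re.compile(
    r"^\s*(?P<guess>\?\s+)?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def reliable_frames(trace_text):
    """Return (function, module_or_None) for frames not marked with '?'."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME.match(line)
        if m and not m.group("guess"):
            frames.append((m.group("func"), m.group("module")))
    return frames


# Excerpt of the trace from this report.
SAMPLE_TRACE = """\
 spl_panic+0xd4/0xfc [spl]
 ? sa_cache_constructor+0x27/0x50 [zfs]
 zfs_znode_sa_init+0xe0/0xf0 [zfs]
 zfs_znode_alloc+0x101/0x700 [zfs]
 ? arc_buf_fill+0x270/0xd30 [zfs]
 zfs_zget+0x1c3/0x270 [zfs]
 path_openat+0x6a2/0xfe0
"""

for func, module in reliable_frames(SAMPLE_TRACE):
    print(func, module or "(core kernel)")
```

Applied to the full trace, this leaves the chain from __x64_sys_openat
through zpl_lookup, zfs_lookup, zfs_dirlook, zfs_dirent_lock and
zfs_zget down to zfs_znode_sa_init, which matches the report that the
hang occurs while opening files.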

To manage notifications about this bug go to:
https://bugs.launchpad.net/zfs/+bug/1906476/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1861359] Re: swap storms kills interactive use

2020-04-03 Thread Mason Loring Bliss
Reporter hasn't confirmed that it's corrected yet... "Fix committed"
seems premature.

** Changed in: linux (Ubuntu Focal)
   Status: Fix Committed => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1861359

Title:
  swap storms kills interactive use

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Focal:
  Confirmed

Bug description:
  [Impact]

  High watermark boosting can cause large swap activity under certain
  memory intensive workloads, making the system very unresponsive
  (screen does not refresh, keyboard not responding, etc.).

  This large swap activity seems to be prevented by disabling high
  watermark boosting.

  [Test case]

  Opening this web page in chrome seems to be a good reproducer of the
  problem:

  https://platform.leolabs.space/visualizations/conjunction?type=conjunction&reportId=2004981040

  When this page is opened we can clearly see from 'top' (for example)
  that the used swap is going up very quickly.

  With the fix applied swap is not used at all and the system is always
  responsive.

  [Fix]

  Set vm.watermark_boost_factor to 0, disabling watermark boosting by
  default.

  [Regression potential]

  Regression potential is minimal, setting vm.watermark_boost_factor to
  0 by default restores the old kernel behavior before watermark
  boosting was introduced. In case of unexpected regressions we can
  always fix this in user-space via sysctl.
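
To see whether a running system already has the setting described in
[Fix], the sysctl can be read from procfs. The following is a sketch
only: it assumes the kernel exposes the value at
/proc/sys/vm/watermark_boost_factor, and the helper names are made up
for illustration.

```python
from pathlib import Path

# Assumed procfs location of vm.watermark_boost_factor.
SYSCTL_PATH = Path("/proc/sys/vm/watermark_boost_factor")


def is_disabled_value(text):
    """True if the sysctl text (e.g. "0\\n") means boosting is off."""
    return int(text.strip()) == 0


def boosting_disabled(path=SYSCTL_PATH):
    """True/False from the live sysctl, or None if the file is absent
    (for example on a kernel without watermark boosting)."""
    try:
        return is_disabled_value(Path(path).read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print("watermark boosting disabled:", boosting_disabled())
```

If this reports False on an affected system, the user-space workaround
mentioned above would be to set vm.watermark_boost_factor to 0 via
sysctl.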

  [Original report]

  Hello, several times since upgrading to focal from 19.04 I've found my
  computer entirely unresponsive for periods of twenty or thirty
  seconds. No mouse movement, no keyboard input, the screen output does
  not change.

  My computer was using swap space and despite very slow writeout speeds
  well below what the NVME drive can handle, the computer was unusable.

  I've captured some vmstat 1 output and top output that I started
  collecting during the event. (Normally one very long painful period is
  followed by several shorter periods of uselessness.)

  Thanks

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.4.0-12-generic 5.4.0-12.15
  ProcVersionSignature: Ubuntu 5.4.0-12.15-generic 5.4.8
  Uname: Linux 5.4.0-12-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu15
  Architecture: amd64
  Date: Wed Jan 29 23:44:05 2020
  ProcEnviron:
   TERM=rxvt-unicode-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-signed-5.4
  UpgradeStatus: Upgraded to focal on 2020-01-24 (5 days ago)
  ---
  ProblemType: Bug
  AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-12-generic.
  ApportVersion: 2.20.11-0ubuntu16
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC0:  sarnold    2734 F pulseaudio
   /dev/snd/controlC1:  sarnold    2734 F pulseaudio
  Card0.Amixer.info:
   Card hw:0 'PCH'/'HDA Intel PCH at 0x2fe1028000 irq 145'
     Mixer name : 'Realtek ALC285'
     Components : 'HDA:10ec0285,17aa225c,0012 HDA:8086280b,80860101,0010'
     Controls  : 53
     Simple ctrls  : 15
  Card1.Amixer.info:
   Card hw:1 'Audio'/'Generic ThinkPad Dock USB Audio at usb-:00:14.0-4.2.4, high speed'
     Mixer name : 'USB Mixer'
     Components : 'USB17ef:306f'
     Controls  : 9
     Simple ctrls  : 4
  DistroRelease: Ubuntu 20.04
  HibernationDevice: RESUME=none
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: LENOVO 20KHCTO1WW
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  ProcEnviron:
   TERM=rxvt-unicode-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 i915drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/BOOT/ubuntu@/vmlinuz-5.4.0-12-generic root=ZFS=rpool/ROOT/ubuntu ro root=ZFS=rpool/ROOT/ubuntu quiet splash acpi_osi=! "acpi_osi=Windows 2015" vt.handoff=1
  ProcVersionSignature: Ubuntu 5.4.0-12.15-generic 5.4.8
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-12-generic N/A
   linux-backports-modules-5.4.0-12-generic  N/A
   linux-firmware                            1.185
  Tags:  focal
  Uname: Linux 5.4.0-12-generic x86_64
  UpgradeStatus: Upgraded to focal on 2020-01-24 (5 days ago)
  UserGroups: adm cdrom libvirt lpadmin plugdev sambashare sbuild sudo
  _MarkForUpload: True
  dmi.bios.date: 11/25/2019
  dmi.bios.vendor: LENOVO
  dmi.bios.version: N23ET69W (1.44 )
  dmi.board.asset.tag: Not Available
  dmi.board.name: 20KHCTO1WW
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40709 WIN
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: None
  dmi.modalias: dmi:bvnLENOVO:bvrN23ET69W(1.44):bd11/25/2019

[Kernel-packages] [Bug 1779736] Re: umask ignored on NFSv4.2 mounts

2019-03-07 Thread Mason Loring Bliss
I can confirm that "zfs set acltype=posixacl foo/bar/" is an effective
workaround. It appears to be unset by default.

root@box /root# zfs set acltype=posixacl pool/srv/thing
root@box /root# zfs get acltype pool/srv
NAME      PROPERTY  VALUE  SOURCE
pool/srv  acltype   off    default
root@box /root# zfs get acltype pool/srv/thing
NAME            PROPERTY  VALUE     SOURCE
pool/srv/thing  acltype   posixacl  local
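
For checking this across several datasets, the same query can be
scripted. A sketch, assuming the zfs CLI is installed and using its -H
(tab-separated, no header) and -o (field selection) options; the
function names are made up for illustration, and the sample rows
mirror the output above.

```python
import subprocess


def parse_zfs_get(output):
    """Parse `zfs get -H -o name,property,value,source` output, which is
    one tab-separated row per dataset."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        name, prop, value, source = line.split("\t")
        rows.append(
            {"name": name, "property": prop, "value": value, "source": source}
        )
    return rows


def get_acltype(dataset):
    """Query one dataset. Needs the zfs CLI; not exercised by the sample."""
    result = subprocess.run(
        ["zfs", "get", "-H", "-o", "name,property,value,source",
         "acltype", dataset],
        check=True, capture_output=True, text=True,
    )
    return parse_zfs_get(result.stdout)


# Tab-separated rows equivalent to the transcript above.
SAMPLE = (
    "pool/srv\tacltype\toff\tdefault\n"
    "pool/srv/thing\tacltype\tposixacl\tlocal\n"
)

for row in parse_zfs_get(SAMPLE):
    print(row["name"], row["value"], row["source"])
```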

Thanks, Quentin.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779736

Title:
  umask ignored on NFSv4.2 mounts

Status in linux package in Ubuntu:
  Confirmed
Status in nfs-utils package in Ubuntu:
  Confirmed

Bug description:
  After upgrading to kernel 4.15.0-24-generic (on Ubuntu 18.04 LTS)
  NFSv4.2 mounts ignore the umask when creating files and directories.
  Files get permissions 666 and directories get 777.  Therefore, a umask
  of 000 is seemingly being forced when creating files/directories in
  NFS mounts.  Mounting with noacl does not resolve the issue.

  How to replicate:

  1. Mount an NFS share (defaults to NFSv4.2)
  2. Ensure restrictive umask: umask 022
  3. Create directory: mkdir test_dir
  4. Create file: touch test_file
  5. List: ls -l

  The result will be:
  drwxrwxrwx 2 user user 2 Jul  2 12:16 test_dir
  -rw-rw-rw- 1 user user 0 Jul  2 12:16 test_file

  while the expected result would be
  drwxr-xr-x 2 user user 2 Jul  2 12:16 test_dir
  -rw-r--r-- 1 user user 0 Jul  2 12:16 test_file
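
The expected and observed listings above follow from plain umask
arithmetic: the kernel clears the umask bits from the mode the program
requests. The sketch below assumes the usual requests of 0777 for mkdir
and 0666 for touch; the helper name expected_mode is made up for
illustration.

```python
import stat


def expected_mode(requested, umask):
    """Permission bits left after the umask is applied to a create request."""
    return requested & ~umask & 0o777


UMASK = 0o022
dir_mode = expected_mode(0o777, UMASK)   # mkdir requests 0777
file_mode = expected_mode(0o666, UMASK)  # touch requests 0666

# With umask 022 honored (expected result):
print(stat.filemode(stat.S_IFDIR | dir_mode))    # drwxr-xr-x
print(stat.filemode(stat.S_IFREG | file_mode))   # -rw-r--r--

# With the umask effectively 000 (observed result):
print(stat.filemode(stat.S_IFDIR | expected_mode(0o777, 0)))  # drwxrwxrwx
print(stat.filemode(stat.S_IFREG | expected_mode(0o666, 0)))  # -rw-rw-rw-
```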

  Bug does not occur when mounting with any of:
    vers=3
    vers=4.0
    vers=4.1

  I have a suspicion this is related to:
  https://tools.ietf.org/id/draft-ietf-nfsv4-umask-03.html
  But since the server does not have ACLs enabled, and mounting with noacl
  does not resolve the issue, this is unexpected behavior.

  Both server and client are running kernel 4.15.0-24-generic on Ubuntu
  18.04 LTS.  NFS package versions are:

  nfs-kernel-server 1:1.3.4-2.1ubuntu5
  nfs-common 1:1.3.4-2.1ubuntu5

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779736/+subscriptions
