Removing all DKMS packages, or installing and compiling all of them, are
the only two viable options.

Incomplete installs, or a mix and match of both, can lead to issues.

More recently, ZFS upstream stopped shipping many small modules
precisely because of issues like these.

** Changed in: zfs-linux (Ubuntu)
     Assignee: Dimitri John Ledkov (xnox) => (unassigned)

** Changed in: zfs-linux (Ubuntu)
       Status: Triaged => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1920956

Title:
  ZFS module cannot be loaded after Kernel 5.4.0-67-generic release AND
  if the system has zfs-dkms and spl-dkms packages installed

Status in zfs-linux package in Ubuntu:
  Invalid

Bug description:
  WARNING: The current bug will only occur under the following
  conditions:

  1- You have a system running the Bionic release (18.04). Other releases
     seem to be unaffected for now;
  2- The system has the 'spl-dkms' and 'zfs-dkms' packages installed. This
     causes both SPL and ZFS to be recompiled from source whenever a new
     kernel image is installed;
  3- You have not yet upgraded your kernel to either
     "linux-modules-5.4.0-67-generic" or "linux-modules-5.4.0-1039-aws". You
     are running an earlier kernel version (for example, 5.3.0-XXX or
     5.4.0-60) and need to upgrade your system to the latest 'kernel-image'
     release.

  ALTERNATIVELY:
  1- You are creating a new OS image based on Bionic that already runs
     kernel 5.4.0-67 or 5.4.0-1030-aws, and you decide to install 'spl-dkms'
     and 'zfs-dkms'.

  SCENARIO:
  We have systems running the Xenial, Bionic, and Focal Fossa releases. We
  also have 'Unattended Upgrades' enabled, and we use ZFS.

  Every time Ubuntu releases kernel updates, they are installed on our
  systems, and both the new kernel and its modules become active on the
  next reboot. If you run your workload on AWS and use Canonical images,
  this also happens on fresh installs of the image tagged
  'ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20210224' in
  the AWS Ireland / eu-west-1 Region (AMI ID = ami-0d75330b9efa7072d).

  Some days ago, Ubuntu released a new Linux kernel update for the 5.4.0
  series, and the problem occurs with both
  "linux-modules-5.4.0-67-generic" and "linux-modules-5.4.0-1039-aws"
  (this may also be true for the GKE, Azure, and other public cloud
  kernel releases, but I didn't test them).

  I have noticed that, both on EC2 instances rebooted after the kernel
  upgrade and when baking a new OS image based on the latest Canonical
  images, I was unable to get the ZFS module loaded. The error message
  is shown below:

  $ sudo modprobe zfs
  modprobe: ERROR: could not insert 'zfs': Invalid argument

  Checking 'dmesg' shows the errors below (they also appear at boot if
  you have 'zfs-load-module.service' active on your system):

  [   51.358828] icp: disagrees about version of symbol __kstat_create
  [   51.358830] icp: Unknown symbol __kstat_create (err -22)
  [   51.358925] icp: disagrees about version of symbol __kstat_install
  [   51.358927] icp: Unknown symbol __kstat_install (err -22)
  [   79.196638] icp: disagrees about version of symbol __kstat_delete
  [   79.196643] icp: Unknown symbol __kstat_delete (err -22)
  [   79.196687] icp: disagrees about version of symbol __kstat_create
  [   79.196689] icp: Unknown symbol __kstat_create (err -22)
  [   79.196765] icp: disagrees about version of symbol __kstat_install
  [   79.196766] icp: Unknown symbol __kstat_install (err -22)
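
  A minimal sketch for condensing log output like the above into one
  "module symbol" pair per mismatch. The function name symbol_mismatches
  is hypothetical (it is not part of any ZFS or kernel tooling), and the
  pattern assumes the "disagrees about version of symbol" wording shown
  in the log lines above.

```shell
# Read kernel log text on stdin and print each distinct
# "<module> <symbol>" pair that has a symbol-version mismatch.
# symbol_mismatches is a hypothetical helper name used for illustration.
symbol_mismatches() {
    # Keep only the "disagrees about version of symbol" lines,
    # drop the "[ timestamp ]" prefix, and print "module symbol".
    sed -n 's/^.*\] *\([^ :]*\): disagrees about version of symbol \([^ ]*\).*$/\1 \2/p' \
        | sort -u
}
```

  Usage (reading the kernel log may need root): dmesg | symbol_mismatches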

  On the Bionic release, both the ZFS and the SPL modules come with
  version '0.7.5'.

  $ dpkg -l |grep 0.7.5-1ubuntu16.9
  ii  libnvpair1linux  0.7.5-1ubuntu16.9  amd64  Solaris name-value library for Linux
  ii  libuutil1linux   0.7.5-1ubuntu16.9  amd64  Solaris userland utility library for Linux
  ii  libzfs2linux     0.7.5-1ubuntu16.9  amd64  OpenZFS filesystem library for Linux
  ii  libzpool2linux   0.7.5-1ubuntu16.9  amd64  OpenZFS pool library for Linux
  ii  zfs-dkms         0.7.5-1ubuntu16.9  all    OpenZFS filesystem kernel modules for Linux
  ii  zfs-initramfs    0.7.5-1ubuntu16.9  all    OpenZFS root filesystem capabilities for Linux - initramfs
  ii  zfs-zed          0.7.5-1ubuntu16.9  amd64  OpenZFS Event Daemon
  ii  zfsutils-linux   0.7.5-1ubuntu16.9  amd64  command-line tools to manage OpenZFS filesystems

  Systems running kernel 5.4.0 already have the modules compiled at
  version 0.8.1, although this can vary depending on the active kernel.

  (test) wbraga@bastion-host-i-07c42966be34bd44e:~$ modinfo zfs |head -2
  filename:       /lib/modules/5.3.0-1030-aws/kernel/zfs/zfs.ko
  version:        0.8.1-1ubuntu14.4
  (test) wbraga@bastion-host-i-07c42966be34bd44e:~$ modinfo spl |head -2
  filename:       /lib/modules/5.3.0-1030-aws/kernel/zfs/spl.ko
  version:        0.8.1-1ubuntu14.4
  (test) wbraga@bastion-host-i-07c42966be34bd44e:~$ uname -sr
  Linux 5.3.0-1030-aws
  (test) wbraga@bastion-host-i-07c42966be34bd44e:~$ 

  From what I understand, neither the SPL nor the ZFS DKMS scripts will
  replace the modules that already ship with 'linux-modules-5.3.0-XXXX'
  if those are equal to or newer than the compiled ones, as shown below:

  (...)
  zavl.ko:
  Running module version sanity check.
  Error! Module version 0.7.5-1ubuntu16.11 for zavl.ko
  is not newer than what is already found in kernel 5.4.0-64-generic (0.8.3-1ubuntu12.5).
  You may override by specifying --force.
  (...)
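
  The "is not newer than" decision above can be illustrated with a rough
  stand-in. This is not DKMS's own comparison code: it is a sketch that
  assumes a sort with version-sort support (-V, as in GNU coreutils), and
  is_newer is a hypothetical helper name.

```shell
# Succeed (exit 0) only if version $1 is strictly newer than version $2.
# Assumption: `sort -V` (version sort, GNU coreutils) is available.
# is_newer is a hypothetical helper, not part of DKMS.
is_newer() {
    # Equal versions are not "newer".
    [ "$1" != "$2" ] || return 1
    # After a version sort, the last line is the highest version.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}
```

  With the versions from the output above,
  "is_newer 0.7.5-1ubuntu16.11 0.8.3-1ubuntu12.5" fails, which matches
  DKMS declining to replace the in-kernel zavl.ko.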

  However, some of the modules will be installed in another path (in
  this case, I am showing the modules that were installed):

  (...)
  splat.ko:
  Running module version sanity check.
   - Original module
     - No original module exists within this kernel
   - Installation
     - Installing to /lib/modules/5.4.0-67-generic/updates/dkms/
  (...)
  zpios.ko:
  Running module version sanity check.
   - Original module
     - No original module exists within this kernel
   - Installation
     - Installing to /lib/modules/5.4.0-67-generic/updates/dkms/
  (...)
  icp.ko:
  Running module version sanity check.
   - Original module
   - Installation
     - Installing to /lib/modules/5.4.0-67-generic/updates/dkms/

  
  After these three modules are installed, I cannot load ZFS. I assume the
  issue lies in the duplicated 'icp.ko' module under the
  '/lib/modules/5.4.0-67-generic/updates/dkms/' directory.

  
  $ ls -l /lib/modules/5.4.0-67-generic/updates/dkms/
  total 632
  -rw-r--r-- 1 root root 317744 Mar 23 14:55 icp.ko
  -rw-r--r-- 1 root root 289680 Mar 23 14:52 splat.ko
  -rw-r--r-- 1 root root  36848 Mar 23 14:55 zpios.ko
  $
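
  One way to check for a duplicated module such as 'icp.ko' is to list
  every module name that appears more than once under a kernel's module
  tree. This is a sketch only: find_duplicate_modules is a hypothetical
  helper name, and it assumes module files end in '.ko' (optionally
  followed by a compression suffix).

```shell
# Print the names of modules that exist more than once under the given
# module directory (for example once in kernel/zfs/ and once in
# updates/dkms/). find_duplicate_modules is a hypothetical helper name.
find_duplicate_modules() {
    moddir="$1"
    # List module files, reduce each path to its bare module name,
    # then keep only the names that occur at least twice.
    find "$moddir" -type f -name '*.ko*' 2>/dev/null \
        | sed -e 's|.*/||' -e 's|\.ko.*$||' \
        | sort | uniq -d
}
```

  Usage: find_duplicate_modules "/lib/modules/$(uname -r)"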

  It appears that newer kernels (such as '5.4.0-67-generic') have ZFS
  version '0.8.3-1ubuntu12.6', while on the previous kernel release (such
  as 5.4.0-64-generic) the ZFS version is '0.8.3-1ubuntu12.5'. Something
  may have changed that now prevents modprobe from loading the module
  properly.

  A workaround for this issue is to remove both the 'spl-dkms' and
  'zfs-dkms' packages. apt/dpkg will remove the compiled modules, and the
  ZFS module can then be loaded again.

  
  STEPS TO REPRODUCE THE ISSUE:

  1- Fire up a Bionic machine. You can test it on Vagrant, for example.
     Ensure the kernel is one or two versions older than the affected release.
  2- Bump your system to kernel 5.4.0-60 (for example). You can achieve this
     by running:

  sudo apt install linux-image-5.4.0-60-generic \
    linux-headers-5.4.0-60-generic linux-modules-5.4.0-60-generic

  3- Reboot your system
  4- Install ZFS packages, including both DKMS.

  sudo apt install libnvpair1linux libuutil1linux libzfs2linux \
    libzpool2linux spl spl-dkms zfs-dkms zfs-initramfs zfs-zed \
    zfsutils-linux

  5- Install a new kernel release, such as 5.4.0-64. Check whether
  spl-dkms and zfs-dkms are triggered.

  sudo apt install linux-image-5.4.0-64-generic \
    linux-headers-5.4.0-64-generic linux-modules-5.4.0-64-generic

  6- Reboot (you will be in kernel 5.4.0-64)

  7- Check whether you can load zfs with 'sudo modprobe zfs'

  8- Now install kernel 5.4.0-67.

  sudo apt install linux-image-5.4.0-67-generic \
    linux-headers-5.4.0-67-generic linux-modules-5.4.0-67-generic

  9- Reboot your system. You will be in Kernel '5.4.0-67'. The ZFS
  module won't come up (check 'dmesg').
