zfs-test test suite on xenial
-----------------------------

Build/install the latest zfs-test suite from zfs.git:

$ git clone https://github.com/zfsonlinux/zfs.git
$ cd zfs
$ git log --oneline -1
c81f179 Metaslab max_size should be persisted while unloaded

$ sudo apt-get build-dep zfs-linux
$ sudo apt-get install libssl-dev alien libpython3-dev python3-setuptools \
    python3-cffi

$ ./autogen.sh
$ ./configure
$ make -j$(nproc) pkg-utils
$ sudo dpkg -i zfs-test_0.8.0-170_amd64.deb

$ /usr/share/zfs/zfs-tests.sh  # hit kernel errors in the tests below,
                               # removed them from 'linux.run'

        mixed_create_failure
                [  390.511557] VERIFY(!RW_LOCK_HELD(&l->l_rwlock)) failed
                [  390.516303] PANIC at zap.c:395:zap_leaf_pageout()

        zfs_clone_deeply_nested << too.
                very long stack traces detected in scheduler.

        zfs_upgrade_007_neg << hung tasks.
                shows twice in the file?!
                [ 1715.988411] VERIFY3(newds == os->os_dsl_dataset) failed (ffff8800b4e2a000 == ffff880064b83000)
                [ 1715.995255] PANIC at dmu_objset.c:618:dmu_objset_refresh_ownership()

        zpool_create_024_pos
                [  572.926873] BUG: unable to handle kernel NULL pointer dereference at           (null)

        import_cachefile_device_added
                [  524.079638] PANIC: blkptr at ffff88009e9f8048 DVA 1 has invalid VDEV 1

        zpool_upgrade_007_pos (not zfs_upgrade_007_neg above)
                [  480.801356] VERIFY3(newds == os->os_dsl_dataset) failed (ffff880203a98000 == ffff880203f75000)
                [  480.808916] PANIC at dmu_objset.c:618:dmu_objset_refresh_ownership()

        enospc_003_pos
                [ 1201.448045] INFO: task txg_sync:1353 blocked for more than 120 seconds.
                [ 1201.456365]       Tainted: P           OE   4.4.0-159-generic #187-Ubuntu
                [ 1201.459650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
                [ 1201.464831] INFO: task file_write:1385 blocked for more than 120 seconds.
                [ 1201.468134]       Tainted: P           OE   4.4.0-159-generic #187-Ubuntu
                [ 1201.471387] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
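
The failure signatures quoted above could be pulled out of a saved kernel
log with a filter along these lines. This is only a sketch: the function
name scan_kernel_log is hypothetical, and the pattern list covers just the
message types shown above, not every way a test can fail.

```shell
# Hypothetical helper: print kernel-log lines that match the failure
# signatures seen above (VERIFY/VERIFY3, PANIC, BUG:, hung tasks).
# Reads the log from standard input; exits non-zero if nothing matches.
scan_kernel_log() {
  grep -E 'VERIFY3?\(|PANIC|BUG: |blocked for more than [0-9]+ seconds'
}

# Example: dmesg | scan_kernel_log
```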

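# Work on a copy of the stock runfile so the installed one stays intact.
RUN_FILE="/usr/share/zfs/runfiles/linux.run"
TMP_RUN_FILE="/tmp/$(basename "$RUN_FILE")"
cp "$RUN_FILE" "$TMP_RUN_FILE"

for TEST in \
  mixed_create_failure \
  zfs_clone_deeply_nested \
  zfs_upgrade_007_neg \
  zpool_create_024_pos \
  import_cachefile_device_added \
  zpool_upgrade_007_pos \
  enospc_003_pos
do
  # 1st expression: entry followed by a comma (not last in its list).
  # 2nd expression: entry that is last in its list, together with the
  # ", " separator before it if that is on the same line.
  sed \
    -e "s:'$TEST',::" \
    -e "s:\(, \)\?'$TEST'::" \
    -i "$TMP_RUN_FILE"
done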
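
As a possible follow-up to the loop above, the copied runfile can be checked
for leftover entries before it is used. This is a sketch: the helper name
check_runfile is hypothetical, and passing the runfile with -r assumes that
option is supported by the installed zfs-tests.sh.

```shell
# Hypothetical helper: report any excluded test still listed in a runfile.
# Usage: check_runfile RUNFILE TEST...
# Returns 0 if none of the named tests remain, 1 otherwise.
check_runfile() {
  runfile="$1"; shift
  rc=0
  for t in "$@"; do
    # Entries are single-quoted in the runfile, so match the quoted name.
    if grep -qF -- "'$t'" "$runfile"; then
      echo "still listed: $t"
      rc=1
    fi
  done
  return $rc
}

# Example (assumes /tmp/linux.run was produced by the loop above):
# check_runfile /tmp/linux.run mixed_create_failure enospc_003_pos \
#   && /usr/share/zfs/zfs-tests.sh -r /tmp/linux.run
```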

https://bugs.launchpad.net/bugs/1839521

Title:
  Xenial: ZFS deadlock in shrinker path with xattrs

Status in zfs-linux package in Ubuntu:
  Invalid
Status in zfs-linux source package in Xenial:
  In Progress
Status in zfs-linux source package in Bionic:
  Invalid
Status in zfs-linux source package in Disco:
  Invalid
Status in zfs-linux source package in Eoan:
  Invalid

Bug description:
  [Impact]

   * Xenial's ZFS can deadlock in the memory shrinker path
     after removing files with extended attributes (xattr).

   * Extended attributes are enabled by default, but are
     _not_ used by default, which reduces the likelihood.

   * The problem is very difficult/rare to reproduce, due
     to the file/xattr/remove/shrinker/LRU order and timing
     circumstances required (it took weeks for the user who
     reported it), but a synthetic test case has been found
     for testing.

  [Test Case]

   * A synthetic reproducer is available for this LP,
     with a few steps to touch/setfattr/rm/drop_caches
     plus a kernel module to massage the disposal list.

   * In the original ZFS module:
     the xattr dir inode is not purged immediately on
     file removal, but possibly purged _two_ shrinker
     invocations later.  This allows another thread,
     started before the file removal, to call zfs_zget()
     on the xattr child inode and iput() it, so that it
     makes it onto the same disposal list as the xattr
     dir inode.

   * In the modified ZFS module:
     the xattr dir inode is purged immediately on file
     removal, not possibly later on shrinker invocation,
     so the problem window above no longer exists.
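
The userspace steps named in the test case (touch/setfattr/rm/drop_caches)
might look roughly like the sketch below. It is not the reproducer attached
to the bug: the kernel module that massages the disposal list is not shown,
and the function names, the attribute name user.test and the file path are
assumptions made for illustration. With DRY_RUN=1 the commands are printed
instead of executed.

```shell
# Run a command, or just print it when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Hypothetical sketch of the userspace steps only.
# $1: path of a file on a ZFS dataset (assumed to use xattr dirs).
xattr_remove_steps() {
  file="$1"
  run touch "$file"
  # Setting an attribute is assumed to create the xattr dir and child.
  run setfattr -n user.test -v value "$file"
  run rm "$file"
  # Ask the kernel to reclaim dentries and inodes (needs root).
  run sh -c 'echo 2 > /proc/sys/vm/drop_caches'
}

# Example: DRY_RUN=1 xattr_remove_steps /tank/testfile
```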

  [Regression Potential]

   * Low. The patches are confined to extended attributes
     in ZFS, specifically node removal/purge, plus a change
     to how an xattr child inode tracks its xattr dir
     (parent) inode, so that it can be purged immediately
     on removal.

   * The ZFS test-suite has been run on original/modified
     zfs-dkms package/kernel modules, with no regressions.
