Hello Colin, nope.

It looks like ZFS has some sort of contention in waiting the ZIL to
finish the async callbacks from scheduled work, even when more spindles
exist. Not sure if its related to slab merging or getting the objects
from either slab/kmalloc or directly from page allocs. I think its a
locking contention due to zfs scalability iself, to be honest :\. I'm
getting, sometimes, soft lockups here and there (all waiting on on zfs
sync - txg_wait_synced). I'm using a raid-z with 4x480G SSDs and an
extra SSD as L2ARC disk.

Its fairly easy to reproduce, i'm basically compiling kernel tress all
day in this machine, using the zfs pools as ccache's cache and as the
compiled objects repository. I haven't explored tuning it though,
considering its cache is not kept together with the pagecache (like
changing zfs_dirty_data_sync or zfs_sync_taskq_batch_pct to make zfs
cache flush less async, which might help).

Cheers!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1654517

Title:
  ZFS I/O hangs for minutes

Status in Native ZFS for Linux:
  Fix Released
Status in zfs-linux package in Ubuntu:
  Incomplete

Bug description:
  I/O for multiple programs, like `thunderbird`, `firefox`, etc., hangs
  for minutes and approx. 100 `z_rd_int_[n]` and `z_wr_int_[n]` kernel
  threads are created, `dmesg` contains

      [ 9184.451606] INFO: task txg_sync:11471 blocked for more than 120 
seconds.
      [ 9184.451610]       Tainted: P           OE   4.8.0-32-generic #34-Ubuntu
      [ 9184.451611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
      [ 9184.451612] txg_sync        D ffffa240ab3a7aa8     0 11471      2 
0x00000000
      [ 9184.451616]  ffffa240ab3a7aa8 00ffffffbb6ade1f ffffa24095148000 
ffffa240e5ca5580
      [ 9184.451618]  0000000000000046 ffffa240ab3a8000 ffffa240ff359200 
7fffffffffffffff
      [ 9184.451620]  ffffa23d36cf9050 0000000000000001 ffffa240ab3a7ac0 
ffffffffbbe96b15
      [ 9184.451621] Call Trace:
      [ 9184.451627]  [<ffffffffbbe96b15>] schedule+0x35/0x80
      [ 9184.451628]  [<ffffffffbbe9a2ca>] schedule_timeout+0x22a/0x3f0
      [ 9184.451631]  [<ffffffffbb62c77e>] ? __switch_to+0x2ce/0x6c0
      [ 9184.451633]  [<ffffffffbb6c156c>] ? pick_next_task_fair+0x48c/0x4c0
      [ 9184.451635]  [<ffffffffbb6fc511>] ? ktime_get+0x41/0xb0
      [ 9184.451636]  [<ffffffffbbe96304>] io_schedule_timeout+0xa4/0x110
      [ 9184.451644]  [<ffffffffc0acee12>] cv_wait_common+0xb2/0x130 [spl]
      [ 9184.451646]  [<ffffffffbb6c7150>] ? wake_atomic_t_function+0x60/0x60
      [ 9184.451650]  [<ffffffffc0aceee8>] __cv_wait_io+0x18/0x20 [spl]
      [ 9184.451689]  [<ffffffffc0ebfcdd>] zio_wait+0xfd/0x1d0 [zfs]
      [ 9184.451716]  [<ffffffffc0e4b2c8>] dsl_pool_sync+0xb8/0x480 [zfs]
      [ 9184.451745]  [<ffffffffc0e6650f>] spa_sync+0x37f/0xb30 [zfs]
      [ 9184.451747]  [<ffffffffbb6aecd2>] ? default_wake_function+0x12/0x20
      [ 9184.451779]  [<ffffffffc0e77195>] txg_sync_thread+0x3a5/0x600 [zfs]
      [ 9184.451807]  [<ffffffffc0e76df0>] ? txg_delay+0x160/0x160 [zfs]
      [ 9184.451811]  [<ffffffffc0ac9fb1>] thread_generic_wrapper+0x71/0x80 
[spl]
      [ 9184.451815]  [<ffffffffc0ac9f40>] ? __thread_exit+0x20/0x20 [spl]
      [ 9184.451817]  [<ffffffffbb6a3e58>] kthread+0xd8/0xf0
      [ 9184.451819]  [<ffffffffbbe9b4df>] ret_from_fork+0x1f/0x40
      [ 9184.451821]  [<ffffffffbb6a3d80>] ? kthread_create_on_node+0x1e0/0x1e0
      [ 9184.451849] INFO: task mozStorage #2:21607 blocked for more than 120 
seconds.
      [ 9184.451851]       Tainted: P           OE   4.8.0-32-generic #34-Ubuntu
      [ 9184.451852] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
      [ 9184.451853] mozStorage #2   D ffffa23fe8a5bd38     0 21607  19750 
0x00000004
      [ 9184.451855]  ffffa23fe8a5bd38 00ffa240ee8feb40 ffffa240ecf72ac0 
ffffa2403803b900
      [ 9184.451857]  ffffffffbc2c02f7 ffffa23fe8a5c000 ffffa240aa940828 
ffffa240aa940800
      [ 9184.451858]  ffffa240aa940980 0000000000000000 ffffa23fe8a5bd50 
ffffffffbbe96b15
      [ 9184.451860] Call Trace:
      [ 9184.451861]  [<ffffffffbbe96b15>] schedule+0x35/0x80
      [ 9184.451866]  [<ffffffffc0acee70>] cv_wait_common+0x110/0x130 [spl]
      [ 9184.451868]  [<ffffffffbb6c7150>] ? wake_atomic_t_function+0x60/0x60
      [ 9184.451872]  [<ffffffffc0aceea5>] __cv_wait+0x15/0x20 [spl]
      [ 9184.451904]  [<ffffffffc0eb9759>] zil_commit.part.11+0x79/0x7a0 [zfs]
      [ 9184.451909]  [<ffffffffc0acf9f6>] ? tsd_hash_search.isra.0+0x46/0xa0 
[spl]
      [ 9184.451913]  [<ffffffffc0ad0214>] ? tsd_set+0x2b4/0x500 [spl]
      [ 9184.451914]  [<ffffffffbbe98a82>] ? mutex_lock+0x12/0x30
      [ 9184.451945]  [<ffffffffc0eb9e97>] zil_commit+0x17/0x20 [zfs]
      [ 9184.451975]  [<ffffffffc0eaf2aa>] zfs_fsync+0x7a/0xf0 [zfs]
      [ 9184.452005]  [<ffffffffc0ec5878>] zpl_fsync+0x68/0xa0 [zfs]
      [ 9184.452008]  [<ffffffffbb868b9b>] vfs_fsync_range+0x4b/0xb0
      [ 9184.452010]  [<ffffffffbb868c5d>] do_fsync+0x3d/0x70
      [ 9184.452011]  [<ffffffffbb868ef0>] SyS_fsync+0x10/0x20
      [ 9184.452013]  [<ffffffffbbe9b2b6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
      [ 9184.452023] INFO: task bitcoin-msghand:663 blocked for more than 120 
seconds.
      [ 9184.452024]       Tainted: P           OE   4.8.0-32-generic #34-Ubuntu
      [ 9184.452025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
      [ 9184.452026] bitcoin-msghand D ffffa23eeb23bd38     0   663  26994 
0x00000000
      [ 9184.452028]  ffffa23eeb23bd38 00ffa23eab434000 ffffa240ecf70000 
ffffa24095148000
      [ 9184.452030]  ffffa23eeb23bd20 ffffa23eeb23c000 ffffa240aa940828 
ffffa240aa940800
      [ 9184.452031]  ffffa240aa940980 0000000000000000 ffffa23eeb23bd50 
ffffffffbbe96b15
      [ 9184.452033] Call Trace:
      [ 9184.452034]  [<ffffffffbbe96b15>] schedule+0x35/0x80
      [ 9184.452039]  [<ffffffffc0acee70>] cv_wait_common+0x110/0x130 [spl]
      [ 9184.452041]  [<ffffffffbb6c7150>] ? wake_atomic_t_function+0x60/0x60
      [ 9184.452044]  [<ffffffffc0aceea5>] __cv_wait+0x15/0x20 [spl]
      [ 9184.452074]  [<ffffffffc0eb9759>] zil_commit.part.11+0x79/0x7a0 [zfs]
      [ 9184.452079]  [<ffffffffc0acf9f6>] ? tsd_hash_search.isra.0+0x46/0xa0 
[spl]
      [ 9184.452083]  [<ffffffffc0ad0214>] ? tsd_set+0x2b4/0x500 [spl]
      [ 9184.452084]  [<ffffffffbbe98a82>] ? mutex_lock+0x12/0x30
      [ 9184.452113]  [<ffffffffc0eb9e97>] zil_commit+0x17/0x20 [zfs]
      [ 9184.452141]  [<ffffffffc0eaf2aa>] zfs_fsync+0x7a/0xf0 [zfs]
      [ 9184.452168]  [<ffffffffc0ec5878>] zpl_fsync+0x68/0xa0 [zfs]
      [ 9184.452170]  [<ffffffffbb868b9b>] vfs_fsync_range+0x4b/0xb0
      [ 9184.452172]  [<ffffffffbb868c5d>] do_fsync+0x3d/0x70
      [ 9184.452173]  [<ffffffffbb868f13>] SyS_fdatasync+0x13/0x20
      [ 9184.452174]  [<ffffffffbbe9b2b6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
      [ 9305.221219] INFO: task txg_sync:11471 blocked for more than 120 
seconds.
      [ 9305.221223]       Tainted: P           OE   4.8.0-32-generic #34-Ubuntu
      [ 9305.221224] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
      [ 9305.221225] txg_sync        D ffffa240ab3a7aa8     0 11471      2 
0x00000000
      [ 9305.221229]  ffffa240ab3a7aa8 00ffffffbb6ade1f ffffa24095148000 
ffffa240e5ca5580
      [ 9305.221230]  0000000000000046 ffffa240ab3a8000 ffffa240ff359200 
7fffffffffffffff
      [ 9305.221232]  ffffa23d36cf9050 0000000000000001 ffffa240ab3a7ac0 
ffffffffbbe96b15
      [ 9305.221234] Call Trace:
      [ 9305.221239]  [<ffffffffbbe96b15>] schedule+0x35/0x80
      [ 9305.221241]  [<ffffffffbbe9a2ca>] schedule_timeout+0x22a/0x3f0
      [ 9305.221243]  [<ffffffffbb62c77e>] ? __switch_to+0x2ce/0x6c0
      [ 9305.221245]  [<ffffffffbb6c156c>] ? pick_next_task_fair+0x48c/0x4c0
      [ 9305.221247]  [<ffffffffbb6fc511>] ? ktime_get+0x41/0xb0
      [ 9305.221249]  [<ffffffffbbe96304>] io_schedule_timeout+0xa4/0x110
      [ 9305.221256]  [<ffffffffc0acee12>] cv_wait_common+0xb2/0x130 [spl]
      [ 9305.221258]  [<ffffffffbb6c7150>] ? wake_atomic_t_function+0x60/0x60
      [ 9305.221262]  [<ffffffffc0aceee8>] __cv_wait_io+0x18/0x20 [spl]
      [ 9305.221301]  [<ffffffffc0ebfcdd>] zio_wait+0xfd/0x1d0 [zfs]
      [ 9305.221328]  [<ffffffffc0e4b2c8>] dsl_pool_sync+0xb8/0x480 [zfs]
      [ 9305.221357]  [<ffffffffc0e6650f>] spa_sync+0x37f/0xb30 [zfs]
      [ 9305.221359]  [<ffffffffbb6aecd2>] ? default_wake_function+0x12/0x20
      [ 9305.221389]  [<ffffffffc0e77195>] txg_sync_thread+0x3a5/0x600 [zfs]
      [ 9305.221417]  [<ffffffffc0e76df0>] ? txg_delay+0x160/0x160 [zfs]
      [ 9305.221421]  [<ffffffffc0ac9fb1>] thread_generic_wrapper+0x71/0x80 
[spl]
      [ 9305.221424]  [<ffffffffc0ac9f40>] ? __thread_exit+0x20/0x20 [spl]
      [ 9305.221426]  [<ffffffffbb6a3e58>] kthread+0xd8/0xf0
      [ 9305.221428]  [<ffffffffbbe9b4df>] ret_from_fork+0x1f/0x40
      [ 9305.221430]  [<ffffffffbb6a3d80>] ? kthread_create_on_node+0x1e0/0x1e0
      [ 9305.221458] INFO: task mozStorage #2:21607 blocked for more than 120 
seconds.
      [ 9305.221459]       Tainted: P           OE   4.8.0-32-generic #34-Ubuntu
      [ 9305.221460] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
      [ 9305.221461] mozStorage #2   D ffffa23fe8a5bd38     0 21607  19750 
0x00000004
      [ 9305.221463]  ffffa23fe8a5bd38 00ffa240ee8feb40 ffffa240ecf72ac0 
ffffa2403803b900
      [ 9305.221465]  ffffffffbc2c02f7 ffffa23fe8a5c000 ffffa240aa940828 
ffffa240aa940800
      [ 9305.221466]  ffffa240aa940980 0000000000000000 ffffa23fe8a5bd50 
ffffffffbbe96b15
      [ 9305.221468] Call Trace:
      [ 9305.221469]  [<ffffffffbbe96b15>] schedule+0x35/0x80
      [ 9305.221474]  [<ffffffffc0acee70>] cv_wait_common+0x110/0x130 [spl]
      [ 9305.221475]  [<ffffffffbb6c7150>] ? wake_atomic_t_function+0x60/0x60
      [ 9305.221479]  [<ffffffffc0aceea5>] __cv_wait+0x15/0x20 [spl]
      [ 9305.221510]  [<ffffffffc0eb9759>] zil_commit.part.11+0x79/0x7a0 [zfs]
      [ 9305.221515]  [<ffffffffc0acf9f6>] ? tsd_hash_search.isra.0+0x46/0xa0 
[spl]
      [ 9305.221519]  [<ffffffffc0ad0214>] ? tsd_set+0x2b4/0x500 [spl]
      [ 9305.221520]  [<ffffffffbbe98a82>] ? mutex_lock+0x12/0x30
      [ 9305.221551]  [<ffffffffc0eb9e97>] zil_commit+0x17/0x20 [zfs]
      [ 9305.221582]  [<ffffffffc0eaf2aa>] zfs_fsync+0x7a/0xf0 [zfs]
      [ 9305.221611]  [<ffffffffc0ec5878>] zpl_fsync+0x68/0xa0 [zfs]
      [ 9305.221614]  [<ffffffffbb868b9b>] vfs_fsync_range+0x4b/0xb0
      [ 9305.221616]  [<ffffffffbb868c5d>] do_fsync+0x3d/0x70
      [ 9305.221618]  [<ffffffffbb868ef0>] SyS_fsync+0x10/0x20
      [ 9305.221619]  [<ffffffffbbe9b2b6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
      [ 9305.221621] INFO: task mozStorage #8:5507 blocked for more than 120 
seconds.
      [ 9305.221622]       Tainted: P           OE   4.8.0-32-generic #34-Ubuntu
      [ 9305.221623] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
      [ 9305.221624] mozStorage #8   D ffffa23f1573fc18     0  5507  19750 
0x00000004
      [ 9305.221625]  ffffa23f1573fc18 00000179907cd000 ffffa240ecf72ac0 
ffffa23f944b9c80
      [ 9305.221627]  ffffa240ec957188 ffffa23f15740000 ffffa240ff319200 
7fffffffffffffff
      [ 9305.221629]  ffffa23d0ac9a5e0 0000000000000001 ffffa23f1573fc30 
ffffffffbbe96b15
      [ 9305.221630] Call Trace:
      [ 9305.221631]  [<ffffffffbbe96b15>] schedule+0x35/0x80
      [ 9305.221633]  [<ffffffffbbe9a2ca>] schedule_timeout+0x22a/0x3f0
      [ 9305.221662]  [<ffffffffc0ebdb50>] ? zio_taskq_member.isra.6+0x90/0x90 
[zfs]
      [ 9305.221666]  [<ffffffffc0aca9ae>] ? taskq_dispatch_ent+0x4e/0x120 [spl]
      [ 9305.221668]  [<ffffffffbb6fc511>] ? ktime_get+0x41/0xb0
      [ 9305.221669]  [<ffffffffbbe96304>] io_schedule_timeout+0xa4/0x110
      [ 9305.221673]  [<ffffffffc0acee12>] cv_wait_common+0xb2/0x130 [spl]
      [ 9305.221674]  [<ffffffffbb6c7150>] ? wake_atomic_t_function+0x60/0x60
      [ 9305.221678]  [<ffffffffc0aceee8>] __cv_wait_io+0x18/0x20 [spl]
      [ 9305.221706]  [<ffffffffc0ebfcdd>] zio_wait+0xfd/0x1d0 [zfs]
      [ 9305.221736]  [<ffffffffc0eb9aa6>] zil_commit.part.11+0x3c6/0x7a0 [zfs]
      [ 9305.221740]  [<ffffffffc0acf9f6>] ? tsd_hash_search.isra.0+0x46/0xa0 
[spl]
      [ 9305.221768]  [<ffffffffc0eb9e97>] zil_commit+0x17/0x20 [zfs]
      [ 9305.221798]  [<ffffffffc0eaf2aa>] zfs_fsync+0x7a/0xf0 [zfs]
      [ 9305.221829]  [<ffffffffc0ec5878>] zpl_fsync+0x68/0xa0 [zfs]
      [ 9305.221833]  [<ffffffffbb868b9b>] vfs_fsync_range+0x4b/0xb0
      [ 9305.221834]  [<ffffffffbb868c5d>] do_fsync+0x3d/0x70
      [ 9305.221836]  [<ffffffffbb868ef0>] SyS_fsync+0x10/0x20
      [ 9305.221838]  [<ffffffffbbe9b2b6>] entry_SYSCALL_64_fastpath+0x1e/0xa8

  ProblemType: Bug
  DistroRelease: Ubuntu 16.10
  Package: zfs-dkms 0.6.5.8-0ubuntu4.1
  ProcVersionSignature: Ubuntu 4.8.0-32.34-generic 4.8.11
  Uname: Linux 4.8.0-32-generic x86_64
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
  ApportVersion: 2.20.3-0ubuntu8.2
  Architecture: amd64
  CurrentDesktop: Unity
  Date: Fri Jan  6 10:19:22 2017
  InstallationDate: Installed on 2015-12-12 (390 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Release amd64 (20151021)
  SourcePackage: zfs-linux
  UpgradeStatus: Upgraded to yakkety on 2016-10-17 (80 days ago)
  ---
  AlsaVersion: Advanced Linux Sound Architecture Driver Version 
k4.8.0-32-generic.
  ApportVersion: 2.20.3-0ubuntu8.2
  Architecture: amd64
  ArecordDevices:
   **** Liste der Hardware-Geräte (CAPTURE) ****
   Karte 0: PCH [HDA Intel PCH], Gerät 0: ALC269VC Analog [ALC269VC Analog]
     Sub-Geräte: 1/1
     Sub-Gerät #0: subdevice #0
  AudioDevicesInUse:
   BEN.        PID ZUGR.  BEFEHL
   /dev/snd/pcmC0D0p:   richter   20519 F...m pulseaudio
   /dev/snd/controlC0:  richter   20519 F.... pulseaudio
  Card0.Amixer.info:
   Card hw:0 'PCH'/'HDA Intel PCH at 0xd3610000 irq 31'
     Mixer name : 'Realtek ALC269VC'
     Components : 'HDA:10ec0269,17aac034,00100202 
HDA:80862806,80860101,00100000'
     Controls      : 30
     Simple ctrls  : 12
  CurrentDesktop: Unity
  DistroRelease: Ubuntu 16.10
  HibernationDevice: RESUME=UUID=1bb0bb3e-8148-4957-9673-9701bd571c0a
  InstallationDate: Installed on 2015-12-12 (390 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Release amd64 (20151021)
  MachineType: LENOVO 20221
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
  Package: linux (not installed)
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.8.0-32-generic 
root=UUID=791b47e7-915d-4d5a-bce0-d7e9b660a2ab ro 
rootflags=subvol=ubuntu-main-root
  ProcVersionSignature: Ubuntu 4.8.0-32.34-generic 4.8.11
  RelatedPackageVersions:
   linux-restricted-modules-4.8.0-32-generic N/A
   linux-backports-modules-4.8.0-32-generic  N/A
   linux-firmware                            1.161.1
  Tags:  yakkety
  Uname: Linux 4.8.0-32-generic x86_64
  UpgradeStatus: Upgraded to yakkety on 2016-10-17 (80 days ago)
  UserGroups: adm autopilot bumblebee cdrom dip libvirtd lp lpadmin plugdev 
sambashare saned sudo vboxusers
  _MarkForUpload: True
  dmi.bios.date: 07/12/2013
  dmi.bios.vendor: LENOVO
  dmi.bios.version: 71CN51WW(V1.21)
  dmi.board.asset.tag: No Asset Tag
  dmi.board.name: INVALID
  dmi.board.vendor: LENOVO
  dmi.board.version: 31900003WIN8 STD MLT
  dmi.chassis.asset.tag: No Asset Tag
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: Lenovo IdeaPad Z500 Touch
  dmi.modalias: 
dmi:bvnLENOVO:bvr71CN51WW(V1.21):bd07/12/2013:svnLENOVO:pn20221:pvrLenovoIdeaPadZ500Touch:rvnLENOVO:rnINVALID:rvr31900003WIN8STDMLT:cvnLENOVO:ct10:cvrLenovoIdeaPadZ500Touch:
  dmi.product.name: 20221
  dmi.product.version: Lenovo IdeaPad Z500 Touch
  dmi.sys.vendor: LENOVO

To manage notifications about this bug go to:
https://bugs.launchpad.net/zfs/+bug/1654517/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to