Public bug reported:

For reference, here is the stack of systemd-udevd seen in the hang:

[ 1558.214013] INFO: task systemd-udevd:1778 blocked for more than 120 seconds.
[ 1558.214318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1558.214556] systemd-udevd   D 00003fff8dbdf7a0     0  1778      1 0x00040000
[ 1558.214637] Call Trace:
[ 1558.214673] [c000000004ad3790] [c0000000007aac20] schedule_timeout+0x180/0x2f0 (unreliable)
[ 1558.214779] [c000000004ad3960] [c0000000000158d0] __switch_to+0x200/0x350
[ 1558.214870] [c000000004ad39c0] [c0000000007adbb4] __schedule+0x414/0x9e0
[ 1558.214961] [c000000004ad3a90] [c0000000003b4e54] blk_mq_freeze_queue_wait+0x64/0xd0
[ 1558.215107] [c000000004ad3af0] [d000000034011964] nvme_revalidate_disk+0xd4/0x3a0 [nvme]
[ 1558.215386] [c000000004ad3b90] [c0000000003c2398] rescan_partitions+0x98/0x390
[ 1558.215508] [c000000004ad3c60] [c0000000003bb7ac] __blkdev_reread_part+0x9c/0xd0
[ 1558.215599] [c000000004ad3c90] [c0000000003bb818] blkdev_reread_part+0x38/0x70
[ 1558.215935] [c000000004ad3cc0] [c0000000003bc334] blkdev_ioctl+0x3b4/0xb80
[ 1558.216016] [c000000004ad3d20] [c0000000002cbcd0] block_ioctl+0x70/0x90
[ 1558.216114] [c000000004ad3d40] [c000000000296b38] do_vfs_ioctl+0x458/0x740
[ 1558.216192] [c000000004ad3dd0] [c000000000296ee4] SyS_ioctl+0xc4/0xe0
[ 1558.216275] [c000000004ad3e30] [c00000000000a17c] system_call+0x38/0xb4
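
When checking other logs for the same hang, a small script can pull the
symbol names out of a trace like the one above. The following is only a
sketch: the helper names (frames, is_queue_freeze_hang) are made up for
illustration, and the pattern assumes the powerpc frame layout shown here
(two 16-digit hex addresses followed by symbol+offset/size).

```python
import re

# One frame of a powerpc-style call trace:
#   [stack pointer] [return address] symbol+0xOFFSET/0xSIZE
# \s+ also matches a newline, so frames that a mail client wrapped after
# the address pair are still recognized.
FRAME_RE = re.compile(
    r"\[[0-9a-f]{16}\]\s+\[[0-9a-f]{16}\]\s+"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frames(log_text):
    """Return the symbol names found in a call trace, in order of appearance."""
    return [m.group("sym") for m in FRAME_RE.finditer(log_text)]


def is_queue_freeze_hang(log_text):
    """True if the trace shows a partition rescan waiting on a blk-mq freeze."""
    syms = frames(log_text)
    return "blk_mq_freeze_queue_wait" in syms and "rescan_partitions" in syms


# Three frames copied from the trace in this report; the first one is
# deliberately left wrapped to show that wrapping is tolerated.
sample = (
    "[ 1558.214961] [c000000004ad3a90] [c0000000003b4e54] \n"
    "blk_mq_freeze_queue_wait+0x64/0xd0\n"
    "[ 1558.215107] [c000000004ad3af0] [d000000034011964] "
    "nvme_revalidate_disk+0xd4/0x3a0 [nvme]\n"
    "[ 1558.215386] [c000000004ad3b90] [c0000000003c2398] "
    "rescan_partitions+0x98/0x390\n"
)
```

Calling is_queue_freeze_hang() on the text of a dmesg capture would flag
traces with the same two frames; it says nothing about the root cause.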

It appears that systemd-udevd is triggered every time HTX writes to the
boot sector (partition table) of the raw drive. Each trigger leads to a
partition table re-read, and the resulting revalidate calls expose the
problem with the blk-mq queue freeze (blk_mq_freeze_queue_wait in the
trace above). With a partition table on each drive, HTX no longer writes
to the partition table, so systemd-udevd is no longer triggered to
re-read it and try to freeze I/O.
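
The blkdev_ioctl -> blkdev_reread_part frames in the trace correspond to
the BLKRRPART ioctl, so the re-read can in principle be requested by hand
instead of waiting for udev. The sketch below is an assumption about how
one might do that from Python, not something taken from the HTX test; the
function names are hypothetical. Note that the request number differs on
powerpc, where the "no data" direction is encoded as 1 rather than 0.

```python
import fcntl
import os
import platform


def blkrrpart_request(machine=None):
    """Return the BLKRRPART ioctl number, _IO(0x12, 95), for a machine type."""
    machine = machine or platform.machine()
    # Most architectures encode "no data transfer" as direction 0. powerpc
    # encodes it as 1 in the top three bits, which adds 0x20000000.
    # Other architectures with a non-default layout are not handled here.
    direction = 0x20000000 if machine.startswith("ppc") else 0
    return direction | (0x12 << 8) | 95


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table of a whole-disk device.

    Needs root. On an affected kernel this call may block in the same way
    as the systemd-udevd task in this report.
    """
    fd = os.open(device, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, blkrrpart_request())
    finally:
        os.close(fd)
```

For example, reread_partition_table("/dev/nvme0n1") would target the
first NVMe namespace; that device path is illustrative only.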

The fix for this is provided by the following upstream commit:

966d2b0 percpu-refcount: fix reference leak during percpu-atomic transition

which needs to be pulled into 16.04 (as well as newer releases).

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-148242 severity-critical 
targetmilestone-inin16042

** Changed in: ubuntu
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

** Package changed: ubuntu => linux (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1662673

Title:
  systemd-udevd hung in blk_mq_freeze_queue_wait testing unpartitioned
  NVMe drive

Status in linux package in Ubuntu:
  New
