I am hitting this bug on 16.04 LTS running kernel 4.13.0-16. udev is stuck in the following stack:

[<ffffffff906309eb>] blk_mq_freeze_queue_wait+0x4b/0xb0
[<ffffffff90631f4a>] blk_mq_freeze_queue+0x1a/0x20
[<ffffffffc03d676a>] __nvme_revalidate_disk+0x7a/0x3f0 [nvme_core]
[<ffffffffc03d7bc3>] nvme_revalidate_disk+0x53/0x90 [nvme_core]
[<ffffffff9063b72d>] rescan_partitions+0x8d/0x330
[<ffffffff906374f5>] __blkdev_reread_part+0x65/0x70
[<ffffffff90637523>] blkdev_reread_part+0x23/0x40
[<ffffffff90637ef7>] blkdev_ioctl+0x387/0x910
[<ffffffff9049253d>] block_ioctl+0x3d/0x50
[<ffffffff90467521>] do_vfs_ioctl+0xa1/0x5f0
[<ffffffff90467ae9>] SyS_ioctl+0x79/0x90
[<ffffffff90b0edfb>] entry_SYSCALL_64_fastpath+0x1e/0xa9
[<ffffffffffffffff>] 0xffffffffffffffff

And the process info:

4 D root 797 1 0 80 0 - 11661 blk_mq 03:04 ? 00:00:02 /lib/systemd/systemd-udevd

We have a bunch of read-only parted jobs backing up behind the kernel hang (and possibly causing it in the first place):

root 17317 1 0 03:17 ? 00:00:00 /sbin/parted.rw -m -s -- /dev/nvme0n1 unit B print
root 36839 36832 0 05:39 ? 00:00:00 /sbin/parted.rw -m -s -- /dev/nvme0n1 unit B print
root 37181 37143 0 05:50 ? 00:00:00 /sbin/blockdev --getsize64 /dev/nvme0n1
root 37340 37333 0 06:00 ? 00:00:00 /sbin/parted.rw -m -s -- /dev/nvme0n1 unit B print
root 38585 38549 0 08:29 ? 00:00:00 /sbin/blockdev --getsize64 /dev/nvme0n1
root 38742 38735 0 08:39 ? 00:00:00 /sbin/parted.rw -m -s -- /dev/nvme0n1 unit B print
root 40022 39986 0 11:14 ? 00:00:00 /sbin/blockdev --getsize64 /dev/nvme0n1
root 40184 40177 0 11:24 ? 00:00:00 /sbin/parted.rw -m -s -- /dev/nvme0n1 unit B print
root 41456 41419 0 13:59 ? 00:00:00 /sbin/blockdev --getsize64 /dev/nvme0n1
root 41615 41608 0 14:09 ? 00:00:00 /sbin/parted.rw -m -s -- /dev/nvme0n1 unit B print
root 42905 42869 0 16:44 ? 00:00:00 /sbin/blockdev --getsize64 /dev/nvme0n1
root 43062 43054 0 16:54 ? 00:00:00 /sbin/parted.rw -m -s -- /dev/nvme0n1 unit B print

These are NVME drives with a GPT and two partitions. Let me know if you need more info.

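For anyone trying to confirm they are seeing the same hang, here is a rough Python sketch that parses a stack in the format shown above and checks whether the innermost frame is blk_mq_freeze_queue_wait reached via rescan_partitions. The helper names (parse_stack, is_freeze_wait_hang, read_stack) are made up for illustration. Reading /proc/<pid>/stack normally requires root, and the frame format assumed here is the 4.x-style "[<addr>] symbol+0xoff/0xsize [module]" layout from this comment; other kernels may print it differently.

```python
import re

# One frame as printed in the comment above, e.g.
#   [<ffffffffc03d676a>] __nvme_revalidate_disk+0x7a/0x3f0 [nvme_core]
# The trailing "[module]" part is optional. The final
# "[<ffffffffffffffff>] 0xffffffffffffffff" marker has no symbol and is skipped.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def parse_stack(text):
    """Return (symbol, module) tuples, innermost frame first."""
    return [(m.group("symbol"), m.group("module")) for m in FRAME_RE.finditer(text)]


def is_freeze_wait_hang(text):
    """True if the task sleeps in blk_mq_freeze_queue_wait under rescan_partitions."""
    symbols = [symbol for symbol, _ in parse_stack(text)]
    return (
        bool(symbols)
        and symbols[0] == "blk_mq_freeze_queue_wait"
        and "rescan_partitions" in symbols
    )


def read_stack(pid):
    """Hypothetical helper: read the kernel stack of a task (usually needs root)."""
    with open(f"/proc/{pid}/stack") as handle:
        return handle.read()


if __name__ == "__main__":
    sample = """\
[<ffffffff906309eb>] blk_mq_freeze_queue_wait+0x4b/0xb0
[<ffffffff90631f4a>] blk_mq_freeze_queue+0x1a/0x20
[<ffffffffc03d676a>] __nvme_revalidate_disk+0x7a/0x3f0 [nvme_core]
[<ffffffff9063b72d>] rescan_partitions+0x8d/0x330
[<ffffffffffffffff>] 0xffffffffffffffff
"""
    print(is_freeze_wait_hang(sample))
```

On a live system one would call is_freeze_wait_hang(read_stack(797)) with the PID of the D-state systemd-udevd (797 in the ps line above). A match only says the stack looks the same; it does not prove the same root cause.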
-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1662673

Title:
  systemd-udevd hung in blk_mq_freeze_queue_wait testing unpartitioned NVMe drive

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Yakkety:
  Fix Released
Status in linux source package in Zesty:
  Fix Released

Bug description:
  For reference, here is the stack of systemd-udevd seen in the hang:

  [ 1558.214013] INFO: task systemd-udevd:1778 blocked for more than 120 seconds.
  [ 1558.214318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1558.214556] systemd-udevd D 00003fff8dbdf7a0 0 1778 1 0x00040000
  [ 1558.214637] Call Trace:
  [ 1558.214673] [c000000004ad3790] [c0000000007aac20] schedule_timeout+0x180/0x2f0 (unreliable)
  [ 1558.214779] [c000000004ad3960] [c0000000000158d0] __switch_to+0x200/0x350
  [ 1558.214870] [c000000004ad39c0] [c0000000007adbb4] __schedule+0x414/0x9e0
  [ 1558.214961] [c000000004ad3a90] [c0000000003b4e54] blk_mq_freeze_queue_wait+0x64/0xd0
  [ 1558.215107] [c000000004ad3af0] [d000000034011964] nvme_revalidate_disk+0xd4/0x3a0 [nvme]
  [ 1558.215386] [c000000004ad3b90] [c0000000003c2398] rescan_partitions+0x98/0x390
  [ 1558.215508] [c000000004ad3c60] [c0000000003bb7ac] __blkdev_reread_part+0x9c/0xd0
  [ 1558.215599] [c000000004ad3c90] [c0000000003bb818] blkdev_reread_part+0x38/0x70
  [ 1558.215935] [c000000004ad3cc0] [c0000000003bc334] blkdev_ioctl+0x3b4/0xb80
  [ 1558.216016] [c000000004ad3d20] [c0000000002cbcd0] block_ioctl+0x70/0x90
  [ 1558.216114] [c000000004ad3d40] [c000000000296b38] do_vfs_ioctl+0x458/0x740
  [ 1558.216192] [c000000004ad3dd0] [c000000000296ee4] SyS_ioctl+0xc4/0xe0
  [ 1558.216275] [c000000004ad3e30] [c00000000000a17c] system_call+0x38/0xb4

  It appears that systemd-udevd is triggering every time HTX writes to the boot sector (partition table) of the raw drive, and this is causing the revalidate calls which expose the issue with the block driver mq freeze. With a partition table on each drive, HTX will no longer be writing the partition table and no longer triggering systemd to re-read the partition table and try to freeze I/O.

  The fix for this is provided by the following upstream commit:

  966d2b0 percpu-refcount: fix reference leak during percpu-atomic transition

  which needs to be pulled into 16.04 (as well as newer releases).
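
To spot further occurrences in saved kernel logs, the hung-task line quoted in the bug description can be matched with a short Python sketch like the one below. The function name find_hung_tasks is hypothetical, and the pattern assumes the "INFO: task <comm>:<pid> blocked for more than <N> seconds." wording shown in the trace above; other kernel versions may word this message differently.

```python
import re

# Pattern modelled on the dmesg line quoted in the bug description:
#   INFO: task systemd-udevd:1778 blocked for more than 120 seconds.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(log_text):
    """Return (command, pid, seconds) for every hung-task report in log_text."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    line = (
        "[ 1558.214013] INFO: task systemd-udevd:1778 "
        "blocked for more than 120 seconds."
    )
    for comm, pid, secs in find_hung_tasks(line):
        print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

This only finds the reports; whether a given report is this bug still depends on the call trace that follows it containing blk_mq_freeze_queue_wait.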

