------- Comment From [email protected] 2016-08-22 13:11 EDT-------
(In reply to comment #16)
> Test kernel at http://people.canonical.com/~rtg/eeh-lp1602724/ with upstream
> commit c21377f8366c95440d533edbe47d070f662c62ef ('nvme: Suspend all queues
> before deletion') applied.

This test kernel is not OK; it stalls the workqueue:

[ 540.097661] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 540.103320] 1-...: (1 GPs behind) idle=d35/140000000000000/0 softirq=2385/2386 fqs=65
[ 540.103411] (detected by 11, t=5472 jiffies, g=1335, c=1334, q=793)
[ 540.103492] Task dump for CPU 1:
[ 540.103539] kworker/u32:1 D 0000000000000000 0 101 0 0x00000800
[ 540.103656] Call Trace:
[ 540.103692] [c00000017bc539c0] [c00000017bc53a00] 0xc00000017bc53a00 (unreliable)
[ 540.103805] [c00000017bc53a00] [d000000001614480] nvme_suspend_queue+0x30/0x150 [nvme]
[ 540.103914] [c00000017bc53a30] [d000000001616850] nvme_dev_disable+0x110/0x440 [nvme]
[ 540.104022] [c00000017bc53b10] [d000000001617e60] nvme_reset_work+0xe0/0x1120 [nvme]
[ 540.104132] [c00000017bc53c50] [c0000000000dd630] process_one_work+0x1e0/0x5a0
[ 540.104239] [c00000017bc53ce0] [c0000000000ddb84] worker_thread+0x194/0x680
[ 540.104331] [c00000017bc53d80] [c0000000000e6680] kthread+0x110/0x130
[ 540.104424] [c00000017bc53e30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
[ 604.094501] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 604.094699] 1-...: (1 GPs behind) idle=d35/140000000000000/0 softirq=2385/2386 fqs=82
[ 604.094700] (detected by 5, t=21472 jiffies, g=1335, c=1334, q=1283)
[ 604.094705] Task dump for CPU 1:

Can you provide the backported patch for verification?

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1602724

Title:
  Ubuntu 16.04 - Full EEH Recovery Support for NVMe devices

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  In Progress

Bug description:
  == Comment: #0 - Heitor Ricardo Alves de Siqueira <[email protected]> - 2016-07-12 12:54:27 ==
  The current nvme driver in the Ubuntu 16.04 kernel does not handle error
  recovery; we are missing some patches from the upstream nvme driver.
  We would like to ask Canonical to cherry-pick the following patches for
  the 16.04 kernel, if possible:
  * 9396dec916c0 ("nvme: use a work item to submit async event requests")
  * 79f2b358c9ba ("nvme: don't poll the CQ from the kthread")
  * 2d55cd5f511d ("nvme: replace the kthread with a per-device watchdog timer")
  * 9bf2b972afea ("NVMe: Fix reset/remove race")
  * c875a7093f04 ("nvme: Avoid reset work on watchdog timer function during error recovery")
  * a5229050b69c ("NVMe: Always use MSI/MSI-x interrupts")

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1602724/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

