Public bug reported:

When we run I/O-intensive tasks concurrently with CPU addition/removal,
the block layer may hang, stalling the entire machine.

The backtrace below is one of the symptoms:

[12747.111149] ---[ end trace b4d8d720952460b5 ]---
[12747.126885] Trying to free IRQ 357 from IRQ context!
[12747.146930] ------------[ cut here ]------------
[12747.166674] WARNING: at 
/build/linux-iLHNl3/linux-4.4.0/kernel/irq/manage.c:1438
[12747.184069] Modules linked in: minix nls_iso8859_1 rpcsec_gss_krb5 
auth_rpcgss nfsv4 nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) 
iw_cm(OE) configfs ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) 
mlx4_ib(OE) ib_sa(OE) ib_mad(OE) ib_core(OE) ib_addr(OE) mlx4_en(OE) 
mlx4_core(OE) binfmt_misc xfs joydev input_leds mac_hid ofpart cmdlinepart 
powernv_flash ipmi_powernv mtd ipmi_msghandler at24 opal_prd powernv_rng 
ibmpowernv uio_pdrv_genirq uio sunrpc knem(OE) autofs4 btrfs xor raid6_pq 
hid_generic usbhid hid uas usb_storage nouveau ast bnx2x i2c_algo_bit ttm 
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mlx5_core(OE) ahci 
drm mdio libcrc32c mlx_compat(OE) libahci vxlan nvme ip6_udp_tunnel udp_tunnel
[12747.349013] CPU: 80 PID: 0 Comm: swapper/80 Tainted: G        W  OEL  
4.4.0-21-generic #37-Ubuntu
[12747.369046] task: c000000f1fab89b0 ti: c000000f1fb6c000 task.ti: 
c000000f1fb6c000
[12747.404848] NIP: c000000000131888 LR: c000000000131884 CTR: 00000000300303f0
[12747.808333] REGS: c000000f1fb6e550 TRAP: 0700   Tainted: G        W  OEL   
(4.4.0-21-generic)
[12747.867658] MSR: 9000000100029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28022222  
XER: 20000000
[12747.884783] CFAR: c000000000aea8f4 SOFTE: 1 
GPR00: c000000000131884 c000000f1fb6e7d0 c0000000015b4200 0000000000000028 
GPR04: c000000f2a409c50 c000000f2a41b4e0 0000000f29480000 00000000000033da 
GPR08: 0000000000000007 c000000000f8b27c 0000000f29480000 9000000100001003 
GPR12: 0000000000002200 c000000007b6f800 c000000f2a40a938 0000000000000100 
GPR16: c000000f11480000 0000000000003a98 0000000000000000 0000000000000000 
GPR20: 0000000000000000 d000000009521008 d0000000095146a0 fffffffffffff000 
GPR24: c000000004a19ef0 0000000000000000 0000000000000003 000000000000007d 
GPR28: 0000000000000165 c000000eefeb1800 c000000eef830600 0000000000000165 
[12748.243270] NIP [c000000000131888] __free_irq+0x238/0x370
[12748.254089] LR [c000000000131884] __free_irq+0x234/0x370
[12748.269738] Call Trace:
[12748.286740] [c000000f1fb6e7d0] [c000000000131884] __free_irq+0x234/0x370 
(unreliable)
[12748.289687] [c000000f1fb6e860] [c000000000131af8] free_irq+0x88/0xb0
[12748.304594] [c000000f1fb6e890] [d000000009514528] 
nvme_suspend_queue+0xc8/0x150 [nvme]
[12748.333825] [c000000f1fb6e8c0] [d00000000951681c] 
nvme_dev_disable+0x3fc/0x400 [nvme]
[12748.340913] [c000000f1fb6e9a0] [d000000009516ae4] nvme_timeout+0xe4/0x260 
[nvme]
[12748.357136] [c000000f1fb6ea60] [c000000000548a34] 
blk_mq_rq_timed_out+0x64/0x110
[12748.383939] [c000000f1fb6ead0] [c00000000054c540] bt_for_each+0x160/0x170
[12748.399292] [c000000f1fb6eb40] [c00000000054d4e8] 
blk_mq_queue_tag_busy_iter+0x78/0x110
[12748.402665] [c000000f1fb6eb90] [c000000000547358] blk_mq_rq_timer+0x48/0x140
[12748.438649] [c000000f1fb6ebd0] [c00000000014a13c] call_timer_fn+0x5c/0x1c0
[12748.468126] [c000000f1fb6ec60] [c00000000014a5fc] 
run_timer_softirq+0x31c/0x3f0
[12748.483367] [c000000f1fb6ed30] [c0000000000beb78] __do_softirq+0x188/0x3e0
[12748.498378] [c000000f1fb6ee20] [c0000000000bf048] irq_exit+0xc8/0x100
[12748.501048] [c000000f1fb6ee40] [c00000000001f954] timer_interrupt+0xa4/0xe0
[12748.516377] [c000000f1fb6ee70] [c000000000002714] 
decrementer_common+0x114/0x180
[12748.547282] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
[12748.547282]     LR = arch_local_irq_restore+0x74/0x90
[12748.574141] [c000000f1fb6f160] [0000000000000001] 0x1 (unreliable)
[12748.592405] [c000000f1fb6f180] [c000000000aedc3c] dump_stack+0xd0/0xf0
[12748.596461] [c000000f1fb6f1c0] [c0000000001006fc] dequeue_task_idle+0x5c/0x90
[12748.611532] [c000000f1fb6f230] [c0000000000f6080] deactivate_task+0xc0/0x130
[12748.627685] [c000000f1fb6f270] [c000000000adcb10] __schedule+0x440/0x990
[12748.654416] [c000000f1fb6f300] [c000000000add0a8] schedule+0x48/0xc0
[12748.670558] [c000000f1fb6f330] [c000000000ae1474] 
schedule_timeout+0x274/0x350
[12748.673485] [c000000f1fb6f420] [c000000000ade23c] wait_for_common+0xec/0x240
[12748.699192] [c000000f1fb6f4a0] [c0000000000e6908] kthread_stop+0x88/0x210
[12748.718385] [c000000f1fb6f4e0] [d000000009514240] 
nvme_dev_list_remove+0x90/0x110 [nvme]
[12748.748925] [c000000f1fb6f510] [d000000009516498] 
nvme_dev_disable+0x78/0x400 [nvme]
[12748.752112] [c000000f1fb6f5f0] [d000000009516ae4] nvme_timeout+0xe4/0x260 
[nvme]
[12748.775395] [c000000f1fb6f6b0] [c000000000548a34] 
blk_mq_rq_timed_out+0x64/0x110
[12748.821069] [c000000f1fb6f720] [c00000000054c540] bt_for_each+0x160/0x170
[12748.851733] [c000000f1fb6f790] [c00000000054d4e8] 
blk_mq_queue_tag_busy_iter+0x78/0x110
[12748.883093] [c000000f1fb6f7e0] [c000000000547358] blk_mq_rq_timer+0x48/0x140
[12748.918348] [c000000f1fb6f820] [c00000000014a13c] call_timer_fn+0x5c/0x1c0
[12748.934743] [c000000f1fb6f8b0] [c00000000014a5fc] 
run_timer_softirq+0x31c/0x3f0
[12748.938084] [c000000f1fb6f980] [c0000000000beb78] __do_softirq+0x188/0x3e0
[12748.960815] [c000000f1fb6fa70] [c0000000000bf048] irq_exit+0xc8/0x100
[12748.992175] [c000000f1fb6fa90] [c00000000001f954] timer_interrupt+0xa4/0xe0
[12749.019299] [c000000f1fb6fac0] [c000000000002714] 
decrementer_common+0x114/0x180
[12749.037168] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
[12749.037168]     LR = arch_local_irq_restore+0x74/0x90
[12749.079044] [c000000f1fb6fdb0] [c000000f2a41d680] 0xc000000f2a41d680 
(unreliable)
[12749.081736] [c000000f1fb6fdd0] [c000000000909a28] 
cpuidle_enter_state+0x1a8/0x410
[12749.127094] [c000000f1fb6fe30] [c000000000119a88] call_cpuidle+0x78/0xd0
[12749.144435] [c000000f1fb6fe70] [c000000000119e5c] 
cpu_startup_entry+0x37c/0x480
[12749.166156] [c000000f1fb6ff30] [c00000000004563c] start_secondary+0x33c/0x360
[12749.186929] [c000000f1fb6ff90] [c000000000008b6c] 
start_secondary_prolog+0x10/0x14
[12749.223828] Instruction dump:
[12749.223856] 4e800020 4bf83a5d 60000000 4bffff64 4bf83a51 60000000 4bffffa8 
3c62ff7b 
[12749.233245] 7f84e378 38630fe0 489b900d 60000000 <0fe00000> 4bfffe20 7d2903a6 
387d0118 
[12749.298371] ---[ end trace b4d8d720952460b6 ]---

== Comment: #184 - Gabriel Krisman Bertazi <gbert...@br.ibm.com> - 2016-07-29 
12:55:48 ==
I got it figured out.  The nvme driver is not playing nice with the block 
layer's timeout infrastructure: the timeout code goes into a live lock, 
waiting for the queue to be released.  CPU hotplug, on the other hand, which 
is holding the queue freeze lock at the time, is waiting for an outstanding 
request to time out (or complete).  That request, in turn, is stuck in the 
device and requires a reset triggered by a timeout, which never happens 
because of the live lock.
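The circular wait described above can be sketched as follows.  This is a
simplified, non-compilable illustration, not the actual kernel source; the
function names approximate the 4.4-era blk-mq code paths seen in the trace:

```c
/*
 * Sketch of the live lock (NOT real kernel code).
 *
 * Thread A (CPU hotplug) freezes the queue, then waits for every
 * outstanding request to complete or time out.
 */
void cpu_hotplug_path(struct request_queue *q)
{
        blk_mq_freeze_queue_start(q);  /* begin queue freeze */
        blk_mq_freeze_queue_wait(q);   /* sleep until all requests drain ...  */
        /* ... but one request is stuck in the NVMe device, and can only be
         * reclaimed by the timeout handler resetting the controller. */
}

/*
 * Thread B (block timeout worker) must take a queue reference before it
 * scans for expired requests, and that acquisition blocks while the
 * queue is freezing.
 */
void blk_mq_rq_timer(unsigned long data)
{
        struct request_queue *q = (struct request_queue *)data;

        if (blk_queue_enter(q, true))  /* blocked while queue is freezing */
                return;                /* => nvme_timeout() never runs, the
                                        *    stuck request is never aborted,
                                        *    and Thread A waits forever. */
        /* scan tags, fire nvme_timeout() for expired requests ... */
}
```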

I haven't determined why the request gets stuck inside the device in the
first place; that could even be caused by the Leaf firmware itself.  I also
see some successful timeouts triggered under normal conditions.  In the
failure case we should be able to abort the request normally, but that goes
through the timeout infrastructure, which is blocked during CPU hotplug
events.

I have a quirk to fully recover after the failure by forcing a reset of the
stuck I/O, which lets the CPU hotplug complete and the block layer recover.
I have a machine hitting the failure every few minutes in a loop, and it
recovers from it with my patch.

Patch submitted to linux-block:

https://marc.info/?l=linux-block&m=146976739016592&w=2

== Comment: #207 - Gabriel Krisman Bertazi <gbert...@br.ibm.com> - 2016-09-05 
09:13:51 ==
Canonical,

This is fixed by:

e57690fe009b ("blk-mq: don't overwrite rq->mq_ctx")
0e87e58bf60e ("blk-mq: improve warning for running a queue on the wrong CPU")
71f79fb3179e ("blk-mq: Allow timeouts to run while queue is freezing")

These apply cleanly on top of your kernel.
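For reference, the core idea of the last commit, as I recall it, is roughly
the following.  This is a simplified sketch from memory, not the verbatim
upstream patch:

```c
/* Sketch of the fix in 71f79fb3179e (not the verbatim patch). */
static void blk_mq_timeout_work(struct work_struct *work)
{
        struct request_queue *q =
                container_of(work, struct request_queue, timeout_work);

        /*
         * Don't use blk_queue_enter() here: its tryget_live-style check
         * fails once a freeze has started, which recreates the live
         * lock.  A bare percpu_ref_tryget() still succeeds during a
         * freeze as long as requests are in flight, so timeouts can
         * fire, the controller can be reset, and the freeze can finish.
         */
        if (!percpu_ref_tryget(&q->q_usage_counter))
                return;

        /* ... walk the tag sets and time out expired requests ... */

        blk_queue_exit(q);
}
```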

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: Confirmed


** Tags: architecture-ppc64le bugnameltc-140718 severity-critical 
targetmilestone-inin16041


-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1620317

Title:
  ISST-LTE:pNV: system ben is hung during ST (nvme)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1620317/+subscriptions
