** Changed in: linux (Ubuntu Xenial)
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1620317

Title:
  ISST-LTE:pNV: system ben is hung during ST (nvme)

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Yakkety:
  Fix Released

Bug description:
  When we run I/O-intensive tasks concurrently with CPU addition/removal,
  the block layer may hang, stalling the entire machine.

  The backtrace below is one of the symptoms:

  [12747.111149] ---[ end trace b4d8d720952460b5 ]---
  [12747.126885] Trying to free IRQ 357 from IRQ context!
  [12747.146930] ------------[ cut here ]------------
  [12747.166674] WARNING: at 
/build/linux-iLHNl3/linux-4.4.0/kernel/irq/manage.c:1438
  [12747.184069] Modules linked in: minix nls_iso8859_1 rpcsec_gss_krb5 
auth_rpcgss nfsv4 nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) 
iw_cm(OE) configfs ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) 
mlx4_ib(OE) ib_sa(OE) ib_mad(OE) ib_core(OE) ib_addr(OE) mlx4_en(OE) 
mlx4_core(OE) binfmt_misc xfs joydev input_leds mac_hid ofpart cmdlinepart 
powernv_flash ipmi_powernv mtd ipmi_msghandler at24 opal_prd powernv_rng 
ibmpowernv uio_pdrv_genirq uio sunrpc knem(OE) autofs4 btrfs xor raid6_pq 
hid_generic usbhid hid uas usb_storage nouveau ast bnx2x i2c_algo_bit ttm 
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mlx5_core(OE) ahci 
drm mdio libcrc32c mlx_compat(OE) libahci vxlan nvme ip6_udp_tunnel udp_tunnel
  [12747.349013] CPU: 80 PID: 0 Comm: swapper/80 Tainted: G        W  OEL  
4.4.0-21-generic #37-Ubuntu
  [12747.369046] task: c000000f1fab89b0 ti: c000000f1fb6c000 task.ti: 
c000000f1fb6c000
  [12747.404848] NIP: c000000000131888 LR: c000000000131884 CTR: 
00000000300303f0
  [12747.808333] REGS: c000000f1fb6e550 TRAP: 0700   Tainted: G        W  OEL   
(4.4.0-21-generic)
  [12747.867658] MSR: 9000000100029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28022222  
XER: 20000000
  [12747.884783] CFAR: c000000000aea8f4 SOFTE: 1 
  GPR00: c000000000131884 c000000f1fb6e7d0 c0000000015b4200 0000000000000028 
  GPR04: c000000f2a409c50 c000000f2a41b4e0 0000000f29480000 00000000000033da 
  GPR08: 0000000000000007 c000000000f8b27c 0000000f29480000 9000000100001003 
  GPR12: 0000000000002200 c000000007b6f800 c000000f2a40a938 0000000000000100 
  GPR16: c000000f11480000 0000000000003a98 0000000000000000 0000000000000000 
  GPR20: 0000000000000000 d000000009521008 d0000000095146a0 fffffffffffff000 
  GPR24: c000000004a19ef0 0000000000000000 0000000000000003 000000000000007d 
  GPR28: 0000000000000165 c000000eefeb1800 c000000eef830600 0000000000000165 
  [12748.243270] NIP [c000000000131888] __free_irq+0x238/0x370
  [12748.254089] LR [c000000000131884] __free_irq+0x234/0x370
  [12748.269738] Call Trace:
  [12748.286740] [c000000f1fb6e7d0] [c000000000131884] __free_irq+0x234/0x370 
(unreliable)
  [12748.289687] [c000000f1fb6e860] [c000000000131af8] free_irq+0x88/0xb0
  [12748.304594] [c000000f1fb6e890] [d000000009514528] 
nvme_suspend_queue+0xc8/0x150 [nvme]
  [12748.333825] [c000000f1fb6e8c0] [d00000000951681c] 
nvme_dev_disable+0x3fc/0x400 [nvme]
  [12748.340913] [c000000f1fb6e9a0] [d000000009516ae4] nvme_timeout+0xe4/0x260 
[nvme]
  [12748.357136] [c000000f1fb6ea60] [c000000000548a34] 
blk_mq_rq_timed_out+0x64/0x110
  [12748.383939] [c000000f1fb6ead0] [c00000000054c540] bt_for_each+0x160/0x170
  [12748.399292] [c000000f1fb6eb40] [c00000000054d4e8] 
blk_mq_queue_tag_busy_iter+0x78/0x110
  [12748.402665] [c000000f1fb6eb90] [c000000000547358] 
blk_mq_rq_timer+0x48/0x140
  [12748.438649] [c000000f1fb6ebd0] [c00000000014a13c] call_timer_fn+0x5c/0x1c0
  [12748.468126] [c000000f1fb6ec60] [c00000000014a5fc] 
run_timer_softirq+0x31c/0x3f0
  [12748.483367] [c000000f1fb6ed30] [c0000000000beb78] __do_softirq+0x188/0x3e0
  [12748.498378] [c000000f1fb6ee20] [c0000000000bf048] irq_exit+0xc8/0x100
  [12748.501048] [c000000f1fb6ee40] [c00000000001f954] timer_interrupt+0xa4/0xe0
  [12748.516377] [c000000f1fb6ee70] [c000000000002714] 
decrementer_common+0x114/0x180
  [12748.547282] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
  [12748.547282]     LR = arch_local_irq_restore+0x74/0x90
  [12748.574141] [c000000f1fb6f160] [0000000000000001] 0x1 (unreliable)
  [12748.592405] [c000000f1fb6f180] [c000000000aedc3c] dump_stack+0xd0/0xf0
  [12748.596461] [c000000f1fb6f1c0] [c0000000001006fc] 
dequeue_task_idle+0x5c/0x90
  [12748.611532] [c000000f1fb6f230] [c0000000000f6080] 
deactivate_task+0xc0/0x130
  [12748.627685] [c000000f1fb6f270] [c000000000adcb10] __schedule+0x440/0x990
  [12748.654416] [c000000f1fb6f300] [c000000000add0a8] schedule+0x48/0xc0
  [12748.670558] [c000000f1fb6f330] [c000000000ae1474] 
schedule_timeout+0x274/0x350
  [12748.673485] [c000000f1fb6f420] [c000000000ade23c] 
wait_for_common+0xec/0x240
  [12748.699192] [c000000f1fb6f4a0] [c0000000000e6908] kthread_stop+0x88/0x210
  [12748.718385] [c000000f1fb6f4e0] [d000000009514240] 
nvme_dev_list_remove+0x90/0x110 [nvme]
  [12748.748925] [c000000f1fb6f510] [d000000009516498] 
nvme_dev_disable+0x78/0x400 [nvme]
  [12748.752112] [c000000f1fb6f5f0] [d000000009516ae4] nvme_timeout+0xe4/0x260 
[nvme]
  [12748.775395] [c000000f1fb6f6b0] [c000000000548a34] 
blk_mq_rq_timed_out+0x64/0x110
  [12748.821069] [c000000f1fb6f720] [c00000000054c540] bt_for_each+0x160/0x170
  [12748.851733] [c000000f1fb6f790] [c00000000054d4e8] 
blk_mq_queue_tag_busy_iter+0x78/0x110
  [12748.883093] [c000000f1fb6f7e0] [c000000000547358] 
blk_mq_rq_timer+0x48/0x140
  [12748.918348] [c000000f1fb6f820] [c00000000014a13c] call_timer_fn+0x5c/0x1c0
  [12748.934743] [c000000f1fb6f8b0] [c00000000014a5fc] 
run_timer_softirq+0x31c/0x3f0
  [12748.938084] [c000000f1fb6f980] [c0000000000beb78] __do_softirq+0x188/0x3e0
  [12748.960815] [c000000f1fb6fa70] [c0000000000bf048] irq_exit+0xc8/0x100
  [12748.992175] [c000000f1fb6fa90] [c00000000001f954] timer_interrupt+0xa4/0xe0
  [12749.019299] [c000000f1fb6fac0] [c000000000002714] 
decrementer_common+0x114/0x180
  [12749.037168] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
  [12749.037168]     LR = arch_local_irq_restore+0x74/0x90
  [12749.079044] [c000000f1fb6fdb0] [c000000f2a41d680] 0xc000000f2a41d680 
(unreliable)
  [12749.081736] [c000000f1fb6fdd0] [c000000000909a28] 
cpuidle_enter_state+0x1a8/0x410
  [12749.127094] [c000000f1fb6fe30] [c000000000119a88] call_cpuidle+0x78/0xd0
  [12749.144435] [c000000f1fb6fe70] [c000000000119e5c] 
cpu_startup_entry+0x37c/0x480
  [12749.166156] [c000000f1fb6ff30] [c00000000004563c] 
start_secondary+0x33c/0x360
  [12749.186929] [c000000f1fb6ff90] [c000000000008b6c] 
start_secondary_prolog+0x10/0x14
  [12749.223828] Instruction dump:
  [12749.223856] 4e800020 4bf83a5d 60000000 4bffff64 4bf83a51 60000000 4bffffa8 
3c62ff7b 
  [12749.233245] 7f84e378 38630fe0 489b900d 60000000 <0fe00000> 4bfffe20 
7d2903a6 387d0118 
  [12749.298371] ---[ end trace b4d8d720952460b6 ]---

  == Comment: #184 - Gabriel Krisman Bertazi <gbert...@br.ibm.com> - 2016-07-29 12:55:48 ==
  I got it figured out.  The nvme driver is not playing nice with the
  block timeout infrastructure: the timeout code goes into a live lock,
  waiting for the queue to be released.  CPU hotplug, on the other hand,
  which is holding the queue freeze lock at the time, is waiting for an
  outstanding request to time out (or complete).  That request, in turn,
  is stuck in the device and requires a reset triggered by a timeout,
  which never happens because of the live lock.

  I don't yet know why the request gets stuck inside the device,
  requiring a timeout; it could even be caused by the Leaf firmware
  itself.  I also see some successful timeouts triggered under normal
  conditions.  In the failure case we should be able to abort the
  request normally, but that happens via the timeout infrastructure,
  which is blocked during CPU hotplug events.

  I have a quirk to fully recover after the failure by forcing a reset
  of the stuck I/O, which allows the CPU hotplug to complete and the
  block layer to recover.  I have a machine hitting the failure every
  few minutes in a loop and recovering from it with my patch.

  Patch submitted to linux-block

  https://marc.info/?l=linux-block&m=146976739016592&w=2

  == Comment: #207 - Gabriel Krisman Bertazi <gbert...@br.ibm.com> - 2016-09-05 09:13:51 ==
  Canonical,

  This is fixed by:

  e57690fe009b ("blk-mq: don't overwrite rq->mq_ctx")
  0e87e58bf60e ("blk-mq: improve warning for running a queue on the wrong CPU")
  71f79fb3179e ("blk-mq: Allow timeouts to run while queue is freezing")

  These apply cleanly on top of your kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1620317/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp