I also found that if I run an artful VM on the same hypervisor, I do not
see this problem with the artful kernel.

So, from what I see, this seems to be a problem in the 4.10 kernel that
is exposed more readily on a KVM host running kernel 4.13.

This is a better log that I was able to capture:

[   32.029274] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [swapper/10:0]
[   32.029402] Modules linked in: vmx_crypto kvm ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ibmvscsi ibmveth crc32c_vpmsum virtio_blk
[   32.029464] CPU: 10 PID: 0 Comm: swapper/10 Not tainted 4.10.0-383-generic #383
[   32.029466] task: c0000007f7663c00 task.stack: c0000007fe158000
[   32.029467] NIP: c0000000000163c4 LR: c0000000000163c4 CTR: 0000000000000006
[   32.029467] REGS: c0000007fe15b590 TRAP: 0901   Not tainted  (4.10.0-383-generic)
[   32.029468] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>
[   32.029472]   CR: 28002824  XER: 00000000
[   32.029472] CFAR: c0000000001894e4 SOFTE: 1 
               GPR00: c00000000018995c c0000007fe15b810 c0000000014ac900 0000000000000900
               GPR04: c00000000010bc30 c0000007ffda2980 8000000000000000 c0000007fe15b8c8
               GPR08: 0000000000000000 0000000000000600 0000000000000100 c0000007ffd8e128
               GPR12: c0000000009b8450 c000000007b85a00
[   32.029483] NIP [c0000000000163c4] arch_local_irq_restore+0x74/0x90
[   32.029484] LR [c0000000000163c4] arch_local_irq_restore+0x74/0x90
[   32.029485] Call Trace:
[   32.029486] [c0000007fe15b810] [00000000fffee2c1] 0xfffee2c1 (unreliable)
[   32.029488] [c0000007fe15b830] [c00000000018995c] expire_timers+0x13c/0x210
[   32.029490] [c0000007fe15b8a0] [c000000000189bd8] run_timer_softirq+0x1a8/0x230
[   32.029492] [c0000007fe15b940] [c000000000bae11c] __do_softirq+0x19c/0x3fc
[   32.029494] [c0000007fe15ba30] [c0000000000f28c8] irq_exit+0xe8/0x120
[   32.029496] [c0000007fe15ba50] [c0000000000250d4] timer_interrupt+0xa4/0xe0
[   32.029498] [c0000007fe15ba80] [c0000000000090d4] decrementer_common+0x114/0x120
[   32.029501] --- interrupt: 901 at plpar_hcall_norets+0x1c/0x28
                   LR = check_and_cede_processor+0x38/0x50
[   32.029502] [c0000007fe15bd70] [c0000000009b80d4] check_and_cede_processor+0x24/0x50 (unreliable)
[   32.029504] [c0000007fe15bdd0] [c0000000009b84a4] shared_cede_loop+0x54/0x150
[   32.029506] [c0000007fe15be00] [c0000000009b53ec] cpuidle_enter_state+0x17c/0x450
[   32.029507] [c0000007fe15be60] [c0000000001519a0] call_cpuidle+0x50/0xa0
[   32.029509] [c0000007fe15be80] [c000000000151ec0] do_idle+0x2d0/0x340
[   32.029510] [c0000007fe15bf00] [c0000000001521c0] cpu_startup_entry+0x40/0x60
[   32.029512] [c0000007fe15bf30] [c000000000047200] start_secondary+0x340/0x390
[   32.029514] [c0000007fe15bf90] [c00000000000aa6c] start_secondary_prolog+0x10/0x14
[   32.029515] Instruction dump:
[   32.029517] 994d01d2 2fa30000 409e0024 e92d0020 61298000 7d210164 38210020 e8010010
[   32.029520] 7c0803a6 4e800020 60420000 4bff42e5 <60000000> 4bffffe4 60420000 e92d0020
[   34.073274] INFO: rcu_sched detected stalls on CPUs/tasks:
[   34.073318]  10-...: (8 GPs behind) idle=ab9/1/0 softirq=703/703 fqs=2514 
[   34.073352]  (detected by 13, t=5252 jiffies, g=-67, c=-68, q=6813)
[   34.073386] Task dump for CPU 10:
[   34.073387] swapper/10      R  running task        0     0      1 0x00000804
[   34.073389] Call Trace:
[   34.073391] [c0000007fe15bb20] [0000000000000001] 0x1 (unreliable)
[   97.093266] INFO: rcu_sched detected stalls on CPUs/tasks:
[   97.093310]  10-...: (8 GPs behind) idle=ab9/1/0 softirq=703/703 fqs=10197 
[   97.093342]  (detected by 13, t=21007 jiffies, g=-67, c=-68, q=169213)
[   97.093378] Task dump for CPU 10:
[   97.093379] swapper/10      R  running task        0     0      1 0x00000804
[   97.093382] Call Trace:
[   97.093384] [c0000007fe15bb20] [0000000000000001] 0x1 (unreliable)
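
In case it helps anyone comparing captures from other guests, here is a
minimal sketch for pulling the soft lockup and rcu_sched stall records out
of a saved console log. The names (summarize, SOFT_LOCKUP, RCU_STALL) are
made up for illustration, and the patterns only assume the message wording
seen in the log above; other kernel versions may word these differently.

```python
import re

# Patterns assume the message wording seen in the 4.10 log above.
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NMI watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"
)
RCU_STALL = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: rcu_sched detected stalls"
)


def summarize(log_text):
    """Return (soft_lockups, rcu_stall_timestamps) found in log_text.

    soft_lockups is a list of (timestamp, cpu, seconds_stuck) tuples.
    """
    # Mail clients wrap long lines, so collapse every run of whitespace
    # (including newlines) to one space before matching.
    flat = re.sub(r"\s+", " ", log_text)
    lockups = [
        (float(m.group("ts")), int(m.group("cpu")), int(m.group("secs")))
        for m in SOFT_LOCKUP.finditer(flat)
    ]
    stalls = [float(m.group("ts")) for m in RCU_STALL.finditer(flat)]
    return lockups, stalls
```

Calling summarize() on the text of a saved log returns the lockup tuples
and the stall timestamps, which makes it easier to see whether the same
CPU is stuck across repeated stall reports.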

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1733864

Title:
  kernel 4.10.0-40 is hanging with a CPU soft lock

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1733864/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
