Public bug reported:

== Comment: #0 - SEETEENA THOUFEEK <[email protected]> - 2024-08-09 03:50:24 ==
+++ This bug was initially created as a clone of Bug #206737 +++

---Problem Description---
L2 guest migration, evelp2g4 [L2]: while running NFS guest migration, the guest
continuously dumps smp_call_function_many_cond+0x500/0x738 (unreliable) and
"watchdog: BUG: soft lockup - CPU#14 stuck for 223s! [systemd-homed]".
 
---uname output---
NA
 
Machine Type = NA 
 
Contact Information = NA

[79205.163691] Hardware name: IBM pSeries (emulated by qemu) POWER10 (raw) 0x800200 0xf000006 of:SLOF,HEAD hv:linux,kvm pSeries
[79205.163834] NIP:  c0000000002bb7a4 LR: c0000000002bb750 CTR: c0000000000d192c
[79205.163929] REGS: c0000003871cf1b0 TRAP: 0900   Tainted: G             L
[79205.165041] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44042222  XER: 20040004
[79205.165266] CFAR: 0000000000000000 IRQMASK: 0
               GPR00: c0000000002bbc58 c0000003871cf450 c0000000020ded00 0000000000000009
               GPR04: 0000000000000009 0000000000000009 0000000000000080 0000000000000200
               GPR08: 00000000000001ff 0000000000000001 c000000740f57ee0 0000000044048222
               GPR12: c0000000000d192c c000000743ddc980 0000000000000000 0000000000000000
               GPR16: 0000000000000000 c00000000d86e200 0000000000000001 0000000000000001
               GPR20: 000000000000000c c000000003d06188 c0000000000ac4d0 c00000000a374e00
               GPR24: c000000003d06840 0000000000000000 c000000741193188 c000000741193188
               GPR28: c000000741193180 c000000003d06840 0000000000000048 0000000000000009
[79205.171660] NIP [c0000000002bb7a4] smp_call_function_many_cond+0x1e0/0x738
[79205.171752] LR [c0000000002bb750] smp_call_function_many_cond+0x18c/0x738
[79205.171835] Call Trace:
[79205.171869] [c0000003871cf450] [c0000000002bbc58] smp_call_function_many_cond+0x694/0x738 (unreliable)
[79205.171986] [c0000003871cf520] [c0000000000ac4d0] radix__tlb_flush+0x4c/0x140
[79205.173636] [c0000003871cf560] [c00000000052e900] tlb_finish_mmu+0x130/0x1f0
[79205.173754] [c0000003871cf590] [c00000000052a280] exit_mmap+0x1cc/0x574
[79205.173848] [c0000003871cf6c0] [c00000000016ec9c] __mmput+0x54/0x1d4
[79205.173939] [c0000003871cf6f0] [c0000000006385c4] begin_new_exec+0x6dc/0xefc
[79205.174037] [c0000003871cf780] [c0000000006edea8] load_elf_binary+0x4c8/0x1a50
[79205.174136] [c0000003871cf880] [c0000000006361c8] bprm_execve+0x2b4/0x7a0
[79205.174219] [c0000003871cf950] [c000000000637988] do_execveat_common+0x1c0/0x2d8
[79205.174316] [c0000003871cf9f0] [c000000000638e38] sys_execve+0x54/0x6c
[79205.174399] [c0000003871cfa20] [c00000000002fec8] system_call_exception+0x168/0x310
[79205.174497] [c0000003871cfe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[79205.176245] --- interrupt: 3000 at 0x7fff95b10b08
[79205.176326] NIP:  00007fff95b10b08 LR: 00007fff95b10b08 CTR: 0000000000000000
[79205.176438] REGS: c0000003871cfe80 TRAP: 3000   Tainted: G             L      (
[79205.176558] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48044424  XER: 00000000
[79205.176686] IRQMASK: 0
               GPR00: 000000000000000b 00007fffe6919aa0 00007fff95c47c00 0000000152598c80
               GPR04: 00007fffe6919bf8 00000001525db6e0 ffffffffffffffff 00007fffe6919a20
               GPR08: 0000000152598c88 0000000000000000 0000000000000000 0000000000000000
               GPR12: 0000000000000000 00007fff969a4220 0000000152585570 0000000000000000
               GPR16: 00007fffe6919c48 0000000000000570 0000000152598c80 0000000000000000
               GPR20: 0000000000000000 0000000000009998 000000015259a450 0000000152586460
               GPR24: 00000001525bca90 00007fffe6919e48 0000000000000000 00000001525db6e0
               GPR28: 0000000117e98448 00000001525d0b00 0000000000000000 0000000000100000
[79205.177505] NIP [00007fff95b10b08] 0x7fff95b10b08
[79205.177578] LR [00007fff95b10b08] 0x7fff95b10b08
[79205.177649] --- interrupt: 3000
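
When many of these dumps pile up in a guest console log, it can help to
reduce each one to its soft-lockup header and the list of functions on the
stack. The following is a minimal sketch, not part of the original report:
the function names call_trace_symbols and soft_lockups are hypothetical, and
the regular expressions assume the ppc64le oops layout shown above (two
16-digit hex addresses in brackets, then symbol+offset/size). Whitespace
between fields is matched loosely, so frames that were wrapped across two
lines are still picked up.

```python
import re

# One stack frame: [stack pointer] [return address] symbol+0xoff/0xsize
# \s+ also matches a newline, so frames wrapped across lines still match.
FRAME_RE = re.compile(
    r"\[[0-9a-f]{16}\]\s+\[[0-9a-f]{16}\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Watchdog header, e.g. "soft lockup - CPU#14 stuck for 223s!"
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s!")


def call_trace_symbols(log_text):
    """Return the function names of all stack frames, in log order."""
    return FRAME_RE.findall(log_text)


def soft_lockups(log_text):
    """Return (cpu, seconds) for every soft-lockup message in the text."""
    return [(int(cpu), int(secs)) for cpu, secs in LOCKUP_RE.findall(log_text)]
```

Run over the trace above, call_trace_symbols should list the frames from
smp_call_function_many_cond down to system_call_vectored_common, which makes
it easy to check whether repeated dumps are all stuck in the same TLB-flush
IPI path.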


Steps to reproduce: Install the build on NFS storage (guest kernel 6.8.10-300).

Start the HTX workload - mdt.less

Start the NFS guest migration between the L2 hosts.

Source L2 host : evelp2
Target L2 host  : rinlp1

migration command : virsh migrate --live  --domain $vm_name
qemu+ssh://$target_host/system --verbose --undefinesource --persistent
--timeout 120
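
Since the failure shows up while migrating continuously, the command above
can be wrapped in a small driver that alternates the direction each round.
This is a rough sketch under stated assumptions, not the script used in the
test: the helper names (build_migrate_cmd, migration_plan, run_plan) are
hypothetical, and the added "-c qemu+ssh://<source>/system" connection
argument is an assumption so that one control machine can drive both
directions (the report ran virsh on the source host itself). The migrate
options are exactly the ones quoted above.

```python
import subprocess


def build_migrate_cmd(vm_name, source_host, target_host, timeout=120):
    """Build the virsh argv for one live migration from source to target."""
    return [
        # Assumption: connect to the source libvirt daemon over SSH.
        "virsh", "-c", f"qemu+ssh://{source_host}/system",
        # Options below are taken from the migration command in the report.
        "migrate", "--live", "--domain", vm_name,
        f"qemu+ssh://{target_host}/system",
        "--verbose", "--undefinesource", "--persistent",
        "--timeout", str(timeout),
    ]


def migration_plan(vm_name, host_a, host_b, rounds):
    """Return one command per round, alternating a->b, b->a, a->b, ..."""
    plan = []
    src, dst = host_a, host_b
    for _ in range(rounds):
        plan.append(build_migrate_cmd(vm_name, src, dst))
        src, dst = dst, src  # the guest now lives on the former target
    return plan


def run_plan(plan, dry_run=True):
    """Print each command, or execute it and stop on the first failure."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

For example, run_plan(migration_plan("evelp2g4", "evelp2", "rinlp1", rounds=4))
prints four commands without touching any host; pass dry_run=False to
actually run them.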

The same NFS storage is shared between the two hosts (here /kvm_pool):
10.33.4.52:/kvm_pool           nfs4      650G  304G  347G  47% /kvm_pool
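
Before starting a migration run it is worth confirming that both hosts
really mount the same export at the same path. A minimal sketch follows,
assuming seven-column "df -hT"-style output as in the line above (source,
type, size, used, available, use%, mount point); the names Mount,
parse_df_line and same_nfs_export are hypothetical, and mount points
containing spaces are not handled.

```python
from typing import NamedTuple


class Mount(NamedTuple):
    source: str      # e.g. "10.33.4.52:/kvm_pool"
    fstype: str      # e.g. "nfs4"
    mountpoint: str  # e.g. "/kvm_pool"


def parse_df_line(line):
    """Parse one data line of seven-column df output into a Mount."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    return Mount(source=fields[0], fstype=fields[1], mountpoint=fields[6])


def same_nfs_export(line_a, line_b):
    """True if both df lines show the same NFS export at the same path."""
    a, b = parse_df_line(line_a), parse_df_line(line_b)
    return a == b and a.fstype.startswith("nfs")
```

Feeding it the df line captured on each host gives a quick yes/no answer;
sizes and usage are deliberately ignored because they can differ between
the two samples.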

Test running : HTX

Guest state : up

-------------------------------------------------------------------------------------

L2 guest Config:

(1) Problem on  Guest:   evelp2g4

(2) PHYP/ Processor Type:  KVM/P10/Everest

(3) Rootvg Filesystem: EXT4


(5) Network Bridge: Macvtap

(6) IO Disk Type/Driver: qemu-img/ qcow2

(7) Install Disk Type: Single

-------------------------------------------------------------------------------------

L1 host details :

MDC mode : off

(1) PHYP/ Processor Type:  KVM/P10/Everest

(2) CEC Name: evelp2

(3) Rootvg Filesystem: xfs


(5) Network Interface: Dedicated Network

(6) IO Type: NVME


(8) Multipath Enabled: no

(9) Install Disk Type: Single

(10) MMU: RPT


The kernel patches are at
https://lore.kernel.org/kvm/[email protected]/T/#t

QEMU patches are at
https://lore.kernel.org/qemu-devel/171760304518.1127.12881297254648658843.stgit@ad1b393f0e09/

Applied to powerpc/topic/ppc-kvm.

[1/8] KVM: PPC: Book3S HV: Fix the set_one_reg for MMCR3
https://git.kernel.org/powerpc/c/f9ca6a10be20479d526f27316cc32cfd1785ed39
[2/8] KVM: PPC: Book3S HV: Fix the get_one_reg of SDAR
https://git.kernel.org/powerpc/c/009f6f42c67e9de737d6d3d199f92b21a8cb9622
[3/8] KVM: PPC: Book3S HV: Add one-reg interface for DEXCR register
https://git.kernel.org/powerpc/c/1a1e6865f516696adcf6e94f286c7a0f84d78df3
[4/8] KVM: PPC: Book3S HV nestedv2: Keep nested guest DEXCR in sync
https://git.kernel.org/powerpc/c/2d6be3ca3276ab30fb14f285d400461a718d45e7
[5/8] KVM: PPC: Book3S HV: Add one-reg interface for HASHKEYR register
https://git.kernel.org/powerpc/c/e9eb790b25577a15d3f450ed585c59048e4e6c44
[6/8] KVM: PPC: Book3S HV nestedv2: Keep nested guest HASHKEYR in sync
https://git.kernel.org/powerpc/c/1e97c1eb785fe2dc863c2bd570030d6fcf4b5e5b
[7/8] KVM: PPC: Book3S HV: Add one-reg interface for HASHPKEYR register
https://git.kernel.org/powerpc/c/9a0d2f4995ddde3022c54e43f9ece4f71f76f6e8
[8/8] KVM: PPC: Book3S HV nestedv2: Keep nested guest HASHPKEYR in sync
https://git.kernel.org/powerpc/c/0b65365f3fa95c2c5e2094739151a05cabb3c48a

** Affects: kernel-package (Ubuntu)
     Importance: Undecided
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
         Status: New


** Tags: architecture-ppc64le bugnameltc-208511 severity-critical 
targetmilestone-inin2404

** Tags added: architecture-ppc64le bugnameltc-208511 severity-critical
targetmilestone-inin2404

** Changed in: ubuntu
     Assignee: (unassigned) => Ubuntu on IBM Power Systems Bug Triage 
(ubuntu-power-triage)

** Package changed: ubuntu => kernel-package (Ubuntu)

https://bugs.launchpad.net/bugs/2076406

Title:
  ISST-LTE:KOP:1060FW:evelp2 :L2 Guest migration: evelp2g4[L2]: while
  running NFS guest migration  continuously  dumping
  smp_call_function_many_cond+0x500/0x738 (unreliable) and watchdog:
  BUG: soft lockup - CPU#14 stuck for 223s! [systemd-homed} (Fedora)
