[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
--- Comment From iranna.an...@in.ibm.com 2020-02-28 04:11 EDT--- Thanks! Closing the bug from IBM side. ** Tags removed: targetmilestone-inin--- ** Tags added: targetmilestone-inin2004 -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1846237 Title: Kernel Panic while virsh dump of Guest with 300G RAM is triggered. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-power-systems/+bug/1846237/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
Thx Iranna for the reply and positive feedback. With that I'm changing the status to Fix Released. ** Changed in: libvirt (Ubuntu) Status: Invalid => Fix Released ** Changed in: ubuntu-power-systems Status: Invalid => Fix Released
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
Closing out after discussing with Michael Ranweiler. Please re-open if required. Thanks. ** Changed in: ubuntu-power-systems Status: Incomplete => Invalid ** Changed in: linux (Ubuntu) Status: Incomplete => Invalid
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
As we believe that this is working as expected, we're lowering the priority to "low". ** Changed in: ubuntu-power-systems Importance: Medium => Low
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
Following Rafael's comment, we believe this is working as expected. Marking as incomplete while awaiting IBM's response. ** Changed in: linux (Ubuntu) Status: New => Incomplete
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
** Changed in: ubuntu-power-systems Status: Triaged => Incomplete
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
Manoj,

This is a soft lockup in schedule(), inside a KVM guest: a task is waiting for I/O completion. The guest kernel has a completion that will be woken up, putting the task back on a CPU run queue, as soon as the I/O is finished. (The I/O usually carries a handle that, as soon as the I/O is confirmed as sent by the transport layer, calls a callback function to "release" the completion and let the application be re-scheduled.) Depending on how the cache is configured, this logic MIGHT also check that the I/O has been committed on the I/O server, only allowing the task to continue after that. That is also considered a soft lockup (the task keeps re-scheduling itself until the I/O is done).

The guest panicked only because it had "panic on hung" configured. It is highly probable that the "issue" here is I/O contention causing the lockup inside the guest, nothing else. There are no I/O timeouts (from a bad transport and/or block device layer, for example) and no hard lockup due to a deadlock.

So, unless something else, undocumented in this bug, is happening, there is not much to be done without more information. To help the kernel team, it would be good for IBM to provide more information on what was being done on the HOST, how the I/O devices are configured in the KVM guest, etc.

That's my 5 cents.
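To get a feel for how long a task can sit blocked waiting for I/O to be committed, one option is to time fsync() from inside the guest while the dump is running. The following is only a rough sketch: the function name time_fsync, the 1 MiB default write size and the use of a temporary file are assumptions made for illustration, not something taken from this bug.

```python
import os
import tempfile
import time


def time_fsync(directory=None, size_bytes=1 << 20):
    """Write size_bytes to a temporary file and return how long fsync() blocks.

    'directory' (hypothetical parameter) selects the filesystem to test;
    None means the default temporary directory.
    """
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        f.write(b"\0" * size_bytes)
        f.flush()  # push Python's userspace buffer into the page cache
        start = time.monotonic()
        os.fsync(f.fileno())  # blocks until the kernel reports the data as committed
        return time.monotonic() - start


if __name__ == "__main__":
    print("fsync blocked for %.3f s" % time_fsync())
```

If the measured time gets anywhere near the value of kernel.hung_task_timeout_secs in the guest, that would be consistent with the I/O contention explanation above.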
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
Comment #4 suggested that the panic only happened because kernel.softlockup_panic was set to 1. Was that verified? -- You received this bug notification because you are a member of Ubuntu Server, which is subscribed to the bug report. https://bugs.launchpad.net/bugs/1846237 Title: Kernel Panic while virsh dump of Guest with 300G RAM is triggered. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-power-systems/+bug/1846237/+subscriptions -- Ubuntu-server-bugs mailing list Ubuntu-server-bugs@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
Adding the kernel track to this bug after checking with Michael (IBM).

Although the bug was found in a flavor kernel of Ubuntu, it sounds (from the notes in the description) like this might be a generic issue with 18.04.3 and newer kernels.

The next step for IBM is to reproduce this issue on an 18.04.3 or newer released kernel, and to compare with the upstream kernel to see whether the issue has already been fixed. You could use the mainline builds to compare: https://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/

** Changed in: libvirt (Ubuntu) Status: Triaged => Invalid
** Changed in: ubuntu-power-systems Assignee: Canonical Server Team (canonical-server) => Canonical Kernel Team (canonical-kernel-team)
** Changed in: linux (Ubuntu) Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
** Also affects: linux Importance: Undecided Status: New ** No longer affects: linux ** Also affects: linux (Ubuntu) Importance: Undecided Status: New
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
** No longer affects: ubuntu-z-systems ** Also affects: ubuntu-power-systems Importance: Undecided Status: New ** Changed in: ubuntu-power-systems Status: New => Triaged ** Changed in: ubuntu-power-systems Importance: Undecided => Medium ** Changed in: ubuntu-power-systems Assignee: (unassigned) => Canonical Server Team (canonical-server)
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
** Changed in: ubuntu-z-systems Assignee: Canonical Foundations Team (canonical-foundations) => Canonical Server Team (canonical-server) ** Changed in: ubuntu-z-systems Status: New => Triaged
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
Based on stack trace:

  [ 1692.658756] Call Trace:
  [ 1692.658762] [c00020739ba9b970] [24008842] 0x24008842 (unreliable)
  [ 1692.658769] [c00020739ba9bb48] [c001c270] __switch_to+0x2a0/0x4d0
  [ 1692.658774] [c00020739ba9bba8] [c0d048a4] __schedule+0x2a4/0xb00
  [ 1692.658777] [c00020739ba9bc78] [c0d05140] schedule+0x40/0xc0
  [ 1692.658781] [c00020739ba9bc98] [c0537bf4] jbd2_log_wait_commit+0xf4/0x1b0
  [ 1692.658784] [c00020739ba9bd18] [c04c5ee4] ext4_sync_file+0x354/0x620
  [ 1692.658788] [c00020739ba9bd78] [c042afb8] vfs_fsync_range+0x78/0x170
  [ 1692.658790] [c00020739ba9bdc8] [c042b138] do_fsync+0x58/0xd0
  [ 1692.658792] [c00020739ba9be08] [c042b528] SyS_fsync+0x28/0x40
  [ 1692.658795] [c00020739ba9be28] [c000b284] system_call+0x58/0x6c
  [ 1692.658839] Kernel panic - not syncing: hung_task: blocked tasks
  [ 1692.659238] CPU: 48 PID: 785 Comm: khungtaskd Not tainted 4.15.0-1017.19-bz175922-ibm-gt #bz175922
  [ 1692.659835] Call Trace:
  [ 1692.660025] [c08fd0eefbf8] [c0cea13c] dump_stack+0xb0/0xf4 (unreliable)
  [ 1692.660564] [c08fd0eefc38] [c0110020] panic+0x148/0x328
  [ 1692.661004] [c08fd0eefcd8] [c0233a08] watchdog+0x2c8/0x420
  [ 1692.661429] [c08fd0eefdb8] [c0140068] kthread+0x1a8/0x1b0
  [ 1692.661881] [c08fd0eefe28] [c000b654] ret_from_kernel_thread+0x5c/0x88
  [ 1692.662439] Sending IPI to other CPUs
  [ 1693.971250] IPI complete

The IPI being sent to all other CPUs suggests that they were preempted by an NMI in order to stop execution and, likely, call panic() for a dump.

If that is true, that can be configured by sysctl variables:

  kernel.hardlockup_panic = 0          -> THIS, for HARD lockups
  kernel.hung_task_panic = 0           -> THIS, for SCHEDULING dead locks
  kernel.panic = 0
  kernel.panic_on_io_nmi = 0
  kernel.panic_on_oops = 1
  kernel.panic_on_rcu_stall = 0
  kernel.panic_on_unrecovered_nmi = 0
  kernel.panic_on_warn = 0
  kernel.panic_print = 0
  kernel.softlockup_panic = 0          -> THIS, for SOFT lockups
  kernel.unknown_nmi_panic = 0
  vm.panic_on_oom = 0                  -> THIS, for OOM issues

With these set to 0, the panic would not happen for live virsh dumps (the live dump is likely causing delays in the VM and causing the pagecache to be fully dirtied, so the I/Os can't be committed as fast as the pages are being dirtied).

Checking the sosreport you sent:

  $ cat sos_commands/kernel/sysctl_-a | grep -i panic
  kernel.hardlockup_panic = 0
  kernel.hung_task_panic = 1
  kernel.panic = 1
  kernel.panic_on_oops = 1
  kernel.panic_on_rcu_stall = 0
  kernel.panic_on_warn = 0
  kernel.softlockup_panic = 1
  vm.panic_on_oom = 0

You have kernel.softlockup_panic = 1 and kernel.hung_task_panic = 1 (the latter matches the "hung_task: blocked tasks" message in the trace above). This is what causes the panic whenever the guest suffers too much "steal time" to keep up with its needs (causing the lockups to happen).

Am I missing something?

** Changed in: libvirt (Ubuntu) Status: New => Triaged
** Changed in: libvirt (Ubuntu) Importance: Undecided => Low
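The check done by hand on the sosreport above can be sketched in a few lines of Python. This is a minimal illustration, not a tool used in this bug: the names PANIC_TRIGGERS, parse_sysctl and enabled_panic_triggers are made up here, and the sample text is the grep output quoted in the comment. Only the four knobs marked "THIS" in the comment are treated as panic triggers.

```python
# Knobs that the comment marks as turning a lockup, hang or OOM into a panic.
PANIC_TRIGGERS = (
    "kernel.hardlockup_panic",
    "kernel.hung_task_panic",
    "kernel.softlockup_panic",
    "vm.panic_on_oom",
)

# Output of `grep -i panic` on the sosreport's sysctl dump, as quoted in the bug.
SOSREPORT_SAMPLE = """\
kernel.hardlockup_panic = 0
kernel.hung_task_panic = 1
kernel.panic = 1
kernel.panic_on_oops = 1
kernel.panic_on_rcu_stall = 0
kernel.panic_on_warn = 0
kernel.softlockup_panic = 1
vm.panic_on_oom = 0
"""


def parse_sysctl(text):
    """Parse 'key = value' lines (as printed by `sysctl -a`) into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' sign
            settings[key.strip()] = value.strip()
    return settings


def enabled_panic_triggers(text):
    """Return the PANIC_TRIGGERS knobs whose value is not "0" (missing counts as 0)."""
    settings = parse_sysctl(text)
    return [name for name in PANIC_TRIGGERS if settings.get(name, "0") != "0"]


if __name__ == "__main__":
    for name in enabled_panic_triggers(SOSREPORT_SAMPLE):
        print(name)
```

On a live guest the same function could be fed the output of `sysctl -a` instead of the sample text.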
[Bug 1846237] Re: Kernel Panic while virsh dump of Guest with 300G RAM is triggered.
** Also affects: ubuntu-z-systems Importance: Undecided Status: New ** Changed in: ubuntu-z-systems Assignee: (unassigned) => Canonical Foundations Team (canonical-foundations) ** Changed in: ubuntu-z-systems Importance: Undecided => Medium