Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On 2013-05-12 18:52, Kashyap Chamarthy wrote:
>>> [ 217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>>> [ 217.938034] Do you have a strange power saving mode enabled?
>>> [ 222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
>>> [ 222.524073] Do you have a strange power saving mode enabled?
>>> [ 222.524073] Dazed and confused, but trying to continue
>>> [ 243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
>>>
>>> At the moment, L2 guest creation is stuck at the above message.
>>
>> Are those in L2 dmesg or L1?
>
> L2 dmesg.
>
>>> $ cat /etc/grub2.cfg | egrep -i 'hpet|nmi'
>>
>> IIRC the watchdog is enabled by default.
>
> Indeed, you're right. I disabled NMI on L1 and rebooted; the newly
> created L2 guest starts just fine.

NMI watchdogs go via some perf counters these days IIRC. Can anyone
tell me which of those may be used in Kashyap's setup? I'm probably
lacking them for my guests and therefore do not see the errors.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Mon, May 13, 2013 at 08:31:33AM +0200, Jan Kiszka wrote:
> On 2013-05-12 18:52, Kashyap Chamarthy wrote:
> [...]
> > Indeed, you're right. I disabled NMI on L1 and rebooted; the newly
> > created L2 guest starts just fine.
>
> NMI watchdogs go via some perf counters these days IIRC. Can anyone
> tell me which of those may be used in Kashyap's setup? I'm probably
> lacking them for my guests and therefore do not see the errors.

Try running with -cpu host for L1. Your CPU definition probably lacks
the PMU leaf.

--
			Gleb.
RE: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
> -----Original Message-----
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
> Behalf Of Gleb Natapov
> Sent: Monday, May 13, 2013 2:39 PM
> To: Jan Kiszka
> Cc: Kashyap Chamarthy; Abel Gordon; Nakajima, Jun; kvm@vger.kernel.org;
> kvm-ow...@vger.kernel.org
> Subject: Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest
> is rebooted.
>
> [...]
>
> Try running with -cpu host for L1. Your CPU definition probably lacks
> the PMU leaf.

I met the same NMI issue in L2, too.
L1: -cpu host (or -cpu Haswell,+vmx)
L2: -cpu qemu64 (the default)
If I use '-cpu qemu64,+vmx' to create L1, I don't meet the NMI issue
in L2.

Best Regards,
     Yongjie (Jay)
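For readers trying to reproduce this, the CPU-model combinations above translate into qemu command lines roughly like the following; the image names and memory/SMP sizes are placeholders, not taken from the thread:

```
# L0 -> L1: pass the host CPU model through so L1 sees the PMU leaf
# (Gleb's suggestion); alternatively '-cpu Haswell,+vmx'.
qemu-system-x86_64 -enable-kvm -cpu host -m 4096 -smp 2 \
    -drive file=l1.img,if=virtio

# L1 -> L2: qemu64 is the default model; Yongjie reports the NMI issue
# goes away for him when L1 itself was started with '-cpu qemu64,+vmx'.
qemu-system-x86_64 -enable-kvm -cpu qemu64 -m 2048 \
    -drive file=l2.img,if=virtio
```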
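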
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On 2013-05-13 08:45, Ren, Yongjie wrote:
> [...]
> I met the same NMI issue in L2, too.
> L1: -cpu host (or -cpu Haswell,+vmx)
> L2: -cpu qemu64 (the default)
> If I use '-cpu qemu64,+vmx' to create L1, I don't meet the NMI issue
> in L2.

That, and it looks like my guest kernel was lacking
CONFIG_LOCKUP_DETECTOR. Will rebuild and retest later.

Jan
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Mon, May 13, 2013 at 08:57:32AM +0200, Jan Kiszka wrote:
> On 2013-05-13 08:45, Ren, Yongjie wrote:
> > [...]
> > If I use '-cpu qemu64,+vmx' to create L1, I don't meet the NMI issue
> > in L2.
>
> That, and it looks like my guest kernel was lacking
> CONFIG_LOCKUP_DETECTOR. Will rebuild and retest later.

It looks like NMIs injected by L0 into L1 are mistakenly injected into
L2. Can you test this by injecting an NMI into L1 via the qemu monitor?

--
			Gleb.
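Gleb's test can be driven from outside the guest. A sketch, assuming the L1 guest is managed by libvirt under the placeholder domain name "L1" (the name is not from the thread):

```
# Inject an NMI into L1 via libvirt (wraps the QEMU monitor's 'nmi' command):
virsh inject-nmi L1

# Or talk to the human monitor directly:
virsh qemu-monitor-command --hmp L1 nmi
```

If the injected NMI then shows up in L2's dmesg instead of L1's, that would support the mis-injection theory.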
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Fri, May 10, 2013 at 10:40:22AM -0700, Nakajima, Jun wrote:
> On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka jan.kis...@siemens.com wrote:
> > On 2013-05-10 17:39, Kashyap Chamarthy wrote:
> > > On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka jan.kis...@siemens.com wrote:
> > > > On 2013-05-10 17:12, Jan Kiszka wrote:
> > > > > On 2013-05-10 15:00, Kashyap Chamarthy wrote:
> > > > > > Heya, This is on Intel Haswell. First, some version info: L0, L1
> > > > > > -- both of them have same versions of kernel, qemu:
> > > [...]
>
> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
> SeaBIOS, after the line "iPXE (http://ipxe.org)". It happens with or
> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
> realize this problem until I updated the L1 kernel to the latest (e.g.
> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
> that the L1 kernel exposed a bug with the nested virtualization, as we
> saw such cases before.

This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e. Try
it please.

--
			Gleb.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
> > I tried to reproduce such a problem, and I found L2 (Linux) hangs in
> > SeaBIOS, after the line "iPXE (http://ipxe.org)". It happens with or
> > w/o VMCS shadowing (and even without my virtual EPT patches). I
> > didn't realize this problem until I updated the L1 kernel to the
> > latest (e.g. 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch.
> > It's possible that the L1 kernel exposed a bug with the nested
> > virtualization, as we saw such cases before.
>
> This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e. Try
> it please.

I don't see the above SeaBIOS hang; however, I'm able to consistently
reproduce this stack trace when booting the L1 guest:

[2.516894] VFS: Cannot open root device mapper/fedora-root or unknown-block(0,0): error -6
[2.527636] Please append a correct root= boot option; here are the available partitions:
[2.538792] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[2.539716] Pid: 1, comm: swapper/0 Not tainted 3.8.11-200.fc18.x86_64 #1
[2.539716] Call Trace:
[2.539716]  [81649c19] panic+0xc1/0x1d0
[2.539716]  [81d010e0] mount_block_root+0x1fa/0x2ac
[2.539716]  [81d011e9] mount_root+0x57/0x5b
[2.539716]  [81d0132a] prepare_namespace+0x13d/0x176
[2.539716]  [81d00e1c] kernel_init_freeable+0x1cf/0x1da
[2.539716]  [81d00610] ? do_early_param+0x8c/0x8c
[2.539716]  [81637ca0] ? rest_init+0x80/0x80
[2.539716]  [81637cae] kernel_init+0xe/0xf0
[2.539716]  [8165bd6c] ret_from_fork+0x7c/0xb0
[2.539716]  [81637ca0] ? rest_init+0x80/0x80
[2.539716] Uhhuh. NMI received for unknown reason 30 on CPU 1.
[2.539716] Do you have a strange power saving mode enabled?
[2.539716] Dazed and confused, but trying to continue
[2.539716] Uhhuh. NMI received for unknown reason 20 on CPU 1.

However, L1 boots just fine. When I try to boot L2, it throws this
different stack trace:

[176092.303585]  lock(dev->device_lock);
[176092.307947]
[176092.307947] *** DEADLOCK ***
[176092.307947]
[176092.314943] 2 locks held by systemd/1:
[176092.319283]  #0: (misc_mtx){+.+.+.}, at: [814534b8] misc_open+0x28/0x1d0
[176092.328104]  #1: (wdd->lock){+.+...}, at: [81557f22] watchdog_start+0x22/0x80
[176092.337532]
[176092.337532] stack backtrace:
[176092.342661] CPU: 1 PID: 1 Comm: systemd Not tainted 3.10.0-0.rc0.git23.1.fc20.x86_64 #1
[176092.351823] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013
[176092.366101]  8257d070 880241b1b9c0 81719128 880241b1ba00
[176092.374617]  81714d75 880241b1ba50 880241b80960 880241b8
[176092.383130]  0002 0002 880241b80960 880241b1bac0
[176092.391647] Call Trace:
[176092.394514]  [81719128] dump_stack+0x19/0x1b
[176092.400430]  [81714d75] print_circular_bug+0x201/0x210
[176092.408898]  [810db094] __lock_acquire+0x17c4/0x1b30
[176092.415602]  [81720d7c] ? _raw_spin_unlock_irq+0x2c/0x50
[176092.424276]  [810dbbf2] lock_acquire+0xa2/0x1f0
[176092.430489]  [8149028d] ? mei_wd_ops_start+0x2d/0xf0
[176092.438070]  [8171d590] mutex_lock_nested+0x80/0x400
[176092.444772]  [8149028d] ? mei_wd_ops_start+0x2d/0xf0
[176092.451471]  [8149028d] ? mei_wd_ops_start+0x2d/0xf0
[176092.458172]  [81557f22] ? watchdog_start+0x22/0x80
[176092.464678]  [81557f22] ? watchdog_start+0x22/0x80
[176092.471182]  [8149028d] mei_wd_ops_start+0x2d/0xf0
[176092.477687]  [81557f5d] watchdog_start+0x5d/0x80
[176092.483994]  [81558168] watchdog_open+0x88/0xf0
[176092.490214]  [81453547] misc_open+0xb7/0x1d0
[176092.496128]  [811e15d2] chrdev_open+0x92/0x1d0
[176092.502240]  [811da57b] do_dentry_open+0x24b/0x300
[176092.508745]  [812e8e7c] ? security_inode_permission+0x1c/0x30
[176092.516330]  [811e1540] ? cdev_put+0x30/0x30
[176092.522243]  [811da670] finish_open+0x40/0x50
[176092.528256]  [811ec139] do_last+0x4d9/0xe40
[176092.534071]  [811ecb53] path_openat+0xb3/0x530
[176092.540193]  [810acc1f] ? local_clock+0x5f/0x70
[176092.546403]  [8101fcf5] ? native_sched_clock+0x15/0x80
[176092.553301]  [810d5d9d] ? trace_hardirqs_off+0xd/0x10
[176092.560099]  [811ed658] do_filp_open+0x38/0x80
[176092.566211]  [81720c77] ? _raw_spin_unlock+0x27/0x40
[176092.572913]  [811fc39f] ? __alloc_fd+0xaf/0x200
[176092.579123]  [811db9a9] do_sys_open+0xe9/0x1c0
[176092.585235]  [811dba9e] SyS_open+0x1e/0x20
[176092.590953]  [8172a999] system_call_fastpath+0x16/0x1b
Sending SIGTERM to remaining processes...
[176092.622745] systemd-journald[338]: Received SIGTERM
Sending SIGKILL
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Sun, May 12, 2013 at 06:00:38PM +0530, Kashyap Chamarthy wrote:
> > [...]
> > This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e.
> > Try it please.
>
> I don't see the above SeaBIOS hang; however, I'm able to consistently
> reproduce this stack trace when booting the L1 guest:

You mean L2 here? The L2 guest cannot find its root file system -
unlikely to be related to KVM.

> [2.516894] VFS: Cannot open root device mapper/fedora-root or unknown-block(0,0): error -6
> [2.527636] Please append a correct root= boot option; here are the available partitions:
> [2.538792] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
> [...]
>
> However, L1 boots just fine. When I try to boot L2, it throws this
> different stack trace.

Who is "it"? The stack trace below is from L0, judging by the hardware
name. Again, not KVM related.

> [176092.303585]  lock(dev->device_lock);
> [176092.307947]
> [176092.307947] *** DEADLOCK ***
> [176092.351823] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013
> [...]

--
			Gleb.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Sun, May 12, 2013 at 6:08 PM, Gleb Natapov g...@redhat.com wrote:
> On Sun, May 12, 2013 at 06:00:38PM +0530, Kashyap Chamarthy wrote:
> > I don't see the above SeaBIOS hang; however, I'm able to consistently
> > reproduce this stack trace when booting the L1 guest:
>
> You mean L2 here?

Yes. (Sorry about that.)

> The L2 guest cannot find its root file system - unlikely to be related
> to KVM.

Yeah, fair enough.

> > However, L1 boots just fine. When I try to boot L2, it throws this
> > different stack trace.
>
> Who is "it"? The stack trace below is from L0, judging by the hardware
> name. Again, not KVM related.

Again, sorry :(. I was just about to reply that this was the physical
host.

I'm testing by disabling VMCS shadowing per Jan Kiszka's suggestion and
retrying, but I doubt that's the reason my L2 is seg-faulting. If it
still fails, I'll try to create a new L2 to see if I can reproduce it
more consistently.

Thanks for your response.

/kashyap
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
Kashyap Chamarthy kashyap...@gmail.com wrote on 12/05/2013 03:42:33 PM:

> Again, sorry :(. I was just about to reply that this was the physical
> host.
>
> I'm testing by disabling VMCS shadowing per Jan Kiszka's suggestion and
> retrying, but I doubt that's the reason my L2 is seg-faulting. If it
> still fails, I'll try to create a new L2 to see if I can reproduce it
> more consistently.

I doubt shadow-vmcs is related to this issue. Note that shadow-vmcs is
disabled unless you have a processor that supports this feature. Do
you?! Also note you can disable shadow-vmcs using the kvm-intel kernel
module parameter enable_shadow_vmcs.

Anyway, if you conclude this is related to shadow-vmcs, let me know and
I'll try to help.

Regards,
    Abel.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Sun, May 12, 2013 at 6:29 PM, Abel Gordon ab...@il.ibm.com wrote:
> I doubt shadow-vmcs is related to this issue.

Indeed. I just re-tested w/o it, and it has no effect. I'm trying a
guest w/ a newer kernel in L2.

> Note that shadow-vmcs is disabled unless you have a processor that
> supports this feature. Do you?!

Yes, I noted this in my previous email. I'm using Intel Haswell. Here's
the info from the MSR bits on the machine (from Table 35-3, "MSRs in
Processors Based on Intel Core Microarchitecture", Volume 3C of the
SDM):

# Read MSR value
$ rdmsr 0x48B
7cff

# Check shadow VMCS is enabled:
$ rdmsr 0x0485
300481e5

And the kvm_intel module parameters:

# nested
$ cat /sys/module/kvm_intel/parameters/nested
Y

# shadow VMCS
$ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
Y

Just for reference, here's the detailed procedure I noted while testing
it on Haswell --
https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst

> Also note you can disable shadow-vmcs using the kvm-intel kernel
> module parameter enable_shadow_vmcs.

Yes, to test w/o shadow VMCS, I disabled it by adding "options
kvm-intel enable_shadow_vmcs=y" to /etc/modprobe.d/dist.conf and
rebooting the host.

> Anyway, if you conclude this is related to shadow-vmcs, let me know
> and I'll try to help.

So, from the above info, shadow-vmcs is ruled out. I'm trying to
investigate further; I will post details if I have new findings.

Thank you for your help.

/kashyap
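As a side note on reading that 0x48B value: the high dword of IA32_VMX_PROCBASED_CTLS2 reports which secondary processor-based controls may be set to 1, and shadow VMCS is bit 14 of that dword (bit 46 of the MSR). A minimal sketch of the decoding; the full 64-bit sample value below is only the truncated "7cff" from the thread assumed to be the high dword, not an actual reading:

```shell
# Hypothetical 64-bit reading of MSR 0x48B (IA32_VMX_PROCBASED_CTLS2);
# the high dword holds the allowed-1 settings of the secondary controls.
msr=0x7cff00000000

# Shadow VMCS is bit 14 of the high dword, i.e. bit 46 of the MSR.
shadow_vmcs_allowed=$(( (msr >> 46) & 1 ))
echo "shadow VMCS allowed-1 bit: $shadow_vmcs_allowed"
```

With this sample value the bit decodes as set, which would match the enable_shadow_vmcs=Y shown above.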
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
> So, from the above info, shadow-vmcs is ruled out. I'm trying to
> investigate further; I will post details if I have new findings.

Update:

I just tried to create L2 w/ the Fedora 19 TC4 compose of 11-May-2013.
I continuously see the below fragment (F18 or F19, whatever the L2
guest is):

[ 217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 217.938034] Do you have a strange power saving mode enabled?
[ 222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[ 222.524073] Do you have a strange power saving mode enabled?
[ 222.524073] Dazed and confused, but trying to continue
[ 243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0

At the moment, L2 guest creation is stuck at the above message.

However, I have neither HPET nor the NMI watchdog enabled on L0/L1. I
checked it with:

$ cat /etc/grub2.cfg | egrep -i 'hpet|nmi'

I wonder if I'm missing something trivial, or maybe this is some kind
of bug that needs deeper investigation. But Jan Kiszka reported he
isn't seeing any problems (though he isn't using Haswell).

Thanks.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
> [ 217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
> [ 217.938034] Do you have a strange power saving mode enabled?
> [...]
>
> At the moment, L2 guest creation is stuck at the above message.

Well, not entirely stuck - it's moving, but with intermittent spitting
of the above message:

.
.
.
Installing nss-softokn-freebl (10/236)
[ 716.751098] Uhhuh. NMI received for unknown reason 30 on CPU 1.
[ 716.751098] Do you have a strange power saving mode enabled?
Installing glibc-common (11/236)
[ 735.785034] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[ 735.785034] Do you have a strange power saving mode enabled?
[ 735.785034] Dazed and confused, but trying to continue
[ 736.502032] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 736.502032] Do you have a strange power saving mode enabled?
[ 736.502032] Dazed and confused, but trying to continue
[ 737.204936] Uhhuh. NMI received for unknown reason 20 on CPU 1.
[ 737.205051] Do you have a strange power saving mode enabled?
Installing glibc (12/236)
.
.
.

/kashyap
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
Kashyap Chamarthy kashyap...@gmail.com wrote on 12/05/2013 04:06:40 PM:

> > Note that shadow-vmcs is disabled unless you have a processor that
> > supports this feature. Do you?!
>
> Yes, I noted this in my previous email. I'm using Intel Haswell.
> [...]
> # shadow VMCS
> $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
> Y

Yep, shadow-vmcs enabled :)

> Just for reference, here's the detailed procedure I noted while
> testing it on Haswell --
> https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst
>
> > Also note you can disable shadow-vmcs using the kvm-intel kernel
> > module parameter enable_shadow_vmcs.
>
> Yes, to test w/o shadow VMCS, I disabled it by adding "options
> kvm-intel enable_shadow_vmcs=y" to /etc/modprobe.d/dist.conf and
> rebooting the host.

I assume you meant enable_shadow_vmcs=n :)

Small question: did you try to disable apicv/posted interrupts at L0?
(For L1 you can't enable these features because they are not emulated.)
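For reference, the corrected modprobe fragment would look like this (the file name follows Kashyap's mail; kvm-intel boolean parameters also accept 0/1):

```
# /etc/modprobe.d/dist.conf
options kvm-intel enable_shadow_vmcs=n
```

followed by reloading the kvm-intel module or rebooting the host for it to take effect.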
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Sun, May 12, 2013 at 04:48:28PM +0300, Abel Gordon wrote:
> [...]
> Small question: did you try to disable apicv/posted interrupts at L0?
> (For L1 you can't enable these features because they are not
> emulated.)

AFAIK Haswell does not have apicv/posted interrupts. Not the one I have
access to, anyway.

--
			Gleb.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Sun, May 12, 2013 at 07:01:43PM +0530, Kashyap Chamarthy wrote:
> [...]
> [ 243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
>
> At the moment, L2 guest creation is stuck at the above message.

Are those in L2 dmesg or L1?

> However, I have neither HPET nor the NMI watchdog enabled on L0/L1. I
> checked it with:
>
> $ cat /etc/grub2.cfg | egrep -i 'hpet|nmi'

IIRC the watchdog is enabled by default.

--
			Gleb.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
> Yep, shadow-vmcs enabled :)

:) Good to clarify.

> Just for reference, here's the detailed procedure I noted while
> testing it on Haswell --
> https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst
>
> I assume you meant enable_shadow_vmcs=n :)

Yes, oops, typo :)

> Small question: did you try to disable apicv/posted interrupts at L0?

I don't have to explicitly disable it. Like Gleb (correctly) noted in
his response, APIC-V is not present on Haswell machines, so it's
disabled by default:

$ cat /sys/module/kvm_intel/parameters/enable_apicv
N

(Side note: I did post the o/p of the above parameters and more in the
SETUP-nVMX.rst notes I pointed to above. But, I understand, that
document is a bit large :) )

> (For L1 you can't enable these features because they are not
> emulated.)

Yes, Paolo clarified this to me on IRC when I erroneously assumed so.
Thanks, Paolo!

Thanks.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
[ 217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0. [ 217.938034] Do you have a strange power saving mode enabled? [ 222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0. [ 222.524073] Do you have a strange power saving mode enabled? [ 222.524073] Dazed and confused, but trying to continue [ 243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0. At the moment, L2 guest creation is stuck at the above message. Are those in L2 dmesg or L1? L2 dmesg. $ cat /etc/grub2.cfg | egrep -i 'hpet|nmi' IIRC watchdog is enabled by default. Indeed, you're right. I disabled the NMI watchdog on L1 and rebooted; the newly created L2 guest starts just fine. I'm cloning it to run another instance of L2. And, later, I'll try some kernel compiles inside L2 to see if I can consistently get some measurable numbers. Thanks all for your help. /kashyap
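For anyone else hitting this: the workaround above (disabling the NMI watchdog in the L1 guest) can be checked and applied at runtime. A minimal sketch, assuming a standard Linux L1 -- the helper name and the testing-only path argument are mine; the real runtime knob is /proc/sys/kernel/nmi_watchdog:

```shell
# Sketch: report the NMI watchdog state, reading from the given procfs
# path (default: the real /proc/sys/kernel/nmi_watchdog knob).
nmi_watchdog_state() {
    f="${1:-/proc/sys/kernel/nmi_watchdog}"
    if [ ! -r "$f" ]; then
        echo "unavailable"
    elif [ "$(cat "$f")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# To disable at runtime (as root):
#   sysctl kernel.nmi_watchdog=0
# or persistently, add to the kernel command line in grub2.cfg:
#   nmi_watchdog=0
```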
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
Side note: While testing nVMX, I was hitting a libvirt bug, and filed this one -- https://bugzilla.redhat.com/show_bug.cgi?id=961665 -- [virsh] Attempt to force destroy a guest fails due to 'unknown' reason, leaving a defunct qemu process, which I was told is possibly a Kernel/KVM bug. Any further insights here? Also, others testing with Libvirt (versions mentioned in the above bug), are you also seeing this? Thanks. /kashyap
[nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have the same versions of kernel and qemu: = $ rpm -q kernel --changelog | head -2 * Thu May 09 2013 Josh Boyer - 3.10.0-0.rc0.git23.1 - Linux v3.9-11789-ge0fd9af = = $ uname -r ; rpm -q qemu-kvm libvirt-daemon-kvm libguestfs 3.10.0-0.rc0.git23.1.fc20.x86_64 qemu-kvm-1.4.1-1.fc19.x86_64 libvirt-daemon-kvm-1.0.5-2.fc19.x86_64 libguestfs-1.21.35-1.fc19.x86_64 = Additionally, neither nmi_watchdog nor hpet is enabled on the L0 and L1 kernels: = $ egrep -i 'nmi|hpet' /etc/grub2.cfg $ = KVM parameters on L0: = $ cat /sys/module/kvm_intel/parameters/nested Y $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs Y $ cat /sys/module/kvm_intel/parameters/enable_apicv N $ cat /sys/module/kvm_intel/parameters/ept Y = - That's the stack trace I'm seeing when I start the L2 guest: ... [2.162235] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) [2.163080] Pid: 1, comm: swapper/0 Not tainted 3.8.11-200.fc18.x86_64 #1 [2.163080] Call Trace: [2.163080] [81649c19] panic+0xc1/0x1d0 [2.163080] [81d010e0] mount_block_root+0x1fa/0x2ac [2.163080] [81d011e9] mount_root+0x57/0x5b [2.163080] [81d0132a] prepare_namespace+0x13d/0x176 [2.163080] [81d00e1c] kernel_init_freeable+0x1cf/0x1da [2.163080] [81d00610] ? do_early_param+0x8c/0x8c [2.163080] [81637ca0] ? rest_init+0x80/0x80 [2.163080] [81637cae] kernel_init+0xe/0xf0 [2.163080] [8165bd6c] ret_from_fork+0x7c/0xb0 [2.163080] [81637ca0] ? rest_init+0x80/0x80 [2.163080] Uhhuh. NMI received for unknown reason 30 on CPU 0. [2.163080] Do you have a strange power saving mode enabled? [2.163080] Dazed and confused, but trying to continue [2.163080] Uhhuh. NMI received for unknown reason 20 on CPU 0. [2.163080] Do you have a strange power saving mode enabled? [2.163080] Dazed and confused, but trying to continue [2.163080] Uhhuh. NMI received for unknown reason 30 on CPU 0. I'm able to reproduce this consistently. 
L1 QEMU command-line: $ ps -ef | grep -i qemu qemu 4962 1 21 15:41 ?00:00:41 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name regular-guest -S -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu Haswell,+vmx -m 6144 -smp 4,sockets=4,cores=1,threads=1 -uuid 4ed9ac0b-7f72-dfcf-68b3-e6fe2ac588b2 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/regular-guest.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/test/vmimages/regular-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:80:c1:34,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 L2 QEMU command-line: $ qemu 2042 1 0 May09 ?00:05:03 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name nested-guest -S -machine pc-i440fx-1.4,accel=kvm,usb=off -m 2048 -smp 2,sockets=2,cores=1,threads=1 -uuid 02ea8988-1054-b08b-bafe-cfbe9659976c -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/nested-guest.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/test/vmimages/nested-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:65:c4:e6,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device 
isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 I attached the vmxcap script output. Before I debug further, does anyone have hints here? Many thanks in advance. [1] Notes -- https://github.com/kashyapc/nested-virt-notes-intel-f18 /kashyap Basic VMX Information Revision 18 VMCS size 1024 VMCS restricted to 32 bit addresses no Dual-monitor support yes VMCS memory type 6 INS/OUTS instruction information yes IA32_VMX_TRUE_*_CTLS
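For reference, the "Basic VMX Information" block that vmxcap prints is a decode of the IA32_VMX_BASIC MSR (index 0x480). A rough sketch of that decode, per the field layout in the Intel SDM -- the function name is mine, and on real hardware you'd feed it the hex value printed by `rdmsr 0x480` (from msr-tools, with the msr module loaded):

```shell
# Sketch: decode selected fields of IA32_VMX_BASIC (MSR 0x480).
# Argument: the raw 64-bit MSR value as a hex string (no 0x prefix),
# e.g. the output of `rdmsr 0x480`.
decode_vmx_basic() {
    v=$(printf '%d' "0x$1")
    echo "revision: $(( v & 0x7fffffff ))"          # bits 30:0
    echo "vmcs_size: $(( (v >> 32) & 0x1fff ))"     # bits 44:32, in bytes
    echo "addrs_32bit_only: $(( (v >> 48) & 1 ))"   # bit 48
    echo "dual_monitor: $(( (v >> 49) & 1 ))"       # bit 49
    echo "memory_type: $(( (v >> 50) & 0xf ))"      # bits 53:50 (6 = write-back)
    echo "ins_outs_info: $(( (v >> 54) & 1 ))"      # bit 54
    echo "true_ctls: $(( (v >> 55) & 1 ))"          # bit 55 (IA32_VMX_TRUE_*_CTLS)
}
```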
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Fri, May 10, 2013 at 8:03 PM, Kashyap Chamarthy kashyap...@gmail.com wrote: Also, I'm able to reproduce this consistently: When I create an L2 guest: -- [.] [ 463.655031] Dazed and confused, but trying to continue [ 463.975563] Uhhuh. NMI received for unknown reason 20 on CPU 1. [ 463.976040] Do you have a strange power saving mode enabled? [ 463.976040] Dazed and confused, but trying to continue 29 199M 29 58.7M0 0 136k 0 0:25:02 0:07:20 0:17:42 153k [ 465.136405] Uhhuh. NMI received for unknown reason 30 on CPU 1. [ 465.137042] Do you have a strange power saving mode enabled? [ 465.137042] Dazed and confused, but trying to continue [ 466.645818] Uhhuh. NMI received for unknown reason 20 on CPU 1. [ 466.646044] Do you have a strange power saving mode enabled? [ 466.646044] Dazed and confused, but trying to continue [ 466.907999] Uhhuh. NMI received for unknown reason 30 on CPU 1. [ 466.908033] Do you have a strange power saving mode enabled? -- On Fri, May 10, 2013 at 6:30 PM, Kashyap Chamarthy kashyap...@gmail.com wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: = $ rpm -q kernel --changelog | head -2 * Thu May 09 2013 Josh Boyer - 3.10.0-0.rc0.git23.1 - Linux v3.9-11789-ge0fd9af = = $ uname -r ; rpm -q qemu-kvm libvirt-daemon-kvm libguestfs 3.10.0-0.rc0.git23.1.fc20.x86_64 qemu-kvm-1.4.1-1.fc19.x86_64 libvirt-daemon-kvm-1.0.5-2.fc19.x86_64 libguestfs-1.21.35-1.fc19.x86_64 = Additionally, neither nmi_watchdog, nor hpet enabled on L0 L1 kernels: = $ egrep -i 'nmi|hpet' /etc/grub2.cfg $ = KVM parameters on L0 : = $ cat /sys/module/kvm_intel/parameters/nested Y $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs Y $ cat /sys/module/kvm_intel/parameters/enable_apicv N $ cat /sys/module/kvm_intel/parameters/ept Y = - That's the stack trace I'm seeing, when I start the L2 guest: ... 
[2.162235] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) [2.163080] Pid: 1, comm: swapper/0 Not tainted 3.8.11-200.fc18.x86_64 #1 [2.163080] Call Trace: [2.163080] [81649c19] panic+0xc1/0x1d0 [2.163080] [81d010e0] mount_block_root+0x1fa/0x2ac [2.163080] [81d011e9] mount_root+0x57/0x5b [2.163080] [81d0132a] prepare_namespace+0x13d/0x176 [2.163080] [81d00e1c] kernel_init_freeable+0x1cf/0x1da [2.163080] [81d00610] ? do_early_param+0x8c/0x8c [2.163080] [81637ca0] ? rest_init+0x80/0x80 [2.163080] [81637cae] kernel_init+0xe/0xf0 [2.163080] [8165bd6c] ret_from_fork+0x7c/0xb0 [2.163080] [81637ca0] ? rest_init+0x80/0x80 [2.163080] Uhhuh. NMI received for unknown reason 30 on CPU 0. [2.163080] Do you have a strange power saving mode enabled? [2.163080] Dazed and confused, but trying to continue [2.163080] Uhhuh. NMI received for unknown reason 20 on CPU 0. [2.163080] Do you have a strange power saving mode enabled? [2.163080] Dazed and confused, but trying to continue [2.163080] Uhhuh. NMI received for unknown reason 30 on CPU 0. I'm able to reproduce this consistently. 
L1 QEMU command-line: $ ps -ef | grep -i qemu qemu 4962 1 21 15:41 ?00:00:41 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name regular-guest -S -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu Haswell,+vmx -m 6144 -smp 4,sockets=4,cores=1,threads=1 -uuid 4ed9ac0b-7f72-dfcf-68b3-e6fe2ac588b2 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/regular-guest.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/test/vmimages/regular-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:80:c1:34,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 L2 QEMU command-line: $ qemu 2042 1 0 May09 ?00:05:03 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name nested-guest -S -machine pc-i440fx-1.4,accel=kvm,usb=off -m 2048 -smp 2,sockets=2,cores=1,threads=1 -uuid 02ea8988-1054-b08b-bafe-cfbe9659976c -nographic -no-user-config -nodefaults -chardev
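The kvm_intel module parameters quoted in the report lend themselves to a quick pre-flight check before each test run. A sketch -- the function names and the base-directory argument (for off-host testing) are mine; on a real L0 the default /sys/module/kvm_intel/parameters path applies:

```shell
# Sketch: verify the L0 kvm_intel settings quoted above before a test run.
# A base directory can be passed in so the logic is testable off-host.
kvm_intel_param() {
    base="${1:-/sys/module/kvm_intel/parameters}"
    cat "$base/$2" 2>/dev/null || echo "absent"
}

check_nested_setup() {
    base="$1"
    for p in nested ept; do
        [ "$(kvm_intel_param "$base" "$p")" = "Y" ] || {
            echo "want $p=Y, got $(kvm_intel_param "$base" "$p")"
            return 1
        }
    done
    echo "nested+ept enabled"
}
```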
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On 2013-05-10 15:00, Kashyap Chamarthy wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: = $ rpm -q kernel --changelog | head -2 * Thu May 09 2013 Josh Boyer - 3.10.0-0.rc0.git23.1 - Linux v3.9-11789-ge0fd9af Please recheck with kvm.git, next branch. Thanks, Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On 2013-05-10 17:12, Jan Kiszka wrote: On 2013-05-10 15:00, Kashyap Chamarthy wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: = $ rpm -q kernel --changelog | head -2 * Thu May 09 2013 Josh Boyer - 3.10.0-0.rc0.git23.1 - Linux v3.9-11789-ge0fd9af Please recheck with kvm.git, next branch. Hmm, looks like your branch already contains the patch I was thinking of (03b28f8). You could try if leaving shadow VMCS off makes a difference, but I bet that is unrelated. You get that backtrace in L1, correct? I'll have to see if I can reproduce it. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:12, Jan Kiszka wrote: On 2013-05-10 15:00, Kashyap Chamarthy wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: = $ rpm -q kernel --changelog | head -2 * Thu May 09 2013 Josh Boyer - 3.10.0-0.rc0.git23.1 - Linux v3.9-11789-ge0fd9af Please recheck with kvm.git, next branch. Hmm, looks like your branch already contains the patch I was thinking of (03b28f8). Yes. You could try if leaving shadow VMCS off makes a difference, but I bet that is unrelated. Right. I could try. But, like you said, does it *really* make a difference? You get that backtrace in L1, correct? Yes. If you have any further tracing pointers, I could do some debugging. I'll have to see if I can reproduce it. Thanks. If you're looking for a clear reproducer, this is how I conducted my tests, and here's where I'm capturing all of the related work: [1] Setup -- https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst [2] Simple scripts used to create L1 and L2 -- https://github.com/kashyapc/nvmx-haswell/tree/master/tests/scripts [3] Libvirt XMLs I used (for reference) -- https://github.com/kashyapc/nvmx-haswell/tree/master/tests/libvirt-xmls-for-l1-l2 Thanks in advance. /kashyap
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
[3] Libvirt XMLs I used (for reference) -- https://github.com/kashyapc/nvmx-haswell/tree/master/tests/libvirt-xmls-for-l1-l2 Oops, forgot to add, here we go -- https://github.com/kashyapc/nvmx-haswell/tree/master/tests/libvirt-xmls-for-l1-l2 /kashyap
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On 2013-05-10 17:39, Kashyap Chamarthy wrote: On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:12, Jan Kiszka wrote: On 2013-05-10 15:00, Kashyap Chamarthy wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: = $ rpm -q kernel --changelog | head -2 * Thu May 09 2013 Josh Boyer - 3.10.0-0.rc0.git23.1 - Linux v3.9-11789-ge0fd9af Please recheck with kvm.git, next branch. Hmm, looks like your branch already contains the patch I was thinking of (03b28f8). Yes. You could try if leaving shadow VMCS off makes a difference, but I bet that is unrelated. Right. I could try. But, like you said, does it *really* make a difference. We know after you tried. I don't have access to a Haswell box, so we better exclude this beforehand. You get that backtrace in L1, correct? Yes. If you have any further tracing pointers, I could do some debugging. Thanks, I may come back to you if reproduction fails here. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:39, Kashyap Chamarthy wrote: On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:12, Jan Kiszka wrote: On 2013-05-10 15:00, Kashyap Chamarthy wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: I tried to reproduce such a problem, and I found L2 (Linux) hangs in SeaBIOS, after the "iPXE (http://ipxe.org)" line. It happens with or w/o VMCS shadowing (and even without my virtual EPT patches). I didn't realize this problem until I updated the L1 kernel to the latest (e.g. 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible that the L1 kernel exposed a bug with the nested virtualization, as we saw such cases before. -- Jun Intel Open Source Technology Center
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On 2013-05-10 19:40, Nakajima, Jun wrote: On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:39, Kashyap Chamarthy wrote: On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:12, Jan Kiszka wrote: On 2013-05-10 15:00, Kashyap Chamarthy wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: I tried to reproduce such a problem, and I found L2 (Linux) hangs in SeaBIOS, after line iPXE (http://ipxe.org) It happens with or w/o VMCS shadowing (and even without my virtual EPT patches). I didn't realize this problem until I updated the L1 kernel to the latest (e.g. 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible that the L1 kernel exposed a bug with the nested virtualization, as we saw such cases before. Hmm, no such issues here ATM although I'm on 3.9 for L1 as well. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
We know after you tried. I don't have access to a Haswell box, so we better exclude this beforehand. Fair enough. I'll try that too, and let you know. You get that backtrace in L1, correct? Yes. If you have any further tracing pointers, I could do some debugging. Thanks, I may come back to you if reproduction fails here.
Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
On Fri, May 10, 2013 at 11:39 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 19:40, Nakajima, Jun wrote: On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:39, Kashyap Chamarthy wrote: On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2013-05-10 17:12, Jan Kiszka wrote: On 2013-05-10 15:00, Kashyap Chamarthy wrote: Heya, This is on Intel Haswell. First, some version info: L0, L1 -- both of them have same versions of kernel, qemu: I tried to reproduce such a problem, and I found L2 (Linux) hangs in SeaBIOS, after line iPXE (http://ipxe.org) It happens with or w/o VMCS shadowing (and even without my virtual EPT patches). I didn't realize this problem until I updated the L1 kernel to the latest (e.g. 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible that the L1 kernel exposed a bug with the nested virtualization, as we saw such cases before. Hmm, no such issues here ATM although I'm on 3.9 for L1 as well. Interesting. But Jan, you're not using a Haswell machine, right? Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux