Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 11/27/2015 09:42 PM, Tyler Baker wrote:
> On 27 November 2015 at 10:53, Tyler Baker wrote:
[...]
> If I comment the call to kvm_create_vm_debugfs(kvm) the guest boots
> fine. I put some printk's in the kvm_create_vm_debugfs() function and
> it's returning -ENOMEM after it evaluates !kvm->debugfs_dentry. I was
> chatting with some folks from the Linaro virtualization team and they
> mentioned that ARM is a bit special as the same PID creates two VMs in
> quick succession; the first one is a scratch VM, and the other is the
> 'real' VM. With that bit of info, I suspect we may be trying to create
> the debugfs directory twice, and the second time it's failing because
> it already exists.
>
> Cheers,
>
> Tyler

After a quick look into qemu I guess I've found the problem:
kvm_init creates a VM, does checking and self-initialization, and then
calls kvm_arch_init. The arch initialization indirectly calls
kvm_arm_create_scratch_host_vcpu, and that's where the trouble begins,
as it also creates a VM.

My assumption was that nobody would create multiple VMs under the same
PID. Christian and I are working on a solution on the kernel side.

Cheers
Janosch
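For concreteness, the pattern described above reduces to one process
issuing KVM_CREATE_VM twice on /dev/kvm -- the same thing Christian later
reproduces by patching QEMU to call VM_CREATE twice. A minimal standalone
sketch of that reproducer (illustrative only, not QEMU's actual code):

/*
 * Create two VMs from the same process. With the broken debugfs patch
 * applied, the second KVM_CREATE_VM was the one observed to fail with
 * ENOMEM.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);

	if (kvm < 0) {
		perror("open /dev/kvm");
		return 1;
	}

	int vm1 = ioctl(kvm, KVM_CREATE_VM, 0);	/* succeeds */
	int vm2 = ioctl(kvm, KVM_CREATE_VM, 0);	/* -1, errno == ENOMEM
						   with the broken patch */
	printf("vm1=%d vm2=%d\n", vm1, vm2);
	return 0;
}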
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 30 November 2015 at 00:38, Christian Borntraeger wrote:
> On 11/27/2015 09:42 PM, Tyler Baker wrote:
[...]
>> With that bit of info, I suspect we may be trying to create
>> the debugfs directory twice, and the second time it's failing because
>> it already exists.
>
> Hmmm, with a patched QEMU that calls VM_CREATE twice it errors out on s390
> with -ENOMEM (which it should not), but it errors out gracefully.
>
> Does the attached patch avoid the crash? (guest will not start, but qemu
> should error out gracefully with ENOMEM)

Yeah. I patched my host kernel and now the qemu guest launch errors
gracefully[1].

Cheers,

Tyler

[1] http://hastebin.com/rotiropayo.mel
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
Janosch Frank writes:

> On 11/27/2015 09:42 PM, Tyler Baker wrote:
[...]
> After a quick look into qemu I guess I've found the problem:
> kvm_init creates a VM, does checking and self-initialization, and then
> calls kvm_arch_init. The arch initialization indirectly calls
> kvm_arm_create_scratch_host_vcpu, and that's where the trouble begins,
> as it also creates a VM.
>
> My assumption was that nobody would create multiple VMs under the same
> PID. Christian and I are working on a solution on the kernel side.

Yeah, ARM is a little weird in that respect, as the scratch VM is used
to probe capabilities. There is nothing in the API that says you can't
have multiple VMs per PID, so I guess a better unique identifier is
needed.

--
Alex Bennée
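One way to build such an identifier is to combine the creating PID with
the VM's file descriptor, which is unique within a process even when
several VMs coexist. A sketch only: the fd parameter is an assumption for
illustration (it implies the name is built once the VM fd exists), not
necessarily the scheme that ended up being merged.

/* Sketch: a per-VM debugfs directory name that cannot collide when one
 * process creates several VMs. */
static void kvm_debugfs_name(int fd, char *buf, size_t len)
{
	snprintf(buf, len, "%d-%d", task_pid_nr(current), fd);
}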
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 11/27/2015 09:42 PM, Tyler Baker wrote:
> On 27 November 2015 at 10:53, Tyler Baker wrote:
[...]
> If I comment the call to kvm_create_vm_debugfs(kvm) the guest boots
> fine. I put some printk's in the kvm_create_vm_debugfs() function and
> it's returning -ENOMEM after it evaluates !kvm->debugfs_dentry. I was
> chatting with some folks from the Linaro virtualization team and they
> mentioned that ARM is a bit special as the same PID creates two VMs in
> quick succession; the first one is a scratch VM, and the other is the
> 'real' VM. With that bit of info, I suspect we may be trying to create
> the debugfs directory twice, and the second time it's failing because
> it already exists.

Hmmm, with a patched QEMU that calls VM_CREATE twice it errors out on s390
with -ENOMEM (which it should not), but it errors out gracefully.

Does the attached patch avoid the crash? (guest will not start, but qemu
should error out gracefully with ENOMEM)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f7d6c8f..b26472a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -671,12 +671,16 @@ static struct kvm *kvm_create_vm(unsigned long type)
 
 	r = kvm_create_vm_debugfs(kvm);
 	if (r)
-		goto out_err;
+		goto out_mmu;
 
 	preempt_notifier_inc();
 
 	return kvm;
 
+out_mmu:
+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+	mmu_notifier_unregister(&kvm->mmu_notifier, kvm->mm);
+#endif
 out_err:
 	cleanup_srcu_struct(&kvm->irq_srcu);
 out_err_no_irq_srcu:
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 27 November 2015 at 00:54, Christian Borntraeger wrote:
> On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
[...]
> Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
> __srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
> which must be present and is initialized during boot.
>
> int __srcu_read_lock(struct srcu_struct *sp)
> {
>         int idx;
>
>         idx = READ_ONCE(sp->completed) & 0x1;
>         __this_cpu_inc(sp->per_cpu_ref->c[idx]);
>         smp_mb(); /* B */  /* Avoid leaking the critical section. */
>         __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
>         return idx;
> }
>
> Looking at the code I have no clue why the patch makes a difference.
> Can you try to get an objdump -S for __srcu_read_lock?

Using next-20151126 as the base, here is the objdump[1] I came up with
for __srcu_read_lock.

Cheers,

Tyler

[1] http://hastebin.com/bifiqobola.pl
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 27 November 2015 at 09:08, Tyler Baker wrote:
> On 27 November 2015 at 00:54, Christian Borntraeger wrote:
[...]
>> Looking at the code I have no clue why the patch makes a difference.
>> Can you try to get an objdump -S for __srcu_read_lock?

Some other interesting finding below...

On the host, I do _not_ have any nodes under /sys/kernel/debug/kvm/

Running strace on the qemu command I use to launch the guest yields
the following.

[pid  5963] 1448649724.405537 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6652a000
[pid  5963] 1448649724.405586 read(13, "MemTotal:       16414616 kB\nMemF"..., 1024) = 1024
[pid  5963] 1448649724.405699 close(13) = 0
[pid  5963] 1448649724.405755 munmap(0x7f6652a000, 65536) = 0
[pid  5963] 1448649724.405947 brk(0x2552f000) = 0x2552f000
[pid  5963] 1448649724.406148 openat(AT_FDCWD, "/dev/kvm", O_RDWR|O_CLOEXEC) = 13
[pid  5963] 1448649724.406209 ioctl(13, KVM_CREATE_VM, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  5963] 1448649724.406382 close(13) = 0
[pid  5963] 1448649724.406435 write(2, "Failed to retrieve host CPU feat"..., 38Failed to retrieve host CPU features!
) = 38

Tyler
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 27 November 2015 at 10:53, Tyler Baker wrote:
> On 27 November 2015 at 09:08, Tyler Baker wrote:
[...]
> [pid  5963] 1448649724.406148 openat(AT_FDCWD, "/dev/kvm", O_RDWR|O_CLOEXEC) = 13
> [pid  5963] 1448649724.406209 ioctl(13, KVM_CREATE_VM, 0) = -1 ENOMEM (Cannot allocate memory)

If I comment the call to kvm_create_vm_debugfs(kvm) the guest boots
fine. I put some printk's in the kvm_create_vm_debugfs() function and
it's returning -ENOMEM after it evaluates !kvm->debugfs_dentry. I was
chatting with some folks from the Linaro virtualization team and they
mentioned that ARM is a bit special as the same PID creates two VMs in
quick succession; the first one is a scratch VM, and the other is the
'real' VM. With that bit of info, I suspect we may be trying to create
the debugfs directory twice, and the second time it's failing because
it already exists.

Cheers,

Tyler
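To make the suspected failure mode concrete, here is a condensed sketch,
assuming the patch names each VM's debugfs directory after the creating
PID (the buffer size and naming are illustrative, not the patch's literal
code). debugfs_create_dir() returns NULL when an entry of that name
already exists, and the caller reports that as -ENOMEM:

/*
 * Condensed sketch of a per-VM debugfs setup keyed on the creating PID.
 * Illustrative only; directory naming and buffer size are assumptions.
 */
static int kvm_create_vm_debugfs(struct kvm *kvm)
{
	char dir_name[32];

	snprintf(dir_name, sizeof(dir_name), "%d", task_pid_nr(current));
	kvm->debugfs_dentry = debugfs_create_dir(dir_name, kvm_debugfs_dir);
	if (!kvm->debugfs_dentry)
		/* A second VM from the same PID lands here: the directory
		 * already exists, debugfs_create_dir() fails, and the
		 * error is surfaced as -ENOMEM. */
		return -ENOMEM;
	return 0;
}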
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
> On 11/26/2015 05:17 PM, Tyler Baker wrote:
[...]
> If I read the arm oops message correctly it oopsed inside
> __srcu_read_lock. There is actually nothing in there that can oops,
> except the access to the preempt count. I am just guessing right now,
> but maybe the preempt variable is no longer available (as the process
> is gone). As long as a debugfs file is open, we hold a reference to
> the kvm, which holds a reference to the mm, so the mm might be killed
> after the process. But this is supposed to work, so maybe it's something
> different. An objdump of __srcu_read_lock might help.

Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
__srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
which must be present and is initialized during boot.

int __srcu_read_lock(struct srcu_struct *sp)
{
	int idx;

	idx = READ_ONCE(sp->completed) & 0x1;
	__this_cpu_inc(sp->per_cpu_ref->c[idx]);
	smp_mb(); /* B */  /* Avoid leaking the critical section. */
	__this_cpu_inc(sp->per_cpu_ref->seq[idx]);
	return idx;
}

Looking at the code I have no clue why the patch makes a difference.
Can you try to get an objdump -S for __srcu_read_lock?

> I will drop it from my tree until we understand the problem
>
> Christian
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
On 11/26/2015 05:17 PM, Tyler Baker wrote:
> Hi Christian,
>
> The kernelci.org bot recently has been reporting kvm guest boot
> failures[1] on various arm64 platforms in next-20151126. The bot
> bisected[2] the failures to the commit in -next titled "KVM: Create
> debugfs dir and stat files for each VM". I confirmed that reverting this
> commit on top of next-20151126 resolves the boot issue.
[...]
> I scanned the patch in question but nothing obvious jumped out at me,
> any thoughts?

Not really.
Do you have processes running that read the files in
/sys/kernel/debug/kvm/* ?

If I read the arm oops message correctly it oopsed inside
__srcu_read_lock. There is actually nothing in there that can oops,
except the access to the preempt count. I am just guessing right now,
but maybe the preempt variable is no longer available (as the process
is gone). As long as a debugfs file is open, we hold a reference to
the kvm, which holds a reference to the mm, so the mm might be killed
after the process. But this is supposed to work, so maybe it's something
different. An objdump of __srcu_read_lock might help.

I will drop it from my tree until we understand the problem.

Christian
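The lifetime rule described here -- an open debugfs file pins the VM,
which in turn pins its mm -- would look roughly like the following in the
stat file's fops. kvm_get_kvm()/kvm_put_kvm() are KVM's real refcount
helpers; the open/release wiring shown is an illustration, not the
patch's actual code.

static int vm_stat_open(struct inode *inode, struct file *file)
{
	struct kvm *kvm = inode->i_private;

	/* Hold a reference to the VM for as long as the file is open. */
	kvm_get_kvm(kvm);
	file->private_data = kvm;
	return 0;
}

static int vm_stat_release(struct inode *inode, struct file *file)
{
	/* Drop the reference taken at open time. */
	kvm_put_kvm(file->private_data);
	return 0;
}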
Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM
Hi Christian,

The kernelci.org bot recently has been reporting kvm guest boot
failures[1] on various arm64 platforms in next-20151126. The bot
bisected[2] the failures to the commit in -next titled "KVM: Create
debugfs dir and stat files for each VM". I confirmed that reverting this
commit on top of next-20151126 resolves the boot issue.

In this test case the host and guest are booted with the same kernel.
The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
and launches a guest. The host boots fine, but when the guest is
launched it errors with "Failed to retrieve host CPU features!". I
checked the host logs and found an "Unable to handle kernel paging
request" splat[3] which occurs when the guest is attempting to start.

I scanned the patch in question but nothing obvious jumped out at me,
any thoughts?

Cheers,

Tyler

[1] http://kernelci.org/boot/all/job/next/kernel/next-20151126/
[2] http://hastebin.com/fuhicugate.vhdl
[3] http://hastebin.com/yicefetuho.sm