Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-30 Thread Janosch Frank

On 11/27/2015 09:42 PM, Tyler Baker wrote:

On 27 November 2015 at 10:53, Tyler Baker  wrote:

On 27 November 2015 at 09:08, Tyler Baker  wrote:

On 27 November 2015 at 00:54, Christian Borntraeger
 wrote:

On 11/26/2015 09:47 PM, Christian Borntraeger wrote:

On 11/26/2015 05:17 PM, Tyler Baker wrote:

Hi Christian,

The kernelci.org bot recently has been reporting kvm guest boot
failures[1] on various arm64 platforms in next-20151126. The bot
bisected[2] the failures to the commit in -next titled "KVM: Create
debugfs dir and stat files for each VM". I confirmed that reverting
this commit on top of next-20151126 resolves the boot issue.

In this test case the host and guest are booted with the same kernel.
The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
and launches a guest. The host is booting fine, but when the guest is
launched it errors with "Failed to retrieve host CPU features!". I
checked the host logs, and found an "Unable to handle kernel paging
request" splat[3] which occurs when the guest is attempting to start.

I scanned the patch in question but nothing obvious jumped out at me,
any thoughts?

Not really.
Do you have processes running that read the files in
/sys/kernel/debug/kvm/*?

If I read the arm oops message correctly it oopsed inside
__srcu_read_lock. There is actually nothing in there that can oops,
except the access to the preempt count. I am just guessing right now,
but maybe the preempt variable is no longer available (as the process
is gone). As long as a debugfs file is open, we hold a reference to
the kvm, which holds a reference to the mm, so the mm might be killed
after the process. But this is supposed to work, so maybe it's something
different. An objdump of __srcu_read_lock might help.

Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
__srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
which must be present and is initialized during boot.


int __srcu_read_lock(struct srcu_struct *sp)
{
 int idx;

 idx = READ_ONCE(sp->completed) & 0x1;
 __this_cpu_inc(sp->per_cpu_ref->c[idx]);
 smp_mb(); /* B */  /* Avoid leaking the critical section. */
 __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
 return idx;
}

Looking at the code I have no clue why the patch makes a difference.
Can you try to get an objdump -S for __srcu_read_lock?

Some other interesting finding below...

On the host, I do _not_ have any nodes under /sys/kernel/debug/kvm/

Running strace on the qemu command I use to launch the guest yields
the following.

[pid  5963] 1448649724.405537 mmap(NULL, 65536, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6652a000
[pid  5963] 1448649724.405586 read(13, "MemTotal:   16414616
kB\nMemF"..., 1024) = 1024
[pid  5963] 1448649724.405699 close(13) = 0
[pid  5963] 1448649724.405755 munmap(0x7f6652a000, 65536) = 0
[pid  5963] 1448649724.405947 brk(0x2552f000) = 0x2552f000
[pid  5963] 1448649724.406148 openat(AT_FDCWD, "/dev/kvm",
O_RDWR|O_CLOEXEC) = 13
[pid  5963] 1448649724.406209 ioctl(13, KVM_CREATE_VM, 0) = -1 ENOMEM
(Cannot allocate memory)

If I comment the call to kvm_create_vm_debugfs(kvm) the guest boots
fine. I put some printk's in the kvm_create_vm_debugfs() function and
it's returning -ENOMEM after it evaluates !kvm->debugfs_dentry. I was
chatting with some folks from the Linaro virtualization team and they
mentioned that ARM is a bit special, as the same PID creates two VMs in
quick succession: the first is a scratch VM, and the other is the
'real' VM. With that bit of info, I suspect we may be trying to create
the debugfs directory twice, and the second time it's failing because
it already exists.

Cheers,

Tyler


After a quick look into qemu I guess I've found the problem:
kvm_init creates a VM, does checking and self-initialization, and
then calls kvm_arch_init. The arch initialization indirectly
calls kvm_arm_create_scratch_host_vcpu, and that's where the
trouble begins, as it also creates a VM.

My assumption was that nobody would create multiple VMs under
the same PID. Christian and I are working on a solution on the
kernel side.

Cheers
Janosch

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-30 Thread Tyler Baker
On 30 November 2015 at 00:38, Christian Borntraeger
 wrote:
> On 11/27/2015 09:42 PM, Tyler Baker wrote:
>> On 27 November 2015 at 10:53, Tyler Baker  wrote:
>>> On 27 November 2015 at 09:08, Tyler Baker  wrote:
 On 27 November 2015 at 00:54, Christian Borntraeger
  wrote:
> On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
>> On 11/26/2015 05:17 PM, Tyler Baker wrote:
>>> Hi Christian,
>>>
>>> The kernelci.org bot recently has been reporting kvm guest boot
>>> failures[1] on various arm64 platforms in next-20151126. The bot
>>> bisected[2] the failures to the commit in -next titled "KVM: Create
>>> debugfs dir and stat files for each VM". I confirmed that reverting
>>> this commit on top of next-20151126 resolves the boot issue.
>>>
>>> In this test case the host and guest are booted with the same kernel.
>>> The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
>>> and launches a guest. The host is booting fine, but when the guest is
>>> launched it errors with "Failed to retrieve host CPU features!". I
>>> checked the host logs, and found an "Unable to handle kernel paging
>>> request" splat[3] which occurs when the guest is attempting to start.
>>>
>>> I scanned the patch in question but nothing obvious jumped out at me,
>>> any thoughts?
>>
>> Not really.
>> Do you have processes running that read the files in
>> /sys/kernel/debug/kvm/*?
>>
>> If I read the arm oops message correctly it oopsed inside
>> __srcu_read_lock. There is actually nothing in there that can oops,
>> except the access to the preempt count. I am just guessing right now,
>> but maybe the preempt variable is no longer available (as the process
>> is gone). As long as a debugfs file is open, we hold a reference to
>> the kvm, which holds a reference to the mm, so the mm might be killed
>> after the process. But this is supposed to work, so maybe it's something
>> different. An objdump of __srcu_read_lock might help.
>
> Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
> __srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
> which must be present and is initialized during boot.
>
>
> int __srcu_read_lock(struct srcu_struct *sp)
> {
> int idx;
>
> idx = READ_ONCE(sp->completed) & 0x1;
> __this_cpu_inc(sp->per_cpu_ref->c[idx]);
> smp_mb(); /* B */  /* Avoid leaking the critical section. */
> __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
> return idx;
> }
>
> Looking at the code I have no clue why the patch makes a difference.
> Can you try to get an objdump -S for __srcu_read_lock?
>>>
>>> Some other interesting finding below...
>>>
>>> On the host, I do _not_ have any nodes under /sys/kernel/debug/kvm/
>>>
>>> Running strace on the qemu command I use to launch the guest yields
>>> the following.
>>>
>>> [pid  5963] 1448649724.405537 mmap(NULL, 65536, PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6652a000
>>> [pid  5963] 1448649724.405586 read(13, "MemTotal:   16414616
>>> kB\nMemF"..., 1024) = 1024
>>> [pid  5963] 1448649724.405699 close(13) = 0
>>> [pid  5963] 1448649724.405755 munmap(0x7f6652a000, 65536) = 0
>>> [pid  5963] 1448649724.405947 brk(0x2552f000) = 0x2552f000
>>> [pid  5963] 1448649724.406148 openat(AT_FDCWD, "/dev/kvm",
>>> O_RDWR|O_CLOEXEC) = 13
>>> [pid  5963] 1448649724.406209 ioctl(13, KVM_CREATE_VM, 0) = -1 ENOMEM
>>> (Cannot allocate memory)
>>
>> If I comment the call to kvm_create_vm_debugfs(kvm) the guest boots
>> fine. I put some printk's in the kvm_create_vm_debugfs() function and
>> it's returning -ENOMEM after it evaluates !kvm->debugfs_dentry. I was
>> chatting with some folks from the Linaro virtualization team and they
>> mentioned that ARM is a bit special, as the same PID creates two VMs in
>> quick succession: the first is a scratch VM, and the other is the
>> 'real' VM. With that bit of info, I suspect we may be trying to create
>> the debugfs directory twice, and the second time it's failing because
>> it already exists.
>
> Hmmm, with a patched QEMU that calls VM_CREATE twice it errors out on s390
> with -ENOMEM (which it should not), but it errors out gracefully.
>
> Does the attached patch avoid the crash? (guest will not start, but qemu
> should error out gracefully with ENOMEM)

Yeah. I patched my host kernel and now the qemu guest launch errors
gracefully[1].

Cheers,

Tyler

[1] http://hastebin.com/rotiropayo.mel


Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-30 Thread Alex Bennée

Janosch Frank  writes:

> On 11/27/2015 09:42 PM, Tyler Baker wrote:
>> On 27 November 2015 at 10:53, Tyler Baker  wrote:
>>> On 27 November 2015 at 09:08, Tyler Baker  wrote:
 On 27 November 2015 at 00:54, Christian Borntraeger
  wrote:
> On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
>> On 11/26/2015 05:17 PM, Tyler Baker wrote:
>>> Hi Christian,
>>>
>>> The kernelci.org bot recently has been reporting kvm guest boot
>>> failures[1] on various arm64 platforms in next-20151126. The bot
>>> bisected[2] the failures to the commit in -next titled "KVM: Create
>>> debugfs dir and stat files for each VM". I confirmed that reverting
>>> this commit on top of next-20151126 resolves the boot issue.
>>>

>>
> After a quick look into qemu I guess I've found the problem:
> kvm_init creates a VM, does checking and self-initialization, and
> then calls kvm_arch_init. The arch initialization indirectly
> calls kvm_arm_create_scratch_host_vcpu, and that's where the
> trouble begins, as it also creates a VM.
>
> My assumption was that nobody would create multiple VMs under
> the same PID. Christian and I are working on a solution on the
> kernel side.

Yeah, ARM is a little weird in that respect, as the scratch VM is used to
probe capabilities. There is nothing in the API that says you can't have
multiple VMs per PID, so I guess a better unique identifier is needed.

--
Alex Bennée


Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-30 Thread Christian Borntraeger
On 11/27/2015 09:42 PM, Tyler Baker wrote:
> On 27 November 2015 at 10:53, Tyler Baker  wrote:
>> On 27 November 2015 at 09:08, Tyler Baker  wrote:
>>> On 27 November 2015 at 00:54, Christian Borntraeger
>>>  wrote:
 On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
> On 11/26/2015 05:17 PM, Tyler Baker wrote:
>> Hi Christian,
>>
>> The kernelci.org bot recently has been reporting kvm guest boot
>> failures[1] on various arm64 platforms in next-20151126. The bot
>> bisected[2] the failures to the commit in -next titled "KVM: Create
>> debugfs dir and stat files for each VM". I confirmed that reverting
>> this commit on top of next-20151126 resolves the boot issue.
>>
>> In this test case the host and guest are booted with the same kernel.
>> The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
>> and launches a guest. The host is booting fine, but when the guest is
>> launched it errors with "Failed to retrieve host CPU features!". I
>> checked the host logs, and found an "Unable to handle kernel paging
>> request" splat[3] which occurs when the guest is attempting to start.
>>
>> I scanned the patch in question but nothing obvious jumped out at me,
>> any thoughts?
>
> Not really.
> Do you have processes running that read the files in
> /sys/kernel/debug/kvm/*?
>
> If I read the arm oops message correctly it oopsed inside
> __srcu_read_lock. There is actually nothing in there that can oops,
> except the access to the preempt count. I am just guessing right now,
> but maybe the preempt variable is no longer available (as the process
> is gone). As long as a debugfs file is open, we hold a reference to
> the kvm, which holds a reference to the mm, so the mm might be killed
> after the process. But this is supposed to work, so maybe it's something
> different. An objdump of __srcu_read_lock might help.

 Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
 __srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
 which must be present and is initialized during boot.


 int __srcu_read_lock(struct srcu_struct *sp)
 {
 int idx;

 idx = READ_ONCE(sp->completed) & 0x1;
 __this_cpu_inc(sp->per_cpu_ref->c[idx]);
 smp_mb(); /* B */  /* Avoid leaking the critical section. */
 __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
 return idx;
 }

 Looking at the code I have no clue why the patch makes a difference.
 Can you try to get an objdump -S for __srcu_read_lock?
>>
>> Some other interesting finding below...
>>
>> On the host, I do _not_ have any nodes under /sys/kernel/debug/kvm/
>>
>> Running strace on the qemu command I use to launch the guest yields
>> the following.
>>
>> [pid  5963] 1448649724.405537 mmap(NULL, 65536, PROT_READ|PROT_WRITE,
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6652a000
>> [pid  5963] 1448649724.405586 read(13, "MemTotal:   16414616
>> kB\nMemF"..., 1024) = 1024
>> [pid  5963] 1448649724.405699 close(13) = 0
>> [pid  5963] 1448649724.405755 munmap(0x7f6652a000, 65536) = 0
>> [pid  5963] 1448649724.405947 brk(0x2552f000) = 0x2552f000
>> [pid  5963] 1448649724.406148 openat(AT_FDCWD, "/dev/kvm",
>> O_RDWR|O_CLOEXEC) = 13
>> [pid  5963] 1448649724.406209 ioctl(13, KVM_CREATE_VM, 0) = -1 ENOMEM
>> (Cannot allocate memory)
> 
> If I comment the call to kvm_create_vm_debugfs(kvm) the guest boots
> fine. I put some printk's in the kvm_create_vm_debugfs() function and
> it's returning -ENOMEM after it evaluates !kvm->debugfs_dentry. I was
> chatting with some folks from the Linaro virtualization team and they
> mentioned that ARM is a bit special, as the same PID creates two VMs in
> quick succession: the first is a scratch VM, and the other is the
> 'real' VM. With that bit of info, I suspect we may be trying to create
> the debugfs directory twice, and the second time it's failing because
> it already exists.

Hmmm, with a patched QEMU that calls VM_CREATE twice it errors out on s390
with -ENOMEM (which it should not), but it errors out gracefully.

Does the attached patch avoid the crash? (guest will not start, but qemu
should error out gracefully with ENOMEM)



diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f7d6c8f..b26472a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -671,12 +671,16 @@ static struct kvm *kvm_create_vm(unsigned long type)
 
 	r = kvm_create_vm_debugfs(kvm);
 	if (r)
-		goto out_err;
+		goto out_mmu;
 
 	preempt_notifier_inc();
 
 	return kvm;
 
+out_mmu:
+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+	mmu_notifier_unregister(&kvm->mmu_notifier, kvm->mm);
+#endif
 out_err:
 	cleanup_srcu_struct(&kvm->irq_srcu);
 out_err_no_irq_srcu:


Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-27 Thread Tyler Baker
On 27 November 2015 at 00:54, Christian Borntraeger
 wrote:
> On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
>> On 11/26/2015 05:17 PM, Tyler Baker wrote:
>>> Hi Christian,
>>>
>>> The kernelci.org bot recently has been reporting kvm guest boot
>>> failures[1] on various arm64 platforms in next-20151126. The bot
>>> bisected[2] the failures to the commit in -next titled "KVM: Create
>>> debugfs dir and stat files for each VM". I confirmed that reverting
>>> this commit on top of next-20151126 resolves the boot issue.
>>>
>>> In this test case the host and guest are booted with the same kernel.
>>> The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
>>> and launches a guest. The host is booting fine, but when the guest is
>>> launched it errors with "Failed to retrieve host CPU features!". I
>>> checked the host logs, and found an "Unable to handle kernel paging
>>> request" splat[3] which occurs when the guest is attempting to start.
>>>
>>> I scanned the patch in question but nothing obvious jumped out at me,
>>> any thoughts?
>>
>> Not really.
>> Do you have processes running that read the files in
>> /sys/kernel/debug/kvm/*?
>>
>> If I read the arm oops message correctly it oopsed inside
>> __srcu_read_lock. There is actually nothing in there that can oops,
>> except the access to the preempt count. I am just guessing right now,
>> but maybe the preempt variable is no longer available (as the process
>> is gone). As long as a debugfs file is open, we hold a reference to
>> the kvm, which holds a reference to the mm, so the mm might be killed
>> after the process. But this is supposed to work, so maybe it's something
>> different. An objdump of __srcu_read_lock might help.
>
> Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
> __srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
> which must be present and is initialized during boot.
>
>
> int __srcu_read_lock(struct srcu_struct *sp)
> {
> int idx;
>
> idx = READ_ONCE(sp->completed) & 0x1;
> __this_cpu_inc(sp->per_cpu_ref->c[idx]);
> smp_mb(); /* B */  /* Avoid leaking the critical section. */
> __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
> return idx;
> }
>
> Looking at the code I have no clue why the patch makes a difference.
> Can you try to get an objdump -S for __srcu_read_lock?

Using next-20151126 as the base, here is the objdump[1] I came up with
for __srcu_read_lock.

Cheers,

Tyler

[1] http://hastebin.com/bifiqobola.pl


Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-27 Thread Tyler Baker
On 27 November 2015 at 09:08, Tyler Baker  wrote:
> On 27 November 2015 at 00:54, Christian Borntraeger
>  wrote:
>> On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
>>> On 11/26/2015 05:17 PM, Tyler Baker wrote:
 Hi Christian,

 The kernelci.org bot recently has been reporting kvm guest boot
 failures[1] on various arm64 platforms in next-20151126. The bot
 bisected[2] the failures to the commit in -next titled "KVM: Create
 debugfs dir and stat files for each VM". I confirmed that reverting
 this commit on top of next-20151126 resolves the boot issue.

 In this test case the host and guest are booted with the same kernel.
 The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
 and launches a guest. The host is booting fine, but when the guest is
 launched it errors with "Failed to retrieve host CPU features!". I
 checked the host logs, and found an "Unable to handle kernel paging
 request" splat[3] which occurs when the guest is attempting to start.

 I scanned the patch in question but nothing obvious jumped out at me,
 any thoughts?
>>>
>>> Not really.
>>> Do you have processes running that read the files in
>>> /sys/kernel/debug/kvm/*?
>>>
>>> If I read the arm oops message correctly it oopsed inside
>>> __srcu_read_lock. There is actually nothing in there that can oops,
>>> except the access to the preempt count. I am just guessing right now,
>>> but maybe the preempt variable is no longer available (as the process
>>> is gone). As long as a debugfs file is open, we hold a reference to
>>> the kvm, which holds a reference to the mm, so the mm might be killed
>>> after the process. But this is supposed to work, so maybe it's something
>>> different. An objdump of __srcu_read_lock might help.
>>
>> Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
>> __srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
>> which must be present and is initialized during boot.
>>
>>
>> int __srcu_read_lock(struct srcu_struct *sp)
>> {
>> int idx;
>>
>> idx = READ_ONCE(sp->completed) & 0x1;
>> __this_cpu_inc(sp->per_cpu_ref->c[idx]);
>> smp_mb(); /* B */  /* Avoid leaking the critical section. */
>> __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
>> return idx;
>> }
>>
>> Looking at the code I have no clue why the patch makes a difference.
>> Can you try to get an objdump -S for __srcu_read_lock?

Some other interesting finding below...

On the host, I do _not_ have any nodes under /sys/kernel/debug/kvm/

Running strace on the qemu command I use to launch the guest yields
the following.

[pid  5963] 1448649724.405537 mmap(NULL, 65536, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6652a000
[pid  5963] 1448649724.405586 read(13, "MemTotal:   16414616
kB\nMemF"..., 1024) = 1024
[pid  5963] 1448649724.405699 close(13) = 0
[pid  5963] 1448649724.405755 munmap(0x7f6652a000, 65536) = 0
[pid  5963] 1448649724.405947 brk(0x2552f000) = 0x2552f000
[pid  5963] 1448649724.406148 openat(AT_FDCWD, "/dev/kvm",
O_RDWR|O_CLOEXEC) = 13
[pid  5963] 1448649724.406209 ioctl(13, KVM_CREATE_VM, 0) = -1 ENOMEM
(Cannot allocate memory)
[pid  5963] 1448649724.406382 close(13) = 0
[pid  5963] 1448649724.406435 write(2, "Failed to retrieve host CPU
feat"..., 38Failed to retrieve host CPU features!
) = 38

Tyler


Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-27 Thread Tyler Baker
On 27 November 2015 at 10:53, Tyler Baker  wrote:
> On 27 November 2015 at 09:08, Tyler Baker  wrote:
>> On 27 November 2015 at 00:54, Christian Borntraeger
>>  wrote:
>>> On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
 On 11/26/2015 05:17 PM, Tyler Baker wrote:
> Hi Christian,
>
> The kernelci.org bot recently has been reporting kvm guest boot
> failures[1] on various arm64 platforms in next-20151126. The bot
> bisected[2] the failures to the commit in -next titled "KVM: Create
> debugfs dir and stat files for each VM". I confirmed that reverting
> this commit on top of next-20151126 resolves the boot issue.
>
> In this test case the host and guest are booted with the same kernel.
> The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
> and launches a guest. The host is booting fine, but when the guest is
> launched it errors with "Failed to retrieve host CPU features!". I
> checked the host logs, and found an "Unable to handle kernel paging
> request" splat[3] which occurs when the guest is attempting to start.
>
> I scanned the patch in question but nothing obvious jumped out at me,
> any thoughts?

 Not really.
 Do you have processes running that read the files in
 /sys/kernel/debug/kvm/*?

 If I read the arm oops message correctly it oopsed inside
 __srcu_read_lock. There is actually nothing in there that can oops,
 except the access to the preempt count. I am just guessing right now,
 but maybe the preempt variable is no longer available (as the process
 is gone). As long as a debugfs file is open, we hold a reference to
 the kvm, which holds a reference to the mm, so the mm might be killed
 after the process. But this is supposed to work, so maybe it's something
 different. An objdump of __srcu_read_lock might help.
>>>
>>> Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
>>> __srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
>>> which must be present and is initialized during boot.
>>>
>>>
>>> int __srcu_read_lock(struct srcu_struct *sp)
>>> {
>>> int idx;
>>>
>>> idx = READ_ONCE(sp->completed) & 0x1;
>>> __this_cpu_inc(sp->per_cpu_ref->c[idx]);
>>> smp_mb(); /* B */  /* Avoid leaking the critical section. */
>>> __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
>>> return idx;
>>> }
>>>
>>> Looking at the code I have no clue why the patch makes a difference.
>>> Can you try to get an objdump -S for __srcu_read_lock?
>
> Some other interesting finding below...
>
> On the host, I do _not_ have any nodes under /sys/kernel/debug/kvm/
>
> Running strace on the qemu command I use to launch the guest yields
> the following.
>
> [pid  5963] 1448649724.405537 mmap(NULL, 65536, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6652a000
> [pid  5963] 1448649724.405586 read(13, "MemTotal:   16414616
> kB\nMemF"..., 1024) = 1024
> [pid  5963] 1448649724.405699 close(13) = 0
> [pid  5963] 1448649724.405755 munmap(0x7f6652a000, 65536) = 0
> [pid  5963] 1448649724.405947 brk(0x2552f000) = 0x2552f000
> [pid  5963] 1448649724.406148 openat(AT_FDCWD, "/dev/kvm",
> O_RDWR|O_CLOEXEC) = 13
> [pid  5963] 1448649724.406209 ioctl(13, KVM_CREATE_VM, 0) = -1 ENOMEM
> (Cannot allocate memory)

If I comment the call to kvm_create_vm_debugfs(kvm) the guest boots
fine. I put some printk's in the kvm_create_vm_debugfs() function and
it's returning -ENOMEM after it evaluates !kvm->debugfs_dentry. I was
chatting with some folks from the Linaro virtualization team and they
mentioned that ARM is a bit special, as the same PID creates two VMs in
quick succession: the first is a scratch VM, and the other is the
'real' VM. With that bit of info, I suspect we may be trying to create
the debugfs directory twice, and the second time it's failing because
it already exists.

Cheers,

Tyler


Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-27 Thread Christian Borntraeger
On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
> On 11/26/2015 05:17 PM, Tyler Baker wrote:
>> Hi Christian,
>>
>> The kernelci.org bot recently has been reporting kvm guest boot
>> failures[1] on various arm64 platforms in next-20151126. The bot
>> bisected[2] the failures to the commit in -next titled "KVM: Create
>> debugfs dir and stat files for each VM". I confirmed that reverting
>> this commit on top of next-20151126 resolves the boot issue.
>>
>> In this test case the host and guest are booted with the same kernel.
>> The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
>> and launches a guest. The host is booting fine, but when the guest is
>> launched it errors with "Failed to retrieve host CPU features!". I
>> checked the host logs, and found an "Unable to handle kernel paging
>> request" splat[3] which occurs when the guest is attempting to start.
>>
>> I scanned the patch in question but nothing obvious jumped out at me,
>> any thoughts?
> 
> Not really.
> Do you have processes running that read the files in
> /sys/kernel/debug/kvm/*?
> 
> If I read the arm oops message correctly it oopsed inside
> __srcu_read_lock. There is actually nothing in there that can oops,
> except the access to the preempt count. I am just guessing right now,
> but maybe the preempt variable is no longer available (as the process
> is gone). As long as a debugfs file is open, we hold a reference to
> the kvm, which holds a reference to the mm, so the mm might be killed
> after the process. But this is supposed to work, so maybe it's something
> different. An objdump of __srcu_read_lock might help.

Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
__srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
which must be present and is initialized during boot.


int __srcu_read_lock(struct srcu_struct *sp)
{
int idx;

idx = READ_ONCE(sp->completed) & 0x1;
__this_cpu_inc(sp->per_cpu_ref->c[idx]);
smp_mb(); /* B */  /* Avoid leaking the critical section. */
__this_cpu_inc(sp->per_cpu_ref->seq[idx]);
return idx;
}

Looking at the code I have no clue why the patch makes a difference.
Can you try to get an objdump -S for __srcu_read_lock?






> 
> I will drop it from my tree until we understand the problem
> 
> Christian
> 



Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-26 Thread Christian Borntraeger
On 11/26/2015 05:17 PM, Tyler Baker wrote:
> Hi Christian,
> 
> The kernelci.org bot recently has been reporting kvm guest boot
> failures[1] on various arm64 platforms in next-20151126. The bot
> bisected[2] the failures to the commit in -next titled "KVM: Create
> debugfs dir and stat files for each VM". I confirmed that reverting
> this commit on top of next-20151126 resolves the boot issue.
> 
> In this test case the host and guest are booted with the same kernel.
> The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
> and launches a guest. The host is booting fine, but when the guest is
> launched it errors with "Failed to retrieve host CPU features!". I
> checked the host logs, and found an "Unable to handle kernel paging
> request" splat[3] which occurs when the guest is attempting to start.
> 
> I scanned the patch in question but nothing obvious jumped out at me,
> any thoughts?

Not really.
Do you have processes running that read the files in
/sys/kernel/debug/kvm/*?

If I read the arm oops message correctly it oopsed inside
__srcu_read_lock. There is actually nothing in there that can oops,
except the access to the preempt count. I am just guessing right now,
but maybe the preempt variable is no longer available (as the process
is gone). As long as a debugfs file is open, we hold a reference to
the kvm, which holds a reference to the mm, so the mm might be killed
after the process. But this is supposed to work, so maybe it's something
different. An objdump of __srcu_read_lock might help.

I will drop it from my tree until we understand the problem

Christian



Re: [PATCH 2/2] KVM: Create debugfs dir and stat files for each VM

2015-11-26 Thread Tyler Baker
Hi Christian,

The kernelci.org bot recently has been reporting kvm guest boot
failures[1] on various arm64 platforms in next-20151126. The bot
bisected[2] the failures to the commit in -next titled "KVM: Create
debugfs dir and stat files for each VM". I confirmed that reverting
this commit on top of next-20151126 resolves the boot issue.

In this test case the host and guest are booted with the same kernel.
The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
and launches a guest. The host is booting fine, but when the guest is
launched it errors with "Failed to retrieve host CPU features!". I
checked the host logs, and found an "Unable to handle kernel paging
request" splat[3] which occurs when the guest is attempting to start.

I scanned the patch in question but nothing obvious jumped out at me,
any thoughts?

Cheers,

Tyler

[1] http://kernelci.org/boot/all/job/next/kernel/next-20151126/
[2] http://hastebin.com/fuhicugate.vhdl
[3]http://hastebin.com/yicefetuho.sm