Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Christoffer Dall
On Mon, Dec 19, 2011 at 1:15 AM, Antonios Motakis
a.mota...@virtualopensystems.com wrote:
 On 12/11/2011 11:25 AM, Christoffer Dall wrote:
 WARNING: This code is in development and guests do not fully boot on SMP
 hosts yet.
 Hello,

 What would still be needed to fully booted SMP? For example, are there
 identified critical sections and structures that need to be worked on,
 or there are parts that still need to be reviewed to find those? Or is
 it only a matter of fixing up any existing locking/syncing introduced in
 this patch?


You should simply start booting a UP guest on an SMP host, see where
it crashes and start tracking it down.

Same procedure for guest SMP.


 I'd like to throw some cycles on this, so I'll start by looking in this
 patch again more carefully (and guest SMP as well).

that sounds good, just go for it.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Marc Zyngier
On 19/12/11 14:57, Christoffer Dall wrote:
 On Mon, Dec 19, 2011 at 1:15 AM, Antonios Motakis
 a.mota...@virtualopensystems.com wrote:
 On 12/11/2011 11:25 AM, Christoffer Dall wrote:
 WARNING: This code is in development and guests do not fully boot on SMP
 hosts yet.
 Hello,

 What would still be needed to fully booted SMP? For example, are there
 identified critical sections and structures that need to be worked on,
 or there are parts that still need to be reviewed to find those? Or is
 it only a matter of fixing up any existing locking/syncing introduced in
 this patch?

 
 You should simply start booting a UP guest on an SMP host, see where
 it crashes and start tracking it down.

For the time being, I've yet to see UP guest crashing on SMP host. On
the model, that is...

 Same procedure for guest SMP.

That's a very different kettle of fish. I see both CPUs starting to run,
and end up with both in WFI after a while, without any interrupt pending...

I'll investigate that as soon as I can.

M.
-- 
Jazz is not dead. It just smells funny...

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Antonios Motakis

On 12/19/2011 04:19 PM, Marc Zyngier wrote:

On 19/12/11 14:57, Christoffer Dall wrote:



You should simply start booting a UP guest on an SMP host, see where
it crashes and start tracking it down.

For the time being, I've yet to see UP guest crashing on SMP host. On
the model, that is...




Last time I tried to run a guest in a 2 core model, it hanged and 
crashed after a while. Anyway, I will investigate. So I gather critical 
sections have been dealt with and it's just a matter of ironing bugs 
right now?

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Marc Zyngier
On 19/12/11 15:30, Antonios Motakis wrote:
 On 12/19/2011 04:19 PM, Marc Zyngier wrote:
 On 19/12/11 14:57, Christoffer Dall wrote:


 You should simply start booting a UP guest on an SMP host, see where
 it crashes and start tracking it down.
 For the time being, I've yet to see UP guest crashing on SMP host. On
 the model, that is...


 
 Last time I tried to run a guest in a 2 core model, it hanged and 
 crashed after a while. Anyway, I will investigate. So I gather critical 
 sections have been dealt with and it's just a matter of ironing bugs 
 right now?

Depends when you tested. V4 was definitely unusable on SMP.

With my patches merged in Christoffer's v5 release, the basics should be
covered (correct SMP setup in the boot-wrapper and KVM, MPIDR
virtualization).

If you find any problem (and let's face it, I'm sure you will... ;-),
I'll be happy to investigate.

Cheers,

M.
-- 
Jazz is not dead. It just smells funny...

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Christoffer Dall
On Mon, Dec 19, 2011 at 10:37 AM, Marc Zyngier marc.zyng...@arm.com wrote:
 On 19/12/11 15:30, Antonios Motakis wrote:
 On 12/19/2011 04:19 PM, Marc Zyngier wrote:
 On 19/12/11 14:57, Christoffer Dall wrote:


 You should simply start booting a UP guest on an SMP host, see where
 it crashes and start tracking it down.
 For the time being, I've yet to see UP guest crashing on SMP host. On
 the model, that is...



 Last time I tried to run a guest in a 2 core model, it hanged and
 crashed after a while. Anyway, I will investigate. So I gather critical
 sections have been dealt with and it's just a matter of ironing bugs
 right now?

 Depends when you tested. V4 was definitely unusable on SMP.

 With my patches merged in Christoffer's v5 release, the basics should be
 covered (correct SMP setup in the boot-wrapper and KVM, MPIDR
 virtualization).

 If you find any problem (and let's face it, I'm sure you will... ;-),
 I'll be happy to investigate.


I had the guest boot process crash on a page fault on the v5 series.
Consistently. I just didn't have time to investigate yet.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Antonios Motakis

On 12/19/2011 04:40 PM, Christoffer Dall wrote:

On Mon, Dec 19, 2011 at 10:37 AM, Marc Zyngiermarc.zyng...@arm.com  wrote:

On 19/12/11 15:30, Antonios Motakis wrote:

On 12/19/2011 04:19 PM, Marc Zyngier wrote:

On 19/12/11 14:57, Christoffer Dall wrote:


You should simply start booting a UP guest on an SMP host, see where
it crashes and start tracking it down.

For the time being, I've yet to see UP guest crashing on SMP host. On
the model, that is...



Last time I tried to run a guest in a 2 core model, it hanged and
crashed after a while. Anyway, I will investigate. So I gather critical
sections have been dealt with and it's just a matter of ironing bugs
right now?

Depends when you tested. V4 was definitely unusable on SMP.

With my patches merged in Christoffer's v5 release, the basics should be
covered (correct SMP setup in the boot-wrapper and KVM, MPIDR
virtualization).

If you find any problem (and let's face it, I'm sure you will... ;-),
I'll be happy to investigate.


I had the guest boot process crash on a page fault on the v5 series.
Consistently. I just didn't have time to investigate yet.


I was also running v5, however I didn't record the exact crash behavior, 
since I assumed I was having the same results as everyone. I will look 
into it now.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Marc Zyngier
On 19/12/11 15:42, Antonios Motakis wrote:
 On 12/19/2011 04:40 PM, Christoffer Dall wrote:
 On Mon, Dec 19, 2011 at 10:37 AM, Marc Zyngiermarc.zyng...@arm.com  wrote:
 On 19/12/11 15:30, Antonios Motakis wrote:
 On 12/19/2011 04:19 PM, Marc Zyngier wrote:
 On 19/12/11 14:57, Christoffer Dall wrote:

 You should simply start booting a UP guest on an SMP host, see where
 it crashes and start tracking it down.
 For the time being, I've yet to see UP guest crashing on SMP host. On
 the model, that is...


 Last time I tried to run a guest in a 2 core model, it hanged and
 crashed after a while. Anyway, I will investigate. So I gather critical
 sections have been dealt with and it's just a matter of ironing bugs
 right now?
 Depends when you tested. V4 was definitely unusable on SMP.

 With my patches merged in Christoffer's v5 release, the basics should be
 covered (correct SMP setup in the boot-wrapper and KVM, MPIDR
 virtualization).

 If you find any problem (and let's face it, I'm sure you will... ;-),
 I'll be happy to investigate.

 I had the guest boot process crash on a page fault on the v5 series.
 Consistently. I just didn't have time to investigate yet.
 
 I was also running v5, however I didn't record the exact crash behavior, 
 since I assumed I was having the same results as everyone. I will look 
 into it now.

If you manage to find a way to reliably reproduce it, I really want help
having it fixed ASAP. Kernel config and co much appreciated.

M.
-- 
Jazz is not dead. It just smells funny...

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Peter Maydell
2011/12/19 Christoffer Dall cd...@cs.columbia.edu:
 ok, just reproduced.

 Using the AEM v7 model, 2 cores, attached guest and host config.

 Host running on the v5 of the KVM/ARM branch, found here:
 https://github.com/virtualopensystems/linux-kvm-arm

 Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf

 Guest file system is a Linaro min-debug cramfs.

You should probably also say which revision of qemu you were
running...

-- PMM
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Christoffer Dall
On Dec 19, 2011, at 12:19 PM, Peter Maydell peter.mayd...@linaro.org wrote:

 2011/12/19 Christoffer Dall cd...@cs.columbia.edu:
 ok, just reproduced.

 Using the AEM v7 model, 2 cores, attached guest and host config.

 Host running on the v5 of the KVM/ARM branch, found here:
 https://github.com/virtualopensystems/linux-kvm-arm

 Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf

 Guest file system is a Linaro min-debug cramfs.

 You should probably also say which revision of qemu you were
 running...

I don't think it would make a difference, but it's the upstream linaro
branch that you maintain.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Peter Maydell
On 19 December 2011 17:24, Christoffer Dall
christofferd...@christofferdall.dk wrote:
 On Dec 19, 2011, at 12:19 PM, Peter Maydell peter.mayd...@linaro.org wrote:

 2011/12/19 Christoffer Dall cd...@cs.columbia.edu:
 ok, just reproduced.

 Using the AEM v7 model, 2 cores, attached guest and host config.

 Host running on the v5 of the KVM/ARM branch, found here:
 https://github.com/virtualopensystems/linux-kvm-arm

 Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf

 Guest file system is a Linaro min-debug cramfs.

 You should probably also say which revision of qemu you were
 running...

 I don't think it would make a difference, but it's the upstream linaro
 branch that you maintain.

Well, in particular, anything predating qemu-linaro commit bd40be95d1
won't do the right thing for interrupt delivery to SMP guests.

-- PMM
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-19 Thread Christoffer Dall
the problem so far is SMP hosts, but I used this revision:
c3ea1e4ac8b2ff0f13d86289562a477f0ae3fd6b

On Mon, Dec 19, 2011 at 12:36 PM, Peter Maydell
peter.mayd...@linaro.org wrote:
 On 19 December 2011 17:24, Christoffer Dall
 christofferd...@christofferdall.dk wrote:
 On Dec 19, 2011, at 12:19 PM, Peter Maydell peter.mayd...@linaro.org wrote:

 2011/12/19 Christoffer Dall cd...@cs.columbia.edu:
 ok, just reproduced.

 Using the AEM v7 model, 2 cores, attached guest and host config.

 Host running on the v5 of the KVM/ARM branch, found here:
 https://github.com/virtualopensystems/linux-kvm-arm

 Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf

 Guest file system is a Linaro min-debug cramfs.

 You should probably also say which revision of qemu you were
 running...

 I don't think it would make a difference, but it's the upstream linaro
 branch that you maintain.

 Well, in particular, anything predating qemu-linaro commit bd40be95d1
 won't do the right thing for interrupt delivery to SMP guests.

 -- PMM
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-18 Thread Antonios Motakis

On 12/11/2011 11:25 AM, Christoffer Dall wrote:

WARNING: This code is in development and guests do not fully boot on SMP
hosts yet.

Hello,

What would still be needed to fully booted SMP? For example, are there 
identified critical sections and structures that need to be worked on, 
or there are parts that still need to be reviewed to find those? Or is 
it only a matter of fixing up any existing locking/syncing introduced in 
this patch?


I'd like to throw some cycles on this, so I'll start by looking in this 
patch again more carefully (and guest SMP as well).


Antonios

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-13 Thread Christoffer Dall
On Tue, Dec 13, 2011 at 4:37 AM, Avi Kivity a...@redhat.com wrote:
 On 12/12/2011 09:36 PM, Christoffer Dall wrote:
 
  something like this:
 
  if (irq_level-level) {
        set_bit(vcpu-arch.irq_lines, bit_nr);
        smp_mb();
        wake_up_interruptible(vcpu-wq);
 
  or, smp_send_reschedule().  See kvm_vcpu_kick().
 
  An optimization: do a cmpxchg() and don't wakeup if the operation raised
  IRQ file FIQ was set (assuming that FIQ has a higher priority than IRQ).
 

 forgive my lack of experience here:

 how would you use cmpxchg exactly?

 cmpxchg is used as a fallback for general test-and-change which doesn't
 fit any other atomic operation.  In this case:

  do {
    virt_intr = ACCESS_ONCE(vcpu-virt_intr)
    new_virt_intr = virt_intr | mask

    if (new_virt_intr == virt_intr)
        return;  /* no change */

  } while (!cmpxchg(vcpu-virt_intr, virt_intr, new_virt_intr)

  /* at this point, we know the old and new values, and we know the
 update was atomic */


heh, nice. thanks.

  if (new_virt_intr == IRQ | FIQ  virt_intr == FIQ) {
      /* IRQ raised, FIQ already set */
      return;
  }


hmm, so what you want to avoid here is sending an IPI to the other CPU
in case we already have FIQ raised? But I think we have to do this
anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but
the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the
IRQ is unmasked we want to change the hypervisor virtual IRQ register
right away as to signal an IRQ immediately and if the guest masks IRQ
we still want to change the hypervisor virtual register so that the
moment the guest unmasks the IRQ, an exception is raised. The only way
to set the hypervisor register for another CPU would be to have it
take a world-switch round.

So, I think if we simply change either of these lines, we need to
signal the other CPU.

Marc, can you confirm?


 yes, FIQ takes priority over IRQ, but we need to raise the IRQ flag anyway.

 I can see that we don't have to call the wake_up function unless the
 old value was 0 (no interrupts pending == raising the first one). Is
 this what you mean? If so, I can't seem to wrap my head around how to
 accomplish this elegantly with cmpxchg...?

 Then, another issue. I am going to need to two fields and then a
 memcpy, since I don't want to be reading an atomic value from my
 world-switch code. (I would need this copy even with a spinlock).
 Agree?

 Don't follow.  Why a memcpy?  why don't you want to read an atomic value
 during world switch?


if it was declared as an atomic_t, since that's theoretically opaque,
could change to 64-bit and so on, I thought it would be ugly to read
that in the world-switch assembler code. But using atomic bitops on a
standard long field, we should be fine.

 But, if I am doing atomic bitops on an unsigned long field, how do I
 read that (or test two bits at once) atomically?

 Reads (and writes) are always atomic.  Its read-modify-write that needs
 special treatment.

embarrassing. I actually thought read/writes of a word could be
partial to the byte between SMP nodes, but it turns out, and as you
say, they cannot. Got it.

Thanks again!
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-13 Thread Avi Kivity
On 12/13/2011 03:36 PM, Christoffer Dall wrote:
   if (new_virt_intr == IRQ | FIQ  virt_intr == FIQ) {
   /* IRQ raised, FIQ already set */
   return;
   }
 

 hmm, so what you want to avoid here is sending an IPI to the other CPU
 in case we already have FIQ raised? But I think we have to do this
 anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but
 the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the
 IRQ is unmasked we want to change the hypervisor virtual IRQ register
 right away as to signal an IRQ immediately and if the guest masks IRQ
 we still want to change the hypervisor virtual register so that the
 moment the guest unmasks the IRQ, an exception is raised. The only way
 to set the hypervisor register for another CPU would be to have it
 take a world-switch round.

Ah, if the register is virtualized, indeed you need to signal immediately.

  But, if I am doing atomic bitops on an unsigned long field, how do I
  read that (or test two bits at once) atomically?
 
  Reads (and writes) are always atomic.  Its read-modify-write that needs
  special treatment.
 
 embarrassing. I actually thought read/writes of a word could be
 partial to the byte between SMP nodes, but it turns out, and as you
 say, they cannot. Got it.

Read Documentation/atomic_ops.txt and Documentation/memory-barriers.txt,
it's a tricky subject.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-13 Thread Marc Zyngier
On 13/12/11 13:36, Christoffer Dall wrote:
 On Tue, Dec 13, 2011 at 4:37 AM, Avi Kivity a...@redhat.com wrote:
  if (new_virt_intr == IRQ | FIQ  virt_intr == FIQ) {
  /* IRQ raised, FIQ already set */
  return;
  }

 
 hmm, so what you want to avoid here is sending an IPI to the other CPU
 in case we already have FIQ raised? But I think we have to do this
 anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but
 the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the
 IRQ is unmasked we want to change the hypervisor virtual IRQ register
 right away as to signal an IRQ immediately and if the guest masks IRQ
 we still want to change the hypervisor virtual register so that the
 moment the guest unmasks the IRQ, an exception is raised. The only way
 to set the hypervisor register for another CPU would be to have it
 take a world-switch round.
 
 So, I think if we simply change either of these lines, we need to
 signal the other CPU.
 
 Marc, can you confirm?

We definitely need to signal the CPU an IRQ is pending. The VGIC may
help us here, as we can access another CPU's VGIC List Registers (via
the processor-specific base address of the Virtual interface control).
That would save sending an IPI to the target CPU.

M.
-- 
Jazz is not dead. It just smells funny...

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-13 Thread Christoffer Dall
On Dec 13, 2011, at 9:17 AM, Avi Kivity a...@redhat.com wrote:

 On 12/13/2011 03:36 PM, Christoffer Dall wrote:
 if (new_virt_intr == IRQ | FIQ  virt_intr == FIQ) {
 /* IRQ raised, FIQ already set */
 return;
 }


 hmm, so what you want to avoid here is sending an IPI to the other CPU
 in case we already have FIQ raised? But I think we have to do this
 anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but
 the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the
 IRQ is unmasked we want to change the hypervisor virtual IRQ register
 right away as to signal an IRQ immediately and if the guest masks IRQ
 we still want to change the hypervisor virtual register so that the
 moment the guest unmasks the IRQ, an exception is raised. The only way
 to set the hypervisor register for another CPU would be to have it
 take a world-switch round.

 Ah, if the register is virtualized, indeed you need to signal immediately.

 But, if I am doing atomic bitops on an unsigned long field, how do I
 read that (or test two bits at once) atomically?

 Reads (and writes) are always atomic.  Its read-modify-write that needs
 special treatment.

 embarrassing. I actually thought read/writes of a word could be
 partial to the byte between SMP nodes, but it turns out, and as you
 say, they cannot. Got it.

 Read Documentation/atomic_ops.txt and Documentation/memory-barriers.txt,
 it's a tricky subject.

Actually I did (yes the memory barriers hurt my brain), but I kind of
missed the most basic level of memory atomicity - there was a nice
section about this in the ARM arm though.

Thanks for helping out on all this!
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-12 Thread Christoffer Dall
On Mon, Dec 12, 2011 at 12:56 PM, Avi Kivity a...@redhat.com wrote:
 On 12/12/2011 07:37 PM, Christoffer Dall wrote:
 
  This looks overly complicated with two levels of locks.  x86 gets by
  with no locks, and a much more complicated interrupt architecture.
 
  My recommendation is:
   wait_for_interrupts is managed solely by the vcpu thread
   KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type,
  then IPI/wakeups the vcpu to make it examine both wait_for_interrupts
  and virt_irq.
 
 
 this sounds pretty good to me.

 something like this:

 if (irq_level-level) {
       set_bit(vcpu-arch.irq_lines, bit_nr);
       smp_mb();
       wake_up_interruptible(vcpu-wq);

 or, smp_send_reschedule().  See kvm_vcpu_kick().

 An optimization: do a cmpxchg() and don't wakeup if the operation raised
 IRQ file FIQ was set (assuming that FIQ has a higher priority than IRQ).

 } else
       clear_bit(vcpu-arch.irq_lines, bit_nr);


 and the vcpu thread would clear the wait_for_interrupts flag if it
 ever sees the mask field be non-zero?

 Yes.  This is what x86 does, except it's a lot more complicated.

  @@ -522,47 +573,42 @@ static int init_hyp_mode(void)
                return -ENOMEM;
 
        /*
  -      * Allocate stack page for Hypervisor-mode
  +      * Allocate stack pages for Hypervisor-mode
         */
  -     kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL);
  -     if (!kvm_arm_hyp_stack_page) {
  -             err = -ENOMEM;
  -             goto out_free_pgd;
  -     }
  +     for_each_possible_cpu(cpu) {
  +             void *stack_page;
 
  -     hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE;
  +             stack_page = (void *)__get_free_page(GFP_KERNEL);
 
  Best to allocate this (and other per-cpu state) on the cpu's node.
 

 why, for performance reasons?

 Yes, I'm assuming that all multi-socket A15s will be numa?


I have no idea. Marc, Peter, Catalin?

 The code get slightly more complicated,
 since we have to pass the return value through the argument so we have
 to pass an opaque pointer to the struct or something like that.

 Don't see why, just use alloc_pages_node().

got it, I thought you wanted to issue the actual allocation from each
cpu as to parallelize the work. Now I understand.

 
  +     for_each_online_cpu(cpu) {
  +             smp_call_function_single(cpu, cpu_init_hyp_mode,
  +                                      (void *)(long)init_phys_addr, 1);
  +     }
 
  Need similar code for cpu hotplug.  See kvm_cpu_hotplug() and
  kvm_arch_hardware_enable() which do all this for you.
 
 so just to be sure, this will only be called for cpus that are
 hotplugged right? we still call the cpu_init_hyp_mode for each cpu
 that's online at this point.

 The infrastructure will call kvm_arch_hardware_enable() for all
 currently online cpus and any future hotplugged cpu.  Just follow
 kvm_init() and fill in the arch callbacks.  You do have a call to
 kvm_init() somewhere, yes?

 Do we need some locking to make sure the two don't overlap (like
 should I grab the kvm_lock here)?

 Let kvm_init() do the driving and relax.

This sounds good (actually looking into this was on my todo list, so
now is a good time I guess).
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-12 Thread Avi Kivity
On 12/11/2011 12:25 PM, Christoffer Dall wrote:
 In order to support KVM on a SMP host, it is necessary to initialize the
 hypervisor on all CPUs, mostly by making sure each CPU gets its own
 hypervisor stack and runs the HYP init code.

 We also take care of some missing locking of modifications to the
 hypervisor page tables and ensure synchronized consistency between
 virtual IRQ masks and wait_for_interrupt flags on the VPUs.

 Note that this code doesn't handle CPU hotplug yet.
 Note that this code doesn't support SMP guests.

 WARNING: This code is in development and guests do not fully boot on SMP
 hosts yet.

Damn, I just reviewed all that breakage.

  
   /* Misc. fields */
 + spinlock_t irq_lock;
 + u32 virt_irq;   /* HCR exception mask */
   u32 wait_for_interrupts;

Better to use atomics, IMO.

 @@ -464,13 +466,27 @@ static int kvm_arch_vm_ioctl_irq_line(struct kvm *kvm,
  
   trace_kvm_irq_line(irq_level-irq % 2, irq_level-level, vcpu_idx);
  
 + spin_lock(vcpu-arch.irq_lock);
   if (irq_level-level) {
   vcpu-arch.virt_irq |= mask;
 +
 + /*
 +  * Note that we grab the wq.lock before clearing the wfi flag
 +  * since this ensures that a concurrent call to kvm_vcpu_block
 +  * will either sleep before we grab the lock, in which case we
 +  * wake it up, or will never sleep due to
 +  * kvm_arch_vcpu_runnable being true (iow. this avoids having
 +  * to grab the irq_lock in kvm_arch_vcpu_runnable).
 +  */
 + spin_lock(vcpu-wq.lock);
   vcpu-arch.wait_for_interrupts = 0;
 +
   if (waitqueue_active(vcpu-wq))
 - wake_up_interruptible(vcpu-wq);
 + __wake_up_locked(vcpu-wq, TASK_INTERRUPTIBLE);
 + spin_unlock(vcpu-wq.lock);
   } else
   vcpu-arch.virt_irq = ~mask;
 + spin_unlock(vcpu-arch.irq_lock);

This looks overly complicated with two levels of locks.  x86 gets by
with no locks, and a much more complicated interrupt architecture.

My recommendation is:
  wait_for_interrupts is managed solely by the vcpu thread
  KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type,
then IPI/wakeups the vcpu to make it examine both wait_for_interrupts
and virt_irq.


 +
 +static void cpu_init_hyp_mode(void *vector)
 +{
 + unsigned long hyp_stack_ptr;
 + void *stack_page;
 +
 + stack_page = __get_cpu_var(kvm_arm_hyp_stack_page);
 + hyp_stack_ptr = (unsigned long)stack_page + PAGE_SIZE;
 +
 + cpu_set_vector(vector);
 +
 + /*
 +  * Call initialization code
 +  */
 + asm volatile (
 + movr0, %[pgd_ptr]\n\t
 + movr1, %[stack_ptr]\n\t
 + hvc#0\n\t : :
 + [pgd_ptr] r (virt_to_phys(kvm_hyp_pgd)),
 + [stack_ptr] r (hyp_stack_ptr) :
 + r0, r1);
 +}

(slightly nicer is to allocate hyp_stack_ptr and pgd_ptr to register
asm(r0) and register asm(r1) to avoid the extra mov instruction)

 @@ -522,47 +573,42 @@ static int init_hyp_mode(void)
   return -ENOMEM;
  
   /*
 -  * Allocate stack page for Hypervisor-mode
 +  * Allocate stack pages for Hypervisor-mode
*/
 - kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL);
 - if (!kvm_arm_hyp_stack_page) {
 - err = -ENOMEM;
 - goto out_free_pgd;
 - }
 + for_each_possible_cpu(cpu) {
 + void *stack_page;
  
 - hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE;
 + stack_page = (void *)__get_free_page(GFP_KERNEL);

Best to allocate this (and other per-cpu state) on the cpu's node.

 + if (!stack_page) {
 + err = -ENOMEM;
 + goto out_free_pgd;
 + }
 +
 + per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
 + }
  
   init_phys_addr = virt_to_phys(__kvm_hyp_init);
   init_end_phys_addr = virt_to_phys(__kvm_hyp_init_end);
 + BUG_ON(init_phys_addr  0x1f);
  
   /*
 -  * Create identity mapping
 +  * Create identity mapping for the init code.
*/
   hyp_identity_mapping_add(kvm_hyp_pgd,
(unsigned long)init_phys_addr,
(unsigned long)init_end_phys_addr);
  
 + for_each_online_cpu(cpu) {
 + smp_call_function_single(cpu, cpu_init_hyp_mode,
 +  (void *)(long)init_phys_addr, 1);
 + }

Need similar code for cpu hotplug.  See kvm_cpu_hotplug() and
kvm_arch_hardware_enable() which do all this for you.


-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-12 Thread Christoffer Dall
On Mon, Dec 12, 2011 at 9:30 AM, Avi Kivity a...@redhat.com wrote:
 On 12/11/2011 12:25 PM, Christoffer Dall wrote:
 In order to support KVM on a SMP host, it is necessary to initialize the
 hypervisor on all CPUs, mostly by making sure each CPU gets its own
 hypervisor stack and runs the HYP init code.

 We also take care of some missing locking of modifications to the
 hypervisor page tables and ensure synchronized consistency between
 virtual IRQ masks and wait_for_interrupt flags on the VPUs.

 Note that this code doesn't handle CPU hotplug yet.
 Note that this code doesn't support SMP guests.

 WARNING: This code is in development and guests do not fully boot on SMP
 hosts yet.

 Damn, I just reviewed all that breakage.


so sorry...,


       /* Misc. fields */
 +     spinlock_t irq_lock;
 +     u32 virt_irq;           /* HCR exception mask */
       u32 wait_for_interrupts;

 Better to use atomics, IMO.

hmm, yeah, I guess the way to do it would be to have two fields - one
atomic field used for interrupt injection, which is read atomically in
the C-code into a plain u32 variable, which can then be copied
directly onto the hardware during the world-switch...


 @@ -464,13 +466,27 @@ static int kvm_arch_vm_ioctl_irq_line(struct kvm *kvm,

       trace_kvm_irq_line(irq_level-irq % 2, irq_level-level, vcpu_idx);

 +     spin_lock(vcpu-arch.irq_lock);
       if (irq_level-level) {
               vcpu-arch.virt_irq |= mask;
 +
 +             /*
 +              * Note that we grab the wq.lock before clearing the wfi flag
 +              * since this ensures that a concurrent call to kvm_vcpu_block
 +              * will either sleep before we grab the lock, in which case we
 +              * wake it up, or will never sleep due to
 +              * kvm_arch_vcpu_runnable being true (iow. this avoids having
 +              * to grab the irq_lock in kvm_arch_vcpu_runnable).
 +              */
 +             spin_lock(vcpu-wq.lock);
               vcpu-arch.wait_for_interrupts = 0;
 +
               if (waitqueue_active(vcpu-wq))
 -                     wake_up_interruptible(vcpu-wq);
 +                     __wake_up_locked(vcpu-wq, TASK_INTERRUPTIBLE);
 +             spin_unlock(vcpu-wq.lock);
       } else
               vcpu-arch.virt_irq = ~mask;
 +     spin_unlock(vcpu-arch.irq_lock);

 This looks overly complicated with two levels of locks.  x86 gets by
 with no locks, and a much more complicated interrupt architecture.

 My recommendation is:
  wait_for_interrupts is managed solely by the vcpu thread
  KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type,
 then IPI/wakeups the vcpu to make it examine both wait_for_interrupts
 and virt_irq.


this sounds pretty good to me.

something like this:

if (irq_level-level) {
set_bit(vcpu-arch.irq_lines, bit_nr);
smp_mb();
wake_up_interruptible(vcpu-wq);
} else
clear_bit(vcpu-arch.irq_lines, bit_nr);


and the vcpu thread would clear the wait_for_interrupts flag if it
ever sees the mask field be non-zero?

 +
 +static void cpu_init_hyp_mode(void *vector)
 +{
 +     unsigned long hyp_stack_ptr;
 +     void *stack_page;
 +
 +     stack_page = __get_cpu_var(kvm_arm_hyp_stack_page);
 +     hyp_stack_ptr = (unsigned long)stack_page + PAGE_SIZE;
 +
 +     cpu_set_vector(vector);
 +
 +     /*
 +      * Call initialization code
 +      */
 +     asm volatile (
 +             mov    r0, %[pgd_ptr]\n\t
 +             mov    r1, %[stack_ptr]\n\t
 +             hvc    #0\n\t : :
 +             [pgd_ptr] r (virt_to_phys(kvm_hyp_pgd)),
 +             [stack_ptr] r (hyp_stack_ptr) :
 +             r0, r1);
 +}

 (slightly nicer is to allocate hyp_stack_ptr and pgd_ptr to register
 asm(r0) and register asm(r1) to avoid the extra mov instruction)


I agree

 @@ -522,47 +573,42 @@ static int init_hyp_mode(void)
               return -ENOMEM;

       /*
 -      * Allocate stack page for Hypervisor-mode
 +      * Allocate stack pages for Hypervisor-mode
        */
 -     kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL);
 -     if (!kvm_arm_hyp_stack_page) {
 -             err = -ENOMEM;
 -             goto out_free_pgd;
 -     }
 +     for_each_possible_cpu(cpu) {
 +             void *stack_page;

 -     hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE;
 +             stack_page = (void *)__get_free_page(GFP_KERNEL);

 Best to allocate this (and other per-cpu state) on the cpu's node.


why, for performance reasons? The code get slightly more complicated,
since we have to pass the return value through the argument so we have
to pass an opaque pointer to the struct or something like that.

 +             if (!stack_page) {
 +                     err = -ENOMEM;
 +                     goto out_free_pgd;
 +             }
 +
 +             per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
 +     }

       init_phys_addr = virt_to_phys(__kvm_hyp_init);
       init_end_phys_addr = 

Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-12 Thread Avi Kivity
On 12/12/2011 07:37 PM, Christoffer Dall wrote:
 
  This looks overly complicated with two levels of locks.  x86 gets by
  with no locks, and a much more complicated interrupt architecture.
 
  My recommendation is:
   wait_for_interrupts is managed solely by the vcpu thread
   KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type,
  then IPI/wakeups the vcpu to make it examine both wait_for_interrupts
  and virt_irq.
 
 
 this sounds pretty good to me.

 something like this:

 if (irq_level-level) {
   set_bit(vcpu-arch.irq_lines, bit_nr);
   smp_mb();
   wake_up_interruptible(vcpu-wq);

or, smp_send_reschedule().  See kvm_vcpu_kick().

An optimization: do a cmpxchg() and don't wakeup if the operation raised
IRQ file FIQ was set (assuming that FIQ has a higher priority than IRQ).

 } else
   clear_bit(vcpu-arch.irq_lines, bit_nr);


 and the vcpu thread would clear the wait_for_interrupts flag if it
 ever sees the mask field be non-zero?

Yes.  This is what x86 does, except it's a lot more complicated.

  @@ -522,47 +573,42 @@ static int init_hyp_mode(void)
return -ENOMEM;
 
/*
  -  * Allocate stack page for Hypervisor-mode
  +  * Allocate stack pages for Hypervisor-mode
 */
  - kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL);
  - if (!kvm_arm_hyp_stack_page) {
  - err = -ENOMEM;
  - goto out_free_pgd;
  - }
  + for_each_possible_cpu(cpu) {
  + void *stack_page;
 
  - hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE;
  + stack_page = (void *)__get_free_page(GFP_KERNEL);
 
  Best to allocate this (and other per-cpu state) on the cpu's node.
 

 why, for performance reasons? 

Yes, I'm assuming that all multi-socket A15s will be numa?

 The code get slightly more complicated,
 since we have to pass the return value through the argument so we have
 to pass an opaque pointer to the struct or something like that.

Don't see why, just use alloc_pages_node().

 
  + for_each_online_cpu(cpu) {
  + smp_call_function_single(cpu, cpu_init_hyp_mode,
  +  (void *)(long)init_phys_addr, 1);
  + }
 
  Need similar code for cpu hotplug.  See kvm_cpu_hotplug() and
  kvm_arch_hardware_enable() which do all this for you.
 
 so just to be sure, this will only be called for cpus that are
 hotplugged right? we still call the cpu_init_hyp_mode for each cpu
 that's online at this point.

The infrastructure will call kvm_arch_hardware_enable() for all
currently online cpus and any future hotplugged cpu.  Just follow
kvm_init() and fill in the arch callbacks.  You do have a call to
kvm_init() somewhere, yes?

 Do we need some locking to make sure the two don't overlap (like
 should I grab the kvm_lock here)?

Let kvm_init() do the driving and relax.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 11/13] ARM: KVM: Support SMP hosts

2011-12-11 Thread Christoffer Dall
In order to support KVM on a SMP host, it is necessary to initialize the
hypervisor on all CPUs, mostly by making sure each CPU gets its own
hypervisor stack and runs the HYP init code.

We also take care of some missing locking of modifications to the
hypervisor page tables and ensure synchronized consistency between
virtual IRQ masks and wait_for_interrupt flags on the VPUs.

Note that this code doesn't handle CPU hotplug yet.
Note that this code doesn't support SMP guests.

WARNING: This code is in development and guests do not fully boot on SMP
hosts yet.

Signed-off-by: Marc Zyngier marc.zyng...@arm.com
Signed-off-by: Christoffer Dall c.d...@virtualopensystems.com
---
 arch/arm/include/asm/kvm_host.h |4 -
 arch/arm/include/asm/kvm_mmu.h  |1 
 arch/arm/kvm/arm.c  |  175 +++
 arch/arm/kvm/emulate.c  |2 
 arch/arm/kvm/mmu.c  |9 ++
 5 files changed, 114 insertions(+), 77 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 86f6cf1..a0ffbe8 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -78,8 +78,6 @@ struct kvm_vcpu_arch {
u32 c13_TID_PRIV;   /* Thread ID, Priveleged */
} cp15;
 
-   u32 virt_irq;   /* HCR exception mask */
-
/* Exception Information */
u32 hsr;/* Hyp Syndrom Register */
u32 hdfar;  /* Hyp Data Fault Address Register */
@@ -92,6 +90,8 @@ struct kvm_vcpu_arch {
u32 mmio_rd;
 
/* Misc. fields */
+   spinlock_t irq_lock;
+   u32 virt_irq;   /* HCR exception mask */
u32 wait_for_interrupts;
 };
 
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index e82eae9..917edd7 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -28,6 +28,7 @@
 #define PGD2_ORDER get_order(PTRS_PER_PGD2 * sizeof(pgd_t))
 
 extern pgd_t *kvm_hyp_pgd;
+extern struct mutex kvm_hyp_pgd_mutex;
 
 int create_hyp_mappings(pgd_t *hyp_pgd, void *from, void *to);
 void free_hyp_pmds(pgd_t *hyp_pgd);
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 00215a1..6e384e2 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -61,7 +61,7 @@ void __kvm_print_msg(char *fmt, ...)
spin_unlock(__tmp_log_lock);
 }
 
-static void *kvm_arm_hyp_stack_page;
+static DEFINE_PER_CPU(void *, kvm_arm_hyp_stack_page);
 
 /* The VMID used in the VTTBR */
 #define VMID_SIZE (18)
@@ -257,6 +257,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
unsigned long cpsr;
unsigned long sctlr;
 
+   spin_lock_init(vcpu-arch.irq_lock);
+
/* Init execution CPSR */
asm volatile (mrs  %[cpsr], cpsr :
[cpsr] =r (cpsr));
@@ -464,13 +466,27 @@ static int kvm_arch_vm_ioctl_irq_line(struct kvm *kvm,
 
trace_kvm_irq_line(irq_level-irq % 2, irq_level-level, vcpu_idx);
 
+   spin_lock(vcpu-arch.irq_lock);
if (irq_level-level) {
vcpu-arch.virt_irq |= mask;
+
+   /*
+* Note that we grab the wq.lock before clearing the wfi flag
+* since this ensures that a concurrent call to kvm_vcpu_block
+* will either sleep before we grab the lock, in which case we
+* wake it up, or will never sleep due to
+* kvm_arch_vcpu_runnable being true (iow. this avoids having
+* to grab the irq_lock in kvm_arch_vcpu_runnable).
+*/
+   spin_lock(vcpu-wq.lock);
vcpu-arch.wait_for_interrupts = 0;
+
if (waitqueue_active(vcpu-wq))
-   wake_up_interruptible(vcpu-wq);
+   __wake_up_locked(vcpu-wq, TASK_INTERRUPTIBLE);
+   spin_unlock(vcpu-wq.lock);
} else
vcpu-arch.virt_irq = ~mask;
+   spin_unlock(vcpu-arch.irq_lock);
 
return 0;
 }
@@ -505,14 +521,49 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
 }
 
+static void cpu_set_vector(void *vector)
+{
+   /*
+* Set the HVBAR
+*/
+   asm volatile (
+   movr0, %[vector_ptr]\n\t
+   ldrr7, =SMCHYP_HVBAR_W\n\t
+   smc#0\n\t : :
+   [vector_ptr] r (vector) :
+   r0, r7);
+}
+
+static void cpu_init_hyp_mode(void *vector)
+{
+   unsigned long hyp_stack_ptr;
+   void *stack_page;
+
+   stack_page = __get_cpu_var(kvm_arm_hyp_stack_page);
+   hyp_stack_ptr = (unsigned long)stack_page + PAGE_SIZE;
+
+   cpu_set_vector(vector);
+
+   /*
+* Call initialization code
+*/
+   asm volatile (
+   movr0, %[pgd_ptr]\n\t
+   movr1, %[stack_ptr]\n\t
+   hvc#0\n\t : :
+   [pgd_ptr] r (virt_to_phys(kvm_hyp_pgd)),
+   [stack_ptr] r