Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 24.04.2015 00:26, Scott Wood wrote: On Thu, 2015-04-23 at 15:31 +0300, Purcareata Bogdan wrote: On 23.04.2015 03:30, Scott Wood wrote: On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote: On 21.04.2015 03:52, Scott Wood wrote: On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote: There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner function is kvmppc_set_epr, which is a static inline. Removing the static inline yields a compiler crash (Segmentation fault (core dumped) - scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed), but that's a different story, so I just let it be for now. Point is the time may include other work after the lock has been released, but before the function actually returned. I noticed this was the case for .kvm_set_msi, which could work up to 90 ms, not actually under the lock. This made me change what I'm looking at. kvm_set_msi does pretty much nothing outside the lock -- I suspect you're measuring an interrupt that happened as soon as the lock was released. That's exactly right. I've seen things like a timer interrupt occurring right after the spin_unlock_irqrestore, but before kvm_set_msi actually returned. [...] Or perhaps a different stress scenario involving a lot of VCPUs and external interrupts? You could instrument the MPIC code to find out how many loop iterations you maxed out on, and compare that to the theoretical maximum. Numbers are pretty low, and I'll try to explain based on my observations. The problematic section in openpic_update_irq is this [1], since it loops through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops through all pending interrupts for a VCPU [2]. The guest interfaces are virtio-vhostnet, which are based on MSI (/proc/interrupts in guest shows they are MSI). For external interrupts to the guest, the irq_source destmask is currently 0, and last_cpu is 0 (uninitialized), so [1] will go on and deliver the interrupt directly and unicast (no VCPUs loop). I activated the pr_debugs in arch/powerpc/kvm/mpic.c, to see how many interrupts are actually pending for the destination VCPU. At most, there were 3 interrupts - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts. So the worst case, in this scenario, was checking the priorities for 3 pending interrupts for 1 VCPU. Something like this (some of my prints included):
[61010.582033] openpic_update_irq: destmask 1 last_cpu 0
[61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ
[61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1
[61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1
[61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1
[61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1
It would be really helpful to get your comments regarding whether these are realistic numbers for everyday use, or whether they are relevant only to this particular scenario. RT isn't about realistic numbers for everyday use. It's about worst cases. - Can these interrupts be used in directed delivery, so that the destination mask can include multiple VCPUs? The Freescale MPIC does not support multiple destinations for most interrupts, but the (non-FSL-specific) emulation code appears to allow it. The MPIC manual states that timer and IPI interrupts are supported for directed delivery, although I'm not sure how much of this is used in the emulation. I know that kvmppc uses the decrementer outside of the MPIC.
- How are virtio interrupts cascaded over the shared MSI interrupts? /proc/device-tree/soc@e000/msi@41600/interrupts in the guest shows 8 values - 224 to 231 - so at most there might be 8 pending interrupts in IRQ_check, is that correct? It looks like that's currently the case, but actual hardware supports more than that, so it's possible (albeit unlikely any time soon) that the emulation eventually does as well. But it's possible to have interrupts other than MSIs... Right. So given that the raw spinlock conversion is not suitable for all the scenarios supported by the OpenPIC emulation, is it OK if my next step is to send a patch containing both the raw spinlock conversion and a mandatory disable of the in-kernel MPIC? This is actually the last conclusion we came up with some time ago, but I guess it was good to get some more insight on how things actually work (at least for me). Fine with me. Have you given any thought to ways to restructure the code to eliminate the problem? My first thought would be to create a separate lock for each VCPU's pending-interrupts queue, so that we make the whole openpic_update_irq more granular. However, this is just a very preliminary thought. Before I can come up with anything worthy of consideration, I must read the OpenPIC specification and the current KVM emulated OpenPIC implementation thoroughly. I currently have
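A very rough illustration of the per-VCPU lock idea mentioned above - purely a sketch, with hypothetical structure and function names that do not correspond to the actual arch/powerpc/kvm/mpic.c code:

/*
 * Sketch only: each destination VCPU gets its own raw lock protecting its
 * pending-interrupt bookkeeping, so delivering an interrupt to one VCPU no
 * longer serializes against the whole emulated PIC. All names are made up.
 */
#include <linux/spinlock.h>
#include <linux/bitmap.h>
#include <linux/bitops.h>

#define SKETCH_MAX_IRQ 256
#define SKETCH_MAX_CPU 32

struct sketch_irq_dest {
        raw_spinlock_t lock;                    /* protects only this VCPU's queue */
        DECLARE_BITMAP(pending, SKETCH_MAX_IRQ);
        int highest_prio;
};

struct sketch_openpic {
        int nb_cpus;
        struct sketch_irq_dest dst[SKETCH_MAX_CPU];  /* one queue per VCPU */
};

/* Queue one interrupt for one VCPU while holding only that VCPU's lock. */
static void sketch_queue_irq(struct sketch_openpic *opp, int vcpu,
                             int n_irq, int prio)
{
        struct sketch_irq_dest *dst = &opp->dst[vcpu];
        unsigned long flags;

        raw_spin_lock_irqsave(&dst->lock, flags);
        __set_bit(n_irq, dst->pending);
        if (prio > dst->highest_prio)
                dst->highest_prio = prio;
        raw_spin_unlock_irqrestore(&dst->lock, flags);
}

Whether something like this is actually feasible depends on how much shared openpic state (global registers, the last_cpu round-robin pointer, and so on) the delivery path also has to touch, which is exactly what a careful reading of the OpenPIC spec and the current emulation would have to answer.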
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On Thu, 2015-04-23 at 15:31 +0300, Purcareata Bogdan wrote: On 23.04.2015 03:30, Scott Wood wrote: On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote: On 21.04.2015 03:52, Scott Wood wrote: On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote: There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner function is kvmppc_set_epr, which is a static inline. Removing the static inline yields a compiler crash (Segmentation fault (core dumped) - scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed), but that's a different story, so I just let it be for now. Point is the time may include other work after the lock has been released, but before the function actually returned. I noticed this was the case for .kvm_set_msi, which could work up to 90 ms, not actually under the lock. This made me change what I'm looking at. kvm_set_msi does pretty much nothing outside the lock -- I suspect you're measuring an interrupt that happened as soon as the lock was released. That's exactly right. I've seen things like a timer interrupt occuring right after the spinlock_irqrestore, but before kvm_set_msi actually returned. [...] Or perhaps a different stress scenario involving a lot of VCPUs and external interrupts? You could instrument the MPIC code to find out how many loop iterations you maxed out on, and compare that to the theoretical maximum. Numbers are pretty low, and I'll try to explain based on my observations. The problematic section in openpic_update_irq is this [1], since it loops through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops through all pending interrupts for a VCPU [2]. The guest interfaces are virtio-vhostnet, which are based on MSI (/proc/interrupts in guest shows they are MSI). For external interrupts to the guest, the irq_source destmask is currently 0, and last_cpu is 0 (unitialized), so [1] will go on and deliver the interrupt directly and unicast (no VCPUs loop). I activated the pr_debugs in arch/powerpc/kvm/mpic.c, to see how many interrupts are actually pending for the destination VCPU. At most, there were 3 interrupts - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts. So worst case, in this scenario, was checking the priorities for 3 pending interrupts for 1 VCPU. Something like this (some of my prints included): [61010.582033] openpic_update_irq: destmask 1 last_cpu 0 [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1 [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1 [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1 [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1 It would be really helpful to get your comments regarding whether these are realistical number for everyday use, or they are relevant only to this particular scenario. RT isn't about realistic numbers for everyday use. It's about worst cases. - Can these interrupts be used in directed delivery, so that the destination mask can include multiple VCPUs? The Freescale MPIC does not support multiple destinations for most interrupts, but the (non-FSL-specific) emulation code appears to allow it. The MPIC manual states that timer and IPI interrupts are supported for directed delivery, altough I'm not sure how much of this is used in the emulation. I know that kvmppc uses the decrementer outside of the MPIC. 
- How are virtio interrupts cascaded over the shared MSI interrupts? /proc/device-tree/soc@e000/msi@41600/interrupts in the guest shows 8 values - 224 - 231 - so at most there might be 8 pending interrupts in IRQ_check, is that correct? It looks like that's currently the case, but actual hardware supports more than that, so it's possible (albeit unlikely any time soon) that the emulation eventually does as well. But it's possible to have interrupts other than MSIs... Right. So given that the raw spinlock conversion is not suitable for all the scenarios supported by the OpenPIC emulation, is it ok that my next step would be to send a patch containing both the raw spinlock conversion and a mandatory disable of the in-kernel MPIC? This is actually the last conclusion we came up with some time ago, but I guess it was good to get some more insight on how things actually work (at least for me). Fine with me. Have you given any thought to ways to restructure the code to eliminate the problem? -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 23.04.2015 03:30, Scott Wood wrote: On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote: On 21.04.2015 03:52, Scott Wood wrote: On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote: There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner function is kvmppc_set_epr, which is a static inline. Removing the static inline yields a compiler crash (Segmentation fault (core dumped) - scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed), but that's a different story, so I just let it be for now. Point is the time may include other work after the lock has been released, but before the function actually returned. I noticed this was the case for .kvm_set_msi, which could work up to 90 ms, not actually under the lock. This made me change what I'm looking at. kvm_set_msi does pretty much nothing outside the lock -- I suspect you're measuring an interrupt that happened as soon as the lock was released. That's exactly right. I've seen things like a timer interrupt occuring right after the spinlock_irqrestore, but before kvm_set_msi actually returned. [...] Or perhaps a different stress scenario involving a lot of VCPUs and external interrupts? You could instrument the MPIC code to find out how many loop iterations you maxed out on, and compare that to the theoretical maximum. Numbers are pretty low, and I'll try to explain based on my observations. The problematic section in openpic_update_irq is this [1], since it loops through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops through all pending interrupts for a VCPU [2]. The guest interfaces are virtio-vhostnet, which are based on MSI (/proc/interrupts in guest shows they are MSI). For external interrupts to the guest, the irq_source destmask is currently 0, and last_cpu is 0 (unitialized), so [1] will go on and deliver the interrupt directly and unicast (no VCPUs loop). I activated the pr_debugs in arch/powerpc/kvm/mpic.c, to see how many interrupts are actually pending for the destination VCPU. At most, there were 3 interrupts - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts. So worst case, in this scenario, was checking the priorities for 3 pending interrupts for 1 VCPU. Something like this (some of my prints included): [61010.582033] openpic_update_irq: destmask 1 last_cpu 0 [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1 [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1 [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1 [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1 It would be really helpful to get your comments regarding whether these are realistical number for everyday use, or they are relevant only to this particular scenario. RT isn't about realistic numbers for everyday use. It's about worst cases. - Can these interrupts be used in directed delivery, so that the destination mask can include multiple VCPUs? The Freescale MPIC does not support multiple destinations for most interrupts, but the (non-FSL-specific) emulation code appears to allow it. The MPIC manual states that timer and IPI interrupts are supported for directed delivery, altough I'm not sure how much of this is used in the emulation. I know that kvmppc uses the decrementer outside of the MPIC. - How are virtio interrupts cascaded over the shared MSI interrupts? 
/proc/device-tree/soc@e000/msi@41600/interrupts in the guest shows 8 values - 224 - 231 - so at most there might be 8 pending interrupts in IRQ_check, is that correct? It looks like that's currently the case, but actual hardware supports more than that, so it's possible (albeit unlikely any time soon) that the emulation eventually does as well. But it's possible to have interrupts other than MSIs... Right. So given that the raw spinlock conversion is not suitable for all the scenarios supported by the OpenPIC emulation, is it ok that my next step would be to send a patch containing both the raw spinlock conversion and a mandatory disable of the in-kernel MPIC? This is actually the last conclusion we came up with some time ago, but I guess it was good to get some more insight on how things actually work (at least for me). Thanks, Bogdan P. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote: On 21.04.2015 03:52, Scott Wood wrote: On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote: There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner function is kvmppc_set_epr, which is a static inline. Removing the static inline yields a compiler crash (Segmentation fault (core dumped) - scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed), but that's a different story, so I just let it be for now. Point is the time may include other work after the lock has been released, but before the function actually returned. I noticed this was the case for .kvm_set_msi, which could work up to 90 ms, not actually under the lock. This made me change what I'm looking at. kvm_set_msi does pretty much nothing outside the lock -- I suspect you're measuring an interrupt that happened as soon as the lock was released. That's exactly right. I've seen things like a timer interrupt occuring right after the spinlock_irqrestore, but before kvm_set_msi actually returned. [...] Or perhaps a different stress scenario involving a lot of VCPUs and external interrupts? You could instrument the MPIC code to find out how many loop iterations you maxed out on, and compare that to the theoretical maximum. Numbers are pretty low, and I'll try to explain based on my observations. The problematic section in openpic_update_irq is this [1], since it loops through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops through all pending interrupts for a VCPU [2]. The guest interfaces are virtio-vhostnet, which are based on MSI (/proc/interrupts in guest shows they are MSI). For external interrupts to the guest, the irq_source destmask is currently 0, and last_cpu is 0 (unitialized), so [1] will go on and deliver the interrupt directly and unicast (no VCPUs loop). I activated the pr_debugs in arch/powerpc/kvm/mpic.c, to see how many interrupts are actually pending for the destination VCPU. At most, there were 3 interrupts - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts. So worst case, in this scenario, was checking the priorities for 3 pending interrupts for 1 VCPU. Something like this (some of my prints included): [61010.582033] openpic_update_irq: destmask 1 last_cpu 0 [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1 [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1 [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1 [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1 It would be really helpful to get your comments regarding whether these are realistical number for everyday use, or they are relevant only to this particular scenario. RT isn't about realistic numbers for everyday use. It's about worst cases. - Can these interrupts be used in directed delivery, so that the destination mask can include multiple VCPUs? The Freescale MPIC does not support multiple destinations for most interrupts, but the (non-FSL-specific) emulation code appears to allow it. The MPIC manual states that timer and IPI interrupts are supported for directed delivery, altough I'm not sure how much of this is used in the emulation. I know that kvmppc uses the decrementer outside of the MPIC. - How are virtio interrupts cascaded over the shared MSI interrupts? 
/proc/device-tree/soc@e000/msi@41600/interrupts in the guest shows 8 values - 224 - 231 - so at most there might be 8 pending interrupts in IRQ_check, is that correct? It looks like that's currently the case, but actual hardware supports more than that, so it's possible (albeit unlikely any time soon) that the emulation eventually does as well. But it's possible to have interrupts other than MSIs... -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 21.04.2015 03:52, Scott Wood wrote: On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote: There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner function is kvmppc_set_epr, which is a static inline. Removing the static inline yields a compiler crash (Segmentation fault (core dumped) - scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed), but that's a different story, so I just let it be for now. Point is the time may include other work after the lock has been released, but before the function actually returned. I noticed this was the case for .kvm_set_msi, which could work up to 90 ms, not actually under the lock. This made me change what I'm looking at. kvm_set_msi does pretty much nothing outside the lock -- I suspect you're measuring an interrupt that happened as soon as the lock was released. That's exactly right. I've seen things like a timer interrupt occuring right after the spinlock_irqrestore, but before kvm_set_msi actually returned. [...] Or perhaps a different stress scenario involving a lot of VCPUs and external interrupts? You could instrument the MPIC code to find out how many loop iterations you maxed out on, and compare that to the theoretical maximum. Numbers are pretty low, and I'll try to explain based on my observations. The problematic section in openpic_update_irq is this [1], since it loops through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops through all pending interrupts for a VCPU [2]. The guest interfaces are virtio-vhostnet, which are based on MSI (/proc/interrupts in guest shows they are MSI). For external interrupts to the guest, the irq_source destmask is currently 0, and last_cpu is 0 (unitialized), so [1] will go on and deliver the interrupt directly and unicast (no VCPUs loop). I activated the pr_debugs in arch/powerpc/kvm/mpic.c, to see how many interrupts are actually pending for the destination VCPU. At most, there were 3 interrupts - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts. So worst case, in this scenario, was checking the priorities for 3 pending interrupts for 1 VCPU. Something like this (some of my prints included): [61010.582033] openpic_update_irq: destmask 1 last_cpu 0 [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1 [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1 [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1 [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1 It would be really helpful to get your comments regarding whether these are realistical number for everyday use, or they are relevant only to this particular scenario. - Can these interrupts be used in directed delivery, so that the destination mask can include multiple VCPUs? The MPIC manual states that timer and IPI interrupts are supported for directed delivery, altough I'm not sure how much of this is used in the emulation. I know that kvmppc uses the decrementer outside of the MPIC. - How are virtio interrupts cascaded over the shared MSI interrupts? /proc/device-tree/soc@e000/msi@41600/interrupts in the guest shows 8 values - 224 - 231 - so at most there might be 8 pending interrupts in IRQ_check, is that correct? Looking forward to your feedback. 
[1] http://lxr.free-electrons.com/source/arch/powerpc/kvm/mpic.c#L454 [2] http://lxr.free-electrons.com/source/arch/powerpc/kvm/mpic.c#L303 [3] https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/F27971551C9EED8E8525774A0048770A/$file/mpic_db_05_16_2011.pdf Best regards, Bogdan P. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
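For readers without the lxr links handy, the code behind [1] and [2] has, in heavily simplified form, the shape sketched below. This only mirrors the loop nesting discussed in the thread, not the actual contents of arch/powerpc/kvm/mpic.c; the field names and the priority encoding are approximations.

/* Simplified sketch of the loop structure referenced in [1] and [2]. */
#include <linux/bitops.h>
#include <linux/bitmap.h>

#define SK_MAX_IRQ 256
#define SK_MAX_CPU 32

struct sk_source { u32 ivpr; u32 destmask; };
struct sk_queue  { DECLARE_BITMAP(queue, SK_MAX_IRQ); int next; };
struct sk_dest   { struct sk_queue raised; };

struct sk_openpic {
        int nb_cpus;
        struct sk_source src[SK_MAX_IRQ];
        struct sk_dest dst[SK_MAX_CPU];
};

/* [2] IRQ_check(): scan all pending interrupts of one destination and
 * remember the highest-priority one -- O(pending IRQs). */
static int sk_irq_check(struct sk_openpic *opp, struct sk_queue *q)
{
        int irq, best = -1, best_prio = -1;

        for_each_set_bit(irq, q->queue, SK_MAX_IRQ) {
                int prio = (opp->src[irq].ivpr >> 16) & 0xf;  /* priority field */

                if (prio > best_prio) {
                        best = irq;
                        best_prio = prio;
                }
        }
        q->next = best;
        return best;
}

/* [1] openpic_update_irq(), directed-delivery case: walk every VCPU in the
 * destination mask and rescan its queue -- O(VCPUs) around the loop above,
 * all of it while the single openpic lock is held. */
static void sk_update_irq(struct sk_openpic *opp, int n_irq)
{
        int cpu;

        for (cpu = 0; cpu < opp->nb_cpus; cpu++) {
                if (opp->src[n_irq].destmask & (1u << cpu))
                        sk_irq_check(opp, &opp->dst[cpu].raised);
        }
}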
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote: On 10.04.2015 02:53, Scott Wood wrote: On Thu, 2015-04-09 at 10:44 +0300, Purcareata Bogdan wrote: So at this point I was getting kinda frustrated so I decided to measure the time spent in kvm_mpic_write and kvm_mpic_read. I assumed these were the main entry points in the in-kernel MPIC and were basically executed while holding the spinlock. The scenario was the same - 24 VCPUs guest, with 24 virtio+vhost interfaces, only this time I ran 24 ping flood threads to another board instead of netperf. I assumed this would impose a heavier stress. The latencies look pretty ok, around 1-2 us on average, with the max shown below:
.kvm_mpic_read   14.560
.kvm_mpic_write  12.608
Those are also microseconds. This was run for about 15 mins. What about other entry points such as kvm_set_msi() and kvmppc_mpic_set_epr()? Thanks for the pointers! I redid the measurements, this time for the functions run with the openpic lock held:
.kvm_mpic_read_internal (.kvm_mpic_read)     1.664
.kvmppc_mpic_set_epr                         6.880
.kvm_mpic_write_internal (.kvm_mpic_write)   7.840
.openpic_msi_write (.kvm_set_msi)           10.560
Same scenario, 15 mins, numbers are microseconds. There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner function is kvmppc_set_epr, which is a static inline. Removing the static inline yields a compiler crash (Segmentation fault (core dumped) - scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed), but that's a different story, so I just let it be for now. Point is the time may include other work after the lock has been released, but before the function actually returned. I noticed this was the case for .kvm_set_msi, which could work up to 90 ms, not actually under the lock. This made me change what I'm looking at. kvm_set_msi does pretty much nothing outside the lock -- I suspect you're measuring an interrupt that happened as soon as the lock was released. So far it looks pretty decent. Are there any other MPIC entry points worthy of investigation? I don't think so. Or perhaps a different stress scenario involving a lot of VCPUs and external interrupts? You could instrument the MPIC code to find out how many loop iterations you maxed out on, and compare that to the theoretical maximum. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 10.04.2015 02:53, Scott Wood wrote: On Thu, 2015-04-09 at 10:44 +0300, Purcareata Bogdan wrote: So at this point I was getting kinda frustrated so I decided to measure the time spent in kvm_mpic_write and kvm_mpic_read. I assumed these were the main entry points in the in-kernel MPIC and were basically executed while holding the spinlock. The scenario was the same - 24 VCPUs guest, with 24 virtio+vhost interfaces, only this time I ran 24 ping flood threads to another board instead of netperf. I assumed this would impose a heavier stress. The latencies look pretty ok, around 1-2 us on average, with the max shown below:
.kvm_mpic_read   14.560
.kvm_mpic_write  12.608
Those are also microseconds. This was run for about 15 mins. What about other entry points such as kvm_set_msi() and kvmppc_mpic_set_epr()? Thanks for the pointers! I redid the measurements, this time for the functions run with the openpic lock held:
.kvm_mpic_read_internal (.kvm_mpic_read)     1.664
.kvmppc_mpic_set_epr                         6.880
.kvm_mpic_write_internal (.kvm_mpic_write)   7.840
.openpic_msi_write (.kvm_set_msi)           10.560
Same scenario, 15 mins, numbers are microseconds. There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner function is kvmppc_set_epr, which is a static inline. Removing the static inline yields a compiler crash (Segmentation fault (core dumped) - scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed), but that's a different story, so I just let it be for now. Point is the time may include other work after the lock has been released, but before the function actually returned. I noticed this was the case for .kvm_set_msi, which could work up to 90 ms, not actually under the lock. This made me change what I'm looking at. So far it looks pretty decent. Are there any other MPIC entry points worthy of investigation? Or perhaps a different stress scenario involving a lot of VCPUs and external interrupts? Thanks, Bogdan P. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
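The thread does not say exactly how these per-entry-point maximums were collected; one common way to obtain such numbers - shown here purely as an assumption about methodology - is to bracket the section of interest with ktime samples and keep the worst case:

/* Generic worst-case timing helper -- an assumption about methodology,
 * not taken from the measurements quoted above. */
#include <linux/ktime.h>
#include <linux/printk.h>

static u64 worst_case_ns;       /* running maximum */

static inline void record_worst_case(ktime_t start, const char *what)
{
        u64 delta = ktime_to_ns(ktime_sub(ktime_get(), start));

        if (delta > worst_case_ns) {
                worst_case_ns = delta;
                pr_info("%s: new max %llu ns\n", what, delta);
        }
}

/*
 * Usage around the locked section of an entry point, e.g.:
 *
 *      ktime_t t = ktime_get();
 *      spin_lock_irqsave(&opp->lock, flags);
 *      ... emulation work ...
 *      spin_unlock_irqrestore(&opp->lock, flags);
 *      record_worst_case(t, "kvm_mpic_write");
 */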
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 04.04.2015 00:26, Scott Wood wrote: On Fri, 2015-04-03 at 11:07 +0300, Purcareata Bogdan wrote: On 03.04.2015 02:11, Scott Wood wrote: On Fri, 2015-03-27 at 19:07 +0200, Purcareata Bogdan wrote: On 27.02.2015 03:05, Scott Wood wrote: On Thu, 2015-02-26 at 14:31 +0100, Sebastian Andrzej Siewior wrote: On 02/26/2015 02:02 PM, Paolo Bonzini wrote: On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the can be applied *but* makes no difference if applied or not. answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). The problem, that is fixed by this s/spin_lock/raw_spin_lock/, exists only in -RT. There is no change upstream. In general we fix such things in -RT first and forward the patches upstream if possible. This convert thingy would be possible. Bug fixing comes before latency no matter if RT or not. Converting every lock into a rawlock is not always the answer. Last thing I read from Scott is that he is not entirely sure if this is the right approach or not and patch #1 was not acked-by him either. So for now I wait for Scott's feedback and maybe a backtrace :) Obviously leaving it in a buggy state is not what we want -- but I lean towards a short term fix of putting depends on !PREEMPT_RT on the in-kernel MPIC emulation (which is itself just an optimization -- you can still use KVM without it). This way people don't enable it with RT without being aware of the issue, and there's more of an incentive to fix it properly. I'll let Bogdan supply the backtrace. So about the backtrace. Wasn't really sure how to catch this, so what I did was to start a 24 VCPUs guest on a 24 CPU board, and in the guest run 24 netperf flows with an external back to back board of the same kind. I assumed this would provide the sufficient VCPUs and external interrupt to expose an alleged culprit. With regards to measuring the latency, I thought of using ftrace, specifically the preemptirqsoff latency histogram. Unfortunately, I wasn't able to capture any major differences between running a guest with in-kernel MPIC emulation (with the openpic raw_spinlock conversion applied) vs. no in-kernel MPIC emulation. Function profiling (trace_stat) shows that in the second case there's a far greater time spent in kvm_handle_exit (100x), but overall, the maximum latencies for preemptirqsoff don't look that much different. Here are the max numbers (preemptirqsoff) for the 24 CPUs, on the host RT Linux, sorted in descending order, expressed in microseconds:
In-kernel MPIC    QEMU MPIC
3975              5105
What are you measuring? Latency in the host, or in the guest? This is in the host kernel. Those are terrible numbers in both cases. Can you use those tracing tools to find out what the code path is for QEMU MPIC?
After more careful inspection, I noticed that those big-big numbers (couple of milliseconds) are isolated cases, and in fact 99.99% of those latencies top to somewhere around 800us. I also have a feeling that the isolated huge latencies might have something to do with enabling/disabling tracing, since those numbers don't come up at all in the actual trace output, only in the latency histogram. From what I know, there are two separate mechanisms - the function tracer and the latency histogram. Now, about that max 800us - there are 2 things that are enabled by default, and can cause bad latency: 1. scheduler load balancing - latency can top to up to 800us (as seen in the trace output). 2. RT throttling - which calls sched_rt_period_timer, which cycles through the runqueues of all CPUs - latency can top to 600us. I'm mentioning these since the trace output for the max preemptirqsoff period was always stolen by these activities, basically hiding anything related to the kvm in-kernel openpic. I disabled both of them, and now the max preemptirqsoff trace shows a transition between a vhost thread and the qemu process, involving a timer and external interrupt (do_irq), which you can see at the end of this e-mail. Not much particularly related to the kvm openpic (but perhaps I'm not able to understand it entirely). The trace for QEMU MPIC looks pretty much the same. So at this point I was getting kinda frustrated so I decided to measure the time spend in kvm_mpic_write and kvm_mpic_read. I assumed these were the main entry
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 03.04.2015 02:11, Scott Wood wrote: On Fri, 2015-03-27 at 19:07 +0200, Purcareata Bogdan wrote: On 27.02.2015 03:05, Scott Wood wrote: On Thu, 2015-02-26 at 14:31 +0100, Sebastian Andrzej Siewior wrote: On 02/26/2015 02:02 PM, Paolo Bonzini wrote: On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the can be applied *but* makes no difference if applied or not. answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). The problem, that is fixed by this s/spin_lock/raw_spin_lock/, exists only in -RT. There is no change upstream. In general we fix such things in -RT first and forward the patches upstream if possible. This convert thingy would be possible. Bug fixing comes before latency no matter if RT or not. Converting every lock into a rawlock is not always the answer. Last thing I read from Scott is that he is not entirely sure if this is the right approach or not and patch #1 was not acked-by him either. So for now I wait for Scott's feedback and maybe a backtrace :) Obviously leaving it in a buggy state is not what we want -- but I lean towards a short term fix of putting depends on !PREEMPT_RT on the in-kernel MPIC emulation (which is itself just an optimization -- you can still use KVM without it). This way people don't enable it with RT without being aware of the issue, and there's more of an incentive to fix it properly. I'll let Bogdan supply the backtrace. So about the backtrace. Wasn't really sure how to catch this, so what I did was to start a 24 VCPUs guest on a 24 CPU board, and in the guest run 24 netperf flows with an external back to back board of the same kind. I assumed this would provide the sufficient VCPUs and external interrupt to expose an alleged culprit. With regards to measuring the latency, I thought of using ftrace, specifically the preemptirqsoff latency histogram. Unfortunately, I wasn't able to capture any major differences between running a guest with in-kernel MPIC emulation (with the openpic raw_spinlock_conversion applied) vs. no in-kernel MPIC emulation. Function profiling (trace_stat) shows that in the second case there's a far greater time spent in kvm_handle_exit (100x), but overall, the maximum latencies for preemptirqsoff don't look that much different. Here are the max numbers (preemptirqsoff) for the 24 CPUs, on the host RT Linux, sorted in descending order, expressed in microseconds: In-kernel MPIC QEMU MPIC 39755105 What are you measuring? Latency in the host, or in the guest? This is in the host kernel. It's the maximum continuous period of time when both interrupts and preemption were disabled on the host kernel (basically making it unresponsive). This has been tracked while the guest was running with high prio, with 24 VCPUs, and in the guest there were 24 netperf flows - so a lot of VCPUs and a lot of external interrupts - for about 15 minutes. Bogdan P. 
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On Fri, 2015-04-03 at 11:07 +0300, Purcareata Bogdan wrote: On 03.04.2015 02:11, Scott Wood wrote: On Fri, 2015-03-27 at 19:07 +0200, Purcareata Bogdan wrote: On 27.02.2015 03:05, Scott Wood wrote: On Thu, 2015-02-26 at 14:31 +0100, Sebastian Andrzej Siewior wrote: On 02/26/2015 02:02 PM, Paolo Bonzini wrote: On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the can be applied *but* makes no difference if applied or not. answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). The problem, that is fixed by this s/spin_lock/raw_spin_lock/, exists only in -RT. There is no change upstream. In general we fix such things in -RT first and forward the patches upstream if possible. This convert thingy would be possible. Bug fixing comes before latency no matter if RT or not. Converting every lock into a rawlock is not always the answer. Last thing I read from Scott is that he is not entirely sure if this is the right approach or not and patch #1 was not acked-by him either. So for now I wait for Scott's feedback and maybe a backtrace :) Obviously leaving it in a buggy state is not what we want -- but I lean towards a short term fix of putting depends on !PREEMPT_RT on the in-kernel MPIC emulation (which is itself just an optimization -- you can still use KVM without it). This way people don't enable it with RT without being aware of the issue, and there's more of an incentive to fix it properly. I'll let Bogdan supply the backtrace. So about the backtrace. Wasn't really sure how to catch this, so what I did was to start a 24 VCPUs guest on a 24 CPU board, and in the guest run 24 netperf flows with an external back to back board of the same kind. I assumed this would provide the sufficient VCPUs and external interrupt to expose an alleged culprit. With regards to measuring the latency, I thought of using ftrace, specifically the preemptirqsoff latency histogram. Unfortunately, I wasn't able to capture any major differences between running a guest with in-kernel MPIC emulation (with the openpic raw_spinlock_conversion applied) vs. no in-kernel MPIC emulation. Function profiling (trace_stat) shows that in the second case there's a far greater time spent in kvm_handle_exit (100x), but overall, the maximum latencies for preemptirqsoff don't look that much different. Here are the max numbers (preemptirqsoff) for the 24 CPUs, on the host RT Linux, sorted in descending order, expressed in microseconds: In-kernel MPIC QEMU MPIC 3975 5105 What are you measuring? Latency in the host, or in the guest? This is in the host kernel. Those are terrible numbers in both cases. Can you use those tracing tools to find out what the code path is for QEMU MPIC? -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On Fri, 2015-03-27 at 19:07 +0200, Purcareata Bogdan wrote: On 27.02.2015 03:05, Scott Wood wrote: On Thu, 2015-02-26 at 14:31 +0100, Sebastian Andrzej Siewior wrote: On 02/26/2015 02:02 PM, Paolo Bonzini wrote: On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the can be applied *but* makes no difference if applied or not. answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). The problem, that is fixed by this s/spin_lock/raw_spin_lock/, exists only in -RT. There is no change upstream. In general we fix such things in -RT first and forward the patches upstream if possible. This convert thingy would be possible. Bug fixing comes before latency no matter if RT or not. Converting every lock into a rawlock is not always the answer. Last thing I read from Scott is that he is not entirely sure if this is the right approach or not and patch #1 was not acked-by him either. So for now I wait for Scott's feedback and maybe a backtrace :) Obviously leaving it in a buggy state is not what we want -- but I lean towards a short term fix of putting depends on !PREEMPT_RT on the in-kernel MPIC emulation (which is itself just an optimization -- you can still use KVM without it). This way people don't enable it with RT without being aware of the issue, and there's more of an incentive to fix it properly. I'll let Bogdan supply the backtrace. So about the backtrace. Wasn't really sure how to catch this, so what I did was to start a 24 VCPUs guest on a 24 CPU board, and in the guest run 24 netperf flows with an external back to back board of the same kind. I assumed this would provide the sufficient VCPUs and external interrupt to expose an alleged culprit. With regards to measuring the latency, I thought of using ftrace, specifically the preemptirqsoff latency histogram. Unfortunately, I wasn't able to capture any major differences between running a guest with in-kernel MPIC emulation (with the openpic raw_spinlock_conversion applied) vs. no in-kernel MPIC emulation. Function profiling (trace_stat) shows that in the second case there's a far greater time spent in kvm_handle_exit (100x), but overall, the maximum latencies for preemptirqsoff don't look that much different. Here are the max numbers (preemptirqsoff) for the 24 CPUs, on the host RT Linux, sorted in descending order, expressed in microseconds: In-kernel MPICQEMU MPIC 3975 5105 What are you measuring? Latency in the host, or in the guest? -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 27.02.2015 03:05, Scott Wood wrote: On Thu, 2015-02-26 at 14:31 +0100, Sebastian Andrzej Siewior wrote: On 02/26/2015 02:02 PM, Paolo Bonzini wrote: On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the can be applied *but* makes no difference if applied or not. answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). The problem, that is fixed by this s/spin_lock/raw_spin_lock/, exists only in -RT. There is no change upstream. In general we fix such things in -RT first and forward the patches upstream if possible. This convert thingy would be possible. Bug fixing comes before latency no matter if RT or not. Converting every lock into a rawlock is not always the answer. Last thing I read from Scott is that he is not entirely sure if this is the right approach or not and patch #1 was not acked-by him either. So for now I wait for Scott's feedback and maybe a backtrace :) Obviously leaving it in a buggy state is not what we want -- but I lean towards a short term fix of putting depends on !PREEMPT_RT on the in-kernel MPIC emulation (which is itself just an optimization -- you can still use KVM without it). This way people don't enable it with RT without being aware of the issue, and there's more of an incentive to fix it properly. I'll let Bogdan supply the backtrace. So about the backtrace. Wasn't really sure how to catch this, so what I did was to start a 24 VCPUs guest on a 24 CPU board, and in the guest run 24 netperf flows with an external back to back board of the same kind. I assumed this would provide the sufficient VCPUs and external interrupt to expose an alleged culprit. With regards to measuring the latency, I thought of using ftrace, specifically the preemptirqsoff latency histogram. Unfortunately, I wasn't able to capture any major differences between running a guest with in-kernel MPIC emulation (with the openpic raw_spinlock conversion applied) vs. no in-kernel MPIC emulation. Function profiling (trace_stat) shows that in the second case there's a far greater time spent in kvm_handle_exit (100x), but overall, the maximum latencies for preemptirqsoff don't look that much different. Here are the max numbers (preemptirqsoff) for the 24 CPUs, on the host RT Linux, sorted in descending order, expressed in microseconds:
In-kernel MPIC    QEMU MPIC
3975              5105
2079              3972
1303              3557
1106              1725
 447               907
 423               853
 362               723
 343               182
 260               121
 133               116
 131               116
 118               115
 116               114
 114               114
 114               114
 114                99
 113                99
 103                98
  98                98
  95                97
  87                96
  83                83
  83                82
  80                81
I'm not sure if this captures openpic behavior or just scheduler behavior. Anyways, I'm pro adding the openpic raw_spinlock conversion along with disabling the in-kernel MPIC emulation for upstream. But just wanted to catch up with this last request from a while ago. Do you think it would be better to just submit the new patch or should I do some further testing?
Do you have any suggestions regarding what else I should look at / how to test? Thank you, Bogdan P. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 27/02/2015 02:05, Scott Wood wrote: Obviously leaving it in a buggy state is not what we want -- but I lean towards a short term fix of putting depends on !PREEMPT_RT on the in-kernel MPIC emulation (which is itself just an optimization -- you can still use KVM without it). This way people don't enable it with RT without being aware of the issue, and there's more of an incentive to fix it properly. That would indeed work for me. Paolo ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 02/26/2015 02:02 PM, Paolo Bonzini wrote: On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the can be applied *but* makes no difference if applied or not. answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). The problem, that is fixed by this s/spin_lock/raw_spin_lock/, exists only in -RT. There is no change upstream. In general we fix such things in -RT first and forward the patches upstream if possible. This convert thingy would be possible. Bug fixing comes before latency no matter if RT or not. Converting every lock into a rawlock is not always the answer. Last thing I read from Scott is that he is not entirely sure if this is the right approach or not and patch #1 was not acked-by him either. So for now I wait for Scott's feedback and maybe a backtrace :) Paolo Sebastian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). Paolo The vcpu limits are a temporary bandaid to avoid the worst latencies, but I'm still skeptical about this being upstream material. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On Thu, 2015-02-26 at 14:31 +0100, Sebastian Andrzej Siewior wrote: On 02/26/2015 02:02 PM, Paolo Bonzini wrote: On 24/02/2015 00:27, Scott Wood wrote: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The question is what behavior is wanted of code that isn't quite RT-ready. What is preferred, bugs or bad latency? If the answer is bad latency (which can be avoided simply by not running KVM on a RT kernel in production), patch 1 can be applied. If the can be applied *but* makes no difference if applied or not. answer is bugs, patch 1 is not upstream material. I myself prefer to have bad latency; if something takes a spinlock in atomic context, that spinlock should be raw. If it hurts (latency), don't do it (use the affected code). The problem, that is fixed by this s/spin_lock/raw_spin_lock/, exists only in -RT. There is no change upstream. In general we fix such things in -RT first and forward the patches upstream if possible. This convert thingy would be possible. Bug fixing comes before latency no matter if RT or not. Converting every lock into a rawlock is not always the answer. Last thing I read from Scott is that he is not entirely sure if this is the right approach or not and patch #1 was not acked-by him either. So for now I wait for Scott's feedback and maybe a backtrace :) Obviously leaving it in a buggy state is not what we want -- but I lean towards a short term fix of putting depends on !PREEMPT_RT on the in-kernel MPIC emulation (which is itself just an optimization -- you can still use KVM without it). This way people don't enable it with RT without being aware of the issue, and there's more of an incentive to fix it properly. I'll let Bogdan supply the backtrace. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
* Scott Wood | 2015-02-23 17:27:31 [-0600]: This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The vcpu limits are a temporary bandaid to avoid the worst latencies, but I'm still skeptical about this being upstream material. Okay. -Scott Sebastian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 20.02.2015 17:17, Sebastian Andrzej Siewior wrote: On 02/20/2015 04:10 PM, Paolo Bonzini wrote: On 20/02/2015 16:06, Sebastian Andrzej Siewior wrote: On 02/20/2015 03:57 PM, Paolo Bonzini wrote: Yes, but large latencies just mean the code has to be rewritten (x86 doesn't do event injection in an atomic region anymore, for example). Until it is, using raw_spin_lock is correct. It does not sound like it. It sounds more like disabling interrupts to get things to run faster and then limit it in a different corner to not blow up everything. This patchset enables running KVM SMP guests with external interrupts on an underlying RT-enabled Linux. Prior to this patchset, a guest with in-kernel MPIC emulation could easily panic the kernel due to preemption when delivering IPIs and external interrupts, because of the openpic spinlock becoming a sleeping mutex on PREEMPT_RT_FULL Linux. Max latencies were decreased - Max latency (us): 70 vs. 62 - and that is why this is done? For 8 us and possible DoS in case there are too many cpus? My understanding is that: 1) netperf can get you a BUG in KVM, and raw_spinlock fixes that Actually, it's not just netperf. The bug triggers in the following scenarios:
- running a CPU intensive task (while true; do date; done) in an SMP guest (even with 2 VCPUs)
- running netperf in the guest
- running cyclictest in an SMP guest
May I please see a backtrace with context tracking which states where the interrupts / preemption gets disabled and where the lock was taken? Will do, I will get back to you as soon as I have it available. I will try and capture it using function trace. I'm not totally against this patch, I just want to make sure this is not a blind raw conversion to shut up the warning the kernel throws. 2) cyclictest did not trigger the BUG, and you can also get reduced latency from using raw_spinlock. I think we agree that (2) is not a factor in accepting the patch. good :) Paolo Sebastian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On Fri, 2015-02-20 at 15:54 +0100, Sebastian Andrzej Siewior wrote: On 02/20/2015 03:12 PM, Paolo Bonzini wrote: Thomas, what is the usual approach for patches like this? Do you take them into your rt tree or should they get integrated upstream? Patch 1 is definitely suitable for upstream, that's the reason why we have raw_spin_lock vs. spin_lock. Raw spinlocks were introduced in c2f21ce2e31286a0a32 (locking: Implement new raw_spinlock). They are used in contexts which run with IRQs off - especially on -RT. This usually includes interrupt controllers and related core-code pieces. Usually you see scheduling while atomic on -RT and convert them to raw locks if it is appropriate. Bogdan wrote in 2/2 that he needs to limit the number of CPUs in order not to cause a DoS and large latencies in the host. I haven't seen an answer to my why question. Because if the conversion leads to large latencies in the host then it does not look right. Each host PIC has a rawlock and does mostly just mask/unmask and the raw lock makes sure the value written is not mixed up due to preemption. This hardly increases latencies because the locked path is very short. If this conversion leads to higher latencies then the locked path is too long and hardly suitable to become a rawlock. This isn't a host PIC driver. It's guest PIC emulation, some of which is indeed not suitable for a rawlock (in particular, openpic_update_irq which loops on the number of vcpus, with a loop body that calls IRQ_check() which loops over all pending IRQs). The vcpu limits are a temporary bandaid to avoid the worst latencies, but I'm still skeptical about this being upstream material. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
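For contrast with the guest PIC emulation discussed above, the kind of locked section Sebastian describes for a host interrupt controller driver typically looks like the generic illustration below; this is not any particular in-tree driver, and the register offset and layout are made up:

/* Generic host irqchip mask operation: a couple of bounded register
 * accesses under the raw lock, with no loops over CPUs or pending IRQs.
 * All names and offsets are hypothetical. */
#include <linux/spinlock.h>
#include <linux/io.h>
#include <linux/bits.h>

struct host_pic {
        raw_spinlock_t lock;
        void __iomem *regs;
};

#define HOST_PIC_MASK_REG 0x10  /* hypothetical register offset */

static void host_pic_mask_irq(struct host_pic *pic, unsigned int hwirq)
{
        unsigned long flags;
        u32 mask;

        raw_spin_lock_irqsave(&pic->lock, flags);
        mask = readl(pic->regs + HOST_PIC_MASK_REG);
        writel(mask | BIT(hwirq), pic->regs + HOST_PIC_MASK_REG);
        raw_spin_unlock_irqrestore(&pic->lock, flags);
}

The point of the contrast is that the emulated MPIC holds its lock across openpic_update_irq() and IRQ_check(), whose cost grows with the number of VCPUs and pending sources, rather than across a couple of bounded register accesses.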
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 20.02.2015 17:06, Sebastian Andrzej Siewior wrote: On 02/20/2015 03:57 PM, Paolo Bonzini wrote: On 20/02/2015 15:54, Sebastian Andrzej Siewior wrote: Usually you see scheduling while atomic on -RT and convert them to raw locks if it is appropriate. Bogdan wrote in 2/2 that he needs to limit the number of CPUs in order not to cause a DoS and large latencies in the host. I haven't seen an answer to my why question. Because if the conversion leads to large latencies in the host then it does not look right. Each host PIC has a rawlock and does mostly just mask/unmask and the raw lock makes sure the value written is not mixed up due to preemption. This hardly increases latencies because the locked path is very short. If this conversion leads to higher latencies then the locked path is too long and hardly suitable to become a rawlock. Yes, but large latencies just mean the code has to be rewritten (x86 doesn't do event injection in an atomic region anymore, for example). Until it is, using raw_spin_lock is correct. It does not sound like it. It sounds more like disabling interrupts to get things to run faster and then limit it in a different corner to not blow up everything. Max latencies were decreased - Max latency (us): 70 vs. 62 - and that is why this is done? For 8 us and possible DoS in case there are too many cpus? The main reason for this patch was to enable KVM guests to run on RT hosts in certain scenarios, such as delivering external interrupts to the guest and the guest being SMP. The cyclictest measurements were just a sanity check to make sure the latencies don't get messed up too badly, albeit in a light scenario (guest with 1 VCPU), for a use case where the guest is not SMP and doesn't have any external interrupts delivered. This latter scenario works even without the kvm openpic being a raw_spinlock. Prior to this patch, KVM was indeed blowing up on guest_enter [1], and making the openpic lock a raw_spinlock fixes that, without causing too much cyclictest damage when the guest doesn't have many VCPUs. I had a discussion with Scott Wood a while ago regarding delivering external interrupts to the guest, and he mentioned that the correct solution was to rework the entire interrupt delivery mechanism into multiple lock domains, minimize the code on the EPR path and the locking involved. Until that can be achieved, converting the openpic lock to a raw_spinlock would be acceptable, as long as we keep the number of guest VCPUs small, so as to not cause big host latencies. [1] http://lxr.free-electrons.com/source/include/linux/kvm_host.h#L762 Best regards, Bogdan P. Paolo Sebastian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 20.02.2015 16:54, Sebastian Andrzej Siewior wrote: On 02/20/2015 03:12 PM, Paolo Bonzini wrote: Thomas, what is the usual approach for patches like this? Do you take them into your rt tree or should they get integrated upstream? Patch 1 is definitely suitable for upstream; that's the reason why we have raw_spin_lock vs. spin_lock. raw_spin_lock was introduced in c2f21ce2e31286a0a32 (locking: Implement new raw_spinlock). They are used in contexts which run with IRQs off - especially on -RT. This usually includes interrupt controllers and related core-code pieces. Usually you see "scheduling while atomic" on -RT and convert the locks to raw locks if it is appropriate. Bogdan wrote in 2/2 that he needs to limit the number of CPUs in order not to cause a DoS and large latencies in the host. I haven't seen an answer to my why question. Because if the conversion leads to large latencies in the host then it does not look right. What I did notice were bad cyclictest results when run in a guest with 24 VCPUs. There were 24 netperf flows running in the guest. The max cyclictest latencies got up to 15 ms in the guest; however, I haven't captured any host-side information related to preempt/irqs-off statistics. What I was planning to do over the past few days was to rerun the test and come up with the host preempt/irqs-off statistics (mainly the max latency), so I could have a more reliable argument. I haven't had the time or the setup to do that yet, and will come back with this as soon as I have the numbers available. Each host PIC has a raw lock and does mostly just mask/unmask, and the raw lock makes sure the value written is not mixed up due to preemption. This hardly increases latencies because the locked path is very short. If this conversion leads to higher latencies then the locked path is too long and hardly suitable to become a raw lock. From my understanding, the kvm openpic emulation code does more than just that - it needs to be atomic with respect to interrupt delivery. This might mean that the bad cyclictest max latencies visible from the guest side (15 ms) may also correspond to how long that raw spinlock is held, leading to an unresponsive host. Best regards, Bogdan P. Paolo Sebastian
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 20/02/2015 14:45, Alexander Graf wrote: On 18.02.15 10:32, Bogdan Purcareata wrote: This patchset enables running KVM SMP guests with external interrupts on an underlying RT-enabled Linux. Prior to this patch, a guest with in-kernel MPIC emulation could easily panic the kernel due to preemption when delivering IPIs and external interrupts, because of the openpic spinlock becoming a sleeping mutex on PREEMPT_RT_FULL Linux. 0001: converts the openpic spinlock to a raw spinlock, in order to circumvent this behavior. While this change is targeted at an RT-enabled Linux, it has no effect on upstream kvm-ppc, so it is sent upstream for better future maintenance. 0002: introduces a limit on the maximum VCPUs a guest can have, in order to prevent a potential DoS attack due to large system latencies. This patch is targeted at RT (due to CONFIG_PREEMPT_RT_FULL), but it can also be applied on upstream Linux, with no effect. Not sure if it's best to send it upstream and have a hanging CONFIG_PREEMPT_RT_FULL check there, with no effect, or send it against linux-stable-rt. Please apply as you consider appropriate. Thomas, what is the usual approach for patches like this? Do you take them into your rt tree or should they get integrated upstream? Patch 1 is definitely suitable for upstream; that's the reason why we have raw_spin_lock vs. spin_lock. Paolo
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 18.02.15 10:32, Bogdan Purcareata wrote: This patchset enables running KVM SMP guests with external interrupts on an underlying RT-enabled Linux. Prior to this patch, a guest with in-kernel MPIC emulation could easily panic the kernel due to preemption when delivering IPIs and external interrupts, because of the openpic spinlock becoming a sleeping mutex on PREEMPT_RT_FULL Linux. 0001: converts the openpic spinlock to a raw spinlock, in order to circumvent this behavior. While this change is targeted at an RT-enabled Linux, it has no effect on upstream kvm-ppc, so it is sent upstream for better future maintenance. 0002: introduces a limit on the maximum VCPUs a guest can have, in order to prevent a potential DoS attack due to large system latencies. This patch is targeted at RT (due to CONFIG_PREEMPT_RT_FULL), but it can also be applied on upstream Linux, with no effect. Not sure if it's best to send it upstream and have a hanging CONFIG_PREEMPT_RT_FULL check there, with no effect, or send it against linux-stable-rt. Please apply as you consider appropriate. Thomas, what is the usual approach for patches like this? Do you take them into your rt tree or should they get integrated upstream? Alex
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 20.02.15 15:12, Paolo Bonzini wrote: On 20/02/2015 14:45, Alexander Graf wrote: On 18.02.15 10:32, Bogdan Purcareata wrote: This patchset enables running KVM SMP guests with external interrupts on an underlying RT-enabled Linux. Prior to this patch, a guest with in-kernel MPIC emulation could easily panic the kernel due to preemption when delivering IPIs and external interrupts, because of the openpic spinlock becoming a sleeping mutex on PREEMPT_RT_FULL Linux. 0001: converts the openpic spinlock to a raw spinlock, in order to circumvent this behavior. While this change is targeted at an RT-enabled Linux, it has no effect on upstream kvm-ppc, so it is sent upstream for better future maintenance. 0002: introduces a limit on the maximum VCPUs a guest can have, in order to prevent a potential DoS attack due to large system latencies. This patch is targeted at RT (due to CONFIG_PREEMPT_RT_FULL), but it can also be applied on upstream Linux, with no effect. Not sure if it's best to send it upstream and have a hanging CONFIG_PREEMPT_RT_FULL check there, with no effect, or send it against linux-stable-rt. Please apply as you consider appropriate. Thomas, what is the usual approach for patches like this? Do you take them into your rt tree or should they get integrated upstream? Patch 1 is definitely suitable for upstream; that's the reason why we have raw_spin_lock vs. spin_lock. I see, perfect :). Bogdan, please resend patch 1 with CC to kvm-ppc@vger so that I can pick it up from patchworks. Alex
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 02/20/2015 03:12 PM, Paolo Bonzini wrote: Thomas, what is the usual approach for patches like this? Do you take them into your rt tree or should they get integrated upstream? Patch 1 is definitely suitable for upstream; that's the reason why we have raw_spin_lock vs. spin_lock. raw_spin_lock was introduced in c2f21ce2e31286a0a32 (locking: Implement new raw_spinlock). They are used in contexts which run with IRQs off - especially on -RT. This usually includes interrupt controllers and related core-code pieces. Usually you see "scheduling while atomic" on -RT and convert the locks to raw locks if it is appropriate. Bogdan wrote in 2/2 that he needs to limit the number of CPUs in order not to cause a DoS and large latencies in the host. I haven't seen an answer to my why question. Because if the conversion leads to large latencies in the host then it does not look right. Each host PIC has a raw lock and does mostly just mask/unmask, and the raw lock makes sure the value written is not mixed up due to preemption. This hardly increases latencies because the locked path is very short. If this conversion leads to higher latencies then the locked path is too long and hardly suitable to become a raw lock. Paolo Sebastian
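For contrast with the guest-PIC emulation discussed elsewhere in the thread, the host-PIC pattern Sebastian has in mind looks roughly like the sketch below - a hypothetical irqchip mask handler, not any specific in-tree driver - where the raw lock only guards a short read-modify-write of a mask register:

#include <linux/bits.h>
#include <linux/io.h>
#include <linux/irq.h>
#include <linux/spinlock.h>

struct demo_pic {
	raw_spinlock_t lock;
	void __iomem *mask_reg;
};

/* The IRQs-off window is a handful of register accesses, so using a
 * raw_spinlock_t here adds essentially nothing to -RT latencies.
 * (Assumes all hwirqs of this chip fit in one 32-bit mask register.) */
static void demo_pic_mask_irq(struct irq_data *d)
{
	struct demo_pic *pic = irq_data_get_irq_chip_data(d);
	unsigned long flags;
	u32 val;

	raw_spin_lock_irqsave(&pic->lock, flags);
	val = readl(pic->mask_reg);
	writel(val | BIT(irqd_to_hwirq(d)), pic->mask_reg);
	raw_spin_unlock_irqrestore(&pic->lock, flags);
}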
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 20/02/2015 15:54, Sebastian Andrzej Siewior wrote: Usually you see "scheduling while atomic" on -RT and convert the locks to raw locks if it is appropriate. Bogdan wrote in 2/2 that he needs to limit the number of CPUs in order not to cause a DoS and large latencies in the host. I haven't seen an answer to my why question. Because if the conversion leads to large latencies in the host then it does not look right. Each host PIC has a raw lock and does mostly just mask/unmask, and the raw lock makes sure the value written is not mixed up due to preemption. This hardly increases latencies because the locked path is very short. If this conversion leads to higher latencies then the locked path is too long and hardly suitable to become a raw lock. Yes, but large latencies just mean the code has to be rewritten (x86 doesn't do event injection in atomic regions anymore, for example). Until it is, using raw_spin_lock is correct. Paolo
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 02/20/2015 03:57 PM, Paolo Bonzini wrote: On 20/02/2015 15:54, Sebastian Andrzej Siewior wrote: Usually you see "scheduling while atomic" on -RT and convert the locks to raw locks if it is appropriate. Bogdan wrote in 2/2 that he needs to limit the number of CPUs in order not to cause a DoS and large latencies in the host. I haven't seen an answer to my why question. Because if the conversion leads to large latencies in the host then it does not look right. Each host PIC has a raw lock and does mostly just mask/unmask, and the raw lock makes sure the value written is not mixed up due to preemption. This hardly increases latencies because the locked path is very short. If this conversion leads to higher latencies then the locked path is too long and hardly suitable to become a raw lock. Yes, but large latencies just mean the code has to be rewritten (x86 doesn't do event injection in atomic regions anymore, for example). Until it is, using raw_spin_lock is correct. It does not sound like it. It sounds more like disabling interrupts to get things to run faster and then limiting it in a different corner to not blow everything up. Max latency was decreased (Max latency (us): 7062) and that is why this is done? For 8 us, and a possible DoS in case there are too many cpus? Paolo Sebastian
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 02/20/2015 04:10 PM, Paolo Bonzini wrote: On 20/02/2015 16:06, Sebastian Andrzej Siewior wrote: On 02/20/2015 03:57 PM, Paolo Bonzini wrote: Yes, but large latencies just mean the code has to be rewritten (x86 doesn't do event injection in atomic regions anymore, for example). Until it is, using raw_spin_lock is correct. It does not sound like it. It sounds more like disabling interrupts to get things to run faster and then limiting it in a different corner to not blow everything up. This patchset enables running KVM SMP guests with external interrupts on an underlying RT-enabled Linux. Prior to this patch, a guest with in-kernel MPIC emulation could easily panic the kernel due to preemption when delivering IPIs and external interrupts, because of the openpic spinlock becoming a sleeping mutex on PREEMPT_RT_FULL Linux. Max latency was decreased (Max latency (us): 7062) and that is why this is done? For 8 us, and a possible DoS in case there are too many cpus? My understanding is that: 1) netperf can get you a BUG in KVM, and raw_spinlock fixes that May I please see a backtrace with context tracking which states where the interrupts / preemption gets disabled and where the lock was taken? I'm not totally against this patch; I just want to make sure this is not a blind raw conversion to shut up the warning the kernel throws. 2) cyclictest did not trigger the BUG, and you can also get reduced latency from using raw_spinlock. I think we agree that (2) is not a factor in accepting the patch. good :) Paolo Sebastian
Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
On 20/02/2015 16:06, Sebastian Andrzej Siewior wrote: On 02/20/2015 03:57 PM, Paolo Bonzini wrote: Yes, but large latencies just mean the code has to be rewritten (x86 doesn't do event injection in atomic regions anymore, for example). Until it is, using raw_spin_lock is correct. It does not sound like it. It sounds more like disabling interrupts to get things to run faster and then limiting it in a different corner to not blow everything up. This patchset enables running KVM SMP guests with external interrupts on an underlying RT-enabled Linux. Prior to this patch, a guest with in-kernel MPIC emulation could easily panic the kernel due to preemption when delivering IPIs and external interrupts, because of the openpic spinlock becoming a sleeping mutex on PREEMPT_RT_FULL Linux. Max latency was decreased (Max latency (us): 7062) and that is why this is done? For 8 us, and a possible DoS in case there are too many cpus? My understanding is that: 1) netperf can get you a BUG in KVM, and raw_spinlock fixes that 2) cyclictest did not trigger the BUG, and you can also get reduced latency from using raw_spinlock. I think we agree that (2) is not a factor in accepting the patch. Paolo Paolo Sebastian
[PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
This patchset enables running KVM SMP guests with external interrupts on an underlying RT-enabled Linux. Prior to this patch, a guest with in-kernel MPIC emulation could easily panic the kernel due to preemption when delivering IPIs and external interrupts, because of the openpic spinlock becoming a sleeping mutex on PREEMPT_RT_FULL Linux. 0001: converts the openpic spinlock to a raw spinlock, in order to circumvent this behavior. While this change is targeted at an RT-enabled Linux, it has no effect on upstream kvm-ppc, so it is sent upstream for better future maintenance. 0002: introduces a limit on the maximum VCPUs a guest can have, in order to prevent a potential DoS attack due to large system latencies. This patch is targeted at RT (due to CONFIG_PREEMPT_RT_FULL), but it can also be applied on upstream Linux, with no effect. Not sure if it's best to send it upstream and have a hanging CONFIG_PREEMPT_RT_FULL check there, with no effect, or send it against linux-stable-rt. Please apply as you consider appropriate. - applied and compiled against upstream 3.19 - applied and compiled against stable-rt 3.14-rt (0002 with minor fuzz) Bogdan Purcareata (2): powerpc/kvm: Convert openpic lock to raw_spinlock powerpc/kvm: Limit MAX_VCPUS for guests running on RT Linux arch/powerpc/include/asm/kvm_host.h | 6 + arch/powerpc/kvm/mpic.c | 44 ++--- 2 files changed, 28 insertions(+), 22 deletions(-) -- 2.1.4
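For context, patch 0002 amounts to capping the number of VCPUs a guest may have when the kernel is built with CONFIG_PREEMPT_RT_FULL, so that the per-interrupt vcpu scan done under the raw openpic lock stays bounded. A minimal sketch of the idea - the cap value of 4 and the exact placement in kvm_host.h are illustrative assumptions, not necessarily what the patch itself uses:

/* arch/powerpc/include/asm/kvm_host.h (sketch, not the literal patch) */
#ifdef CONFIG_PREEMPT_RT_FULL
/* Keep the O(vcpus) work done under the raw openpic lock short enough
 * that RT host latencies stay bounded. Cap value is illustrative. */
#define KVM_MAX_VCPUS		4
#else
#define KVM_MAX_VCPUS		NR_CPUS
#endif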