Re: [Xen-devel] [PATCH for-4.5] xen/arm: Fix virtual timer on ARMv8 Model

2014-11-28 Thread Ian Campbell
On Thu, 2014-11-27 at 18:02 +, Julien Grall wrote:
 state at the GIC level. This would also avoid masking the output signal
 and requires specific handling in the guest OS.

which requires?

It doesn't seem quite right to me otherwise, since context switching the
virq state *removes* the need to have the guest do anything other than
what it would do on native.

Assuming this is what you meant I propose (fixing some grammar etc as I
go):

xen/arm: Handle platforms with edge-triggered virtual timer

Some platforms (such as the ARMv8 model) use an edge-triggered interrupt
for the virtual timer. Even if the timer output signal is masked in the
context switch, the GIC will keep track of any interrupts raised
while IRQs are disabled. As soon as IRQs are re-enabled, the virtual
timer interrupt will be injected into Xen.

If an idle VCPU was scheduled next then the interrupt handler doesn't
expect to receive the IRQ and will crash:

(XEN)[00228388] _spin_lock_irqsave+0x28/0x94 (PC)
(XEN)[00228380] _spin_lock_irqsave+0x20/0x94 (LR)
(XEN)[00250510] vgic_vcpu_inject_irq+0x40/0x1b0
(XEN)[0024bcd0] vtimer_interrupt+0x4c/0x54
(XEN)[00247010] do_IRQ+0x1a4/0x220
(XEN)[00244864] gic_interrupt+0x50/0xec
(XEN)[0024fbac] do_trap_irq+0x20/0x2c
(XEN)[00255240] hyp_irq+0x5c/0x60
(XEN)[00241084] context_switch+0xb8/0xc4
(XEN)[0022482c] schedule+0x684/0x6d0
(XEN)[0022785c] __do_softirq+0xcc/0xe8
(XEN)[002278d4] do_softirq+0x14/0x1c
(XEN)[00240fac] idle_loop+0x134/0x154
(XEN)[0024c160] start_secondary+0x14c/0x15c
(XEN)[0001] 0001

The proper solution is to context switch the virtual interrupt state at
the GIC level. This would also avoid masking the output signal which
requires specific handling in the guest OS and more complex code in Xen
to deal with EOIs, and so is desirable for that reason too. 

Sadly, this solution requires some refactoring which would not be
suitable for a freeze exception for the Xen 4.5 release.

For now implement a temporary solution which ignores the virtual timer
interrupt when the idle VCPU is running.
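
As a rough illustration of the temporary solution described above, the
guard can be modelled outside Xen as follows. This is only a minimal
sketch, not the actual patch: struct model_vcpu and
model_vtimer_interrupt() are hypothetical stand-ins for Xen's vCPU
structure and for vtimer_interrupt(), and the real handler also masks the
timer and goes through the vGIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Xen's struct vcpu. */
struct model_vcpu {
    bool is_idle;          /* true for the idle vCPU, which has no vGIC */
    unsigned int injected; /* interrupts forwarded to this vCPU so far */
};

/*
 * Model of the virtual timer handler with the temporary guard.
 * Returns true if the interrupt was forwarded, false if it was ignored.
 */
static bool model_vtimer_interrupt(struct model_vcpu *cur)
{
    assert(cur != NULL);

    /* Nothing to inject into: drop the unexpected interrupt. */
    if ( cur->is_idle )
        return false;

    cur->injected++;
    return true;
}
```

Calling model_vtimer_interrupt() on a vCPU with is_idle set leaves its
injected count untouched, which mirrors the early return in the patch.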


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] xen/arm: Fix virtual timer on ARMv8 Model

2014-11-27 Thread Ian Campbell
On Tue, 2014-11-25 at 17:44 +, Julien Grall wrote:
 ARMv8 model may not disable correctly the timer interrupt when Xen

correctly disable

 context switch to an idle vCPU. Therefore Xen may receive a spurious

context switches and s/spurious/unexpected/ (since spurious has a
specific meaning in the h/w which does not match what is happening here)

 timer interrupt. As the idle domain doesn't have vGIC, Xen will crash
 when trying to inject the interrupt with the following stack trace.
 
 (XEN)[00228388] _spin_lock_irqsave+0x28/0x94 (PC)
 (XEN)[00228380] _spin_lock_irqsave+0x20/0x94 (LR)
 (XEN)[00250510] vgic_vcpu_inject_irq+0x40/0x1b0
 (XEN)[0024bcd0] vtimer_interrupt+0x4c/0x54
 (XEN)[00247010] do_IRQ+0x1a4/0x220
 (XEN)[00244864] gic_interrupt+0x50/0xec
 (XEN)[0024fbac] do_trap_irq+0x20/0x2c
 (XEN)[00255240] hyp_irq+0x5c/0x60
 (XEN)[00241084] context_switch+0xb8/0xc4
 (XEN)[0022482c] schedule+0x684/0x6d0
 (XEN)[0022785c] __do_softirq+0xcc/0xe8
 (XEN)[002278d4] do_softirq+0x14/0x1c
 (XEN)[00240fac] idle_loop+0x134/0x154
 (XEN)[0024c160] start_secondary+0x14c/0x15c
 (XEN)[0001] 0001
 
 While we may receive a spurious virtual timer interrupt, it can be safely
 ignored for the time being. A proper fix needs to be found for Xen 4.6.
 
 Signed-off-by: Julien Grall julien.gr...@linaro.org

Acked-by: Ian Campbell ian.campb...@citrix.com

Although I wonder if we should log, perhaps rate limited or only once.
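
For reference, a "log only once" guard could look roughly like the sketch
below. It is a standalone illustration with hypothetical names (struct
once_log, once_log_should_print()), not an existing Xen helper; the
suppressed counter is an extra assumption so that the dropped messages can
still be reported later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for a message that should be printed only once. */
struct once_log {
    bool warned;              /* set after the first message */
    unsigned long suppressed; /* number of later messages dropped */
};

/* Returns true only for the first call on a given state. */
static bool once_log_should_print(struct once_log *s)
{
    assert(s != NULL);

    if ( s->warned )
    {
        s->suppressed++;
        return false;
    }

    s->warned = true;
    return true;
}
```

The handler would print its warning only when once_log_should_print()
returns true, so repeated unexpected interrupts cannot flood the console.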

Also, I've some grammar nits (above and below) which I can fix on commit
if there is no resend...

 
 ---
 
 This patch is a bug fix candidate for Xen 4.5. Any ARMv8 model may
 randomly crash when running Xen.

CCing Konrad.

 This patch doesn't inject the virtual timer interrupt if the current VCPU
 is the idle one. Entering this function with the idle VCPU is already
 a bug in itself. For now, I think this patch is the safest way to resolve
 the problem.
 
 Meanwhile, I'm investigating with ARM to see whether the bug comes from
 Xen or the model.
 ---
  xen/arch/arm/time.c | 8 
  1 file changed, 8 insertions(+)
 
 diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
 index a6436f1..83c74cb 100644
 --- a/xen/arch/arm/time.c
 +++ b/xen/arch/arm/time.c
 @@ -169,6 +169,14 @@ static void timer_interrupt(int irq, void *dev_id, 
 struct cpu_user_regs *regs)
  
  static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs 
 *regs)
  {
 +/*
 + * ARMv8 model may not disable correctly the timer interrupt when

correctly disable

 + * Xen context switch to an idle vCPU. Therefore Xen may receive

context switches and may receive an unexpected timer interrupt

 + * timer interrupt.
 + */
 +if ( is_idle_vcpu(current) )
 +return;
 +
  current->arch.virt_timer.ctl = READ_SYSREG32(CNTV_CTL_EL0);
  WRITE_SYSREG32(current->arch.virt_timer.ctl | CNTx_CTL_MASK, CNTV_CTL_EL0);
  vgic_vcpu_inject_irq(current, current->arch.virt_timer.irq);





Re: [Xen-devel] [PATCH for-4.5] xen/arm: Fix virtual timer on ARMv8 Model

2014-11-27 Thread Stefano Stabellini
On Thu, 27 Nov 2014, Ian Campbell wrote:
 On Tue, 2014-11-25 at 17:44 +, Julien Grall wrote:
  ARMv8 model may not disable correctly the timer interrupt when Xen
 
  correctly disable
 
  context switch to an idle vCPU. Therefore Xen may receive a spurious
 
 context switches and s/spurious/unexpected/ (since spurious has a
 specific meaning in the h/w which does not match what is happening here)
 
  timer interrupt. As the idle domain doesn't have vGIC, Xen will crash
  when trying to inject the interrupt with the following stack trace.
  
  (XEN)[00228388] _spin_lock_irqsave+0x28/0x94 (PC)
  (XEN)[00228380] _spin_lock_irqsave+0x20/0x94 (LR)
  (XEN)[00250510] vgic_vcpu_inject_irq+0x40/0x1b0
  (XEN)[0024bcd0] vtimer_interrupt+0x4c/0x54
  (XEN)[00247010] do_IRQ+0x1a4/0x220
  (XEN)[00244864] gic_interrupt+0x50/0xec
  (XEN)[0024fbac] do_trap_irq+0x20/0x2c
  (XEN)[00255240] hyp_irq+0x5c/0x60
  (XEN)[00241084] context_switch+0xb8/0xc4
  (XEN)[0022482c] schedule+0x684/0x6d0
  (XEN)[0022785c] __do_softirq+0xcc/0xe8
  (XEN)[002278d4] do_softirq+0x14/0x1c
  (XEN)[00240fac] idle_loop+0x134/0x154
  (XEN)[0024c160] start_secondary+0x14c/0x15c
  (XEN)[0001] 0001
  
  While we may receive a spurious virtual timer interrupt, it can be safely
  ignored for the time being. A proper fix needs to be found for Xen 4.6.
  
  Signed-off-by: Julien Grall julien.gr...@linaro.org
 
 Acked-by: Ian Campbell ian.campb...@citrix.com
 
 Although I wonder if we should log, perhaps rate limited or only once.
 
 Also, I've some grammar nits (above and below) which I can fix on commit
 if there is no resend...
 
  
  ---
  
  This patch is a bug fix candidate for Xen 4.5. Any ARMv8 model may
  randomly crash when running Xen.
 
 CCing Konrad.
 
  This patch doesn't inject the virtual timer interrupt if the current VCPU
  is the idle one. Entering this function with the idle VCPU is already
  a bug in itself. For now, I think this patch is the safest way to resolve
  the problem.
  
  Meanwhile, I'm investigating with ARM to see whether the bug comes from
  Xen or the model.

It is worth noting that there are no bad side effects of this change:
the vtimer_interrupt is always supposed to be received on non-idle
domains. As Julien wrote, the fact that we are receiving a
vtimer_interrupt in the idle_domain is a bug, one that probably comes
from the ARM model not emulating hardware correctly.


   xen/arch/arm/time.c | 8 
   1 file changed, 8 insertions(+)
  
  diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
  index a6436f1..83c74cb 100644
  --- a/xen/arch/arm/time.c
  +++ b/xen/arch/arm/time.c
  @@ -169,6 +169,14 @@ static void timer_interrupt(int irq, void *dev_id, 
  struct cpu_user_regs *regs)
   
   static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs 
  *regs)
   {
  +/*
  + * ARMv8 model may not disable correctly the timer interrupt when
 
 correctly disable
 
  + * Xen context switch to an idle vCPU. Therefore Xen may receive
 
 context switches and may receive an unexpected timer interrupt
 
  + * timer interrupt.
  + */
  +if ( is_idle_vcpu(current) )
  +return;
  +
   current->arch.virt_timer.ctl = READ_SYSREG32(CNTV_CTL_EL0);
   WRITE_SYSREG32(current->arch.virt_timer.ctl | CNTx_CTL_MASK, CNTV_CTL_EL0);
   vgic_vcpu_inject_irq(current, current->arch.virt_timer.irq);
 
 



Re: [Xen-devel] [PATCH for-4.5] xen/arm: Fix virtual timer on ARMv8 Model

2014-11-27 Thread Julien Grall
Hi Stefano,

On 27/11/14 10:51, Stefano Stabellini wrote:
 On Thu, 27 Nov 2014, Ian Campbell wrote:
 On Tue, 2014-11-25 at 17:44 +, Julien Grall wrote:
 ARMv8 model may not disable correctly the timer interrupt when Xen

 correctly disable

 context switch to an idle vCPU. Therefore Xen may receive a spurious

 context switches and s/spurious/unexpected/ (since spurious has a
 specific meaning in the h/w which does not match what is happening here)

 timer interrupt. As the idle domain doesn't have vGIC, Xen will crash
 when trying to inject the interrupt with the following stack trace.

 (XEN)[00228388] _spin_lock_irqsave+0x28/0x94 (PC)
 (XEN)[00228380] _spin_lock_irqsave+0x20/0x94 (LR)
 (XEN)[00250510] vgic_vcpu_inject_irq+0x40/0x1b0
 (XEN)[0024bcd0] vtimer_interrupt+0x4c/0x54
 (XEN)[00247010] do_IRQ+0x1a4/0x220
 (XEN)[00244864] gic_interrupt+0x50/0xec
 (XEN)[0024fbac] do_trap_irq+0x20/0x2c
 (XEN)[00255240] hyp_irq+0x5c/0x60
 (XEN)[00241084] context_switch+0xb8/0xc4
 (XEN)[0022482c] schedule+0x684/0x6d0
 (XEN)[0022785c] __do_softirq+0xcc/0xe8
 (XEN)[002278d4] do_softirq+0x14/0x1c
 (XEN)[00240fac] idle_loop+0x134/0x154
 (XEN)[0024c160] start_secondary+0x14c/0x15c
 (XEN)[0001] 0001

 While we may receive a spurious virtual timer interrupt, it can be safely
 ignored for the time being. A proper fix needs to be found for Xen 4.6.

 Signed-off-by: Julien Grall julien.gr...@linaro.org

 Acked-by: Ian Campbell ian.campb...@citrix.com

 Although I wonder if we should log, perhaps rate limited or only once.

 Also, I've some grammar nits (above and below) which I can fix on commit
 if there is no resend...


 ---

 This patch is a bug fix candidate for Xen 4.5. Any ARMv8 model may
 randomly crash when running Xen.

 CCing Konrad.

 This patch doesn't inject the virtual timer interrupt if the current VCPU
 is the idle one. Entering this function with the idle VCPU is already
 a bug in itself. For now, I think this patch is the safest way to resolve
 the problem.

 Meanwhile, I'm investigating with ARM to see whether the bug comes from
 Xen or the model.
 
 It is worth noting that there are no bad side effects of this change:
 the vtimer_interrupt is always supposed to be received on non-idle
 domains. As Julien wrote, the fact that we are receiving a
 vtimer_interrupt in the idle_domain is a bug, one that probably comes
 from the ARM model not emulating hardware correctly.

ARM says:

The v8A ARM ARM says that the signal output will be disabled if , so
the signal will be set to 0.
However, how this is treated by the GIC depends on its configuration.

So I'm not so sure it's a model bug.
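
To make the dependency on the GIC configuration concrete, here is a toy
model of a single interrupt input. It is a sketch under simplifying
assumptions, not Xen or GIC code, and every name in it (struct irq_input,
irq_set_line(), irq_is_pending()) is hypothetical. It only shows that a
level-sensitive input stops being pending once the timer output drops to
0, whereas an edge-triggered input keeps the latched event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum trigger { TRIGGER_LEVEL, TRIGGER_EDGE };

/* Hypothetical, much simplified model of one interrupt input. */
struct irq_input {
    enum trigger mode;
    bool line;    /* current level of the timer output signal */
    bool latched; /* edge mode only: a rising edge was recorded */
};

/* Drive the timer output signal to a new level. */
static void irq_set_line(struct irq_input *in, bool level)
{
    assert(in != NULL);

    /* An edge-triggered input remembers a 0 -> 1 transition. */
    if ( in->mode == TRIGGER_EDGE && !in->line && level )
        in->latched = true;

    in->line = level;
}

/* Would the interrupt be delivered once IRQs are unmasked? */
static bool irq_is_pending(const struct irq_input *in)
{
    assert(in != NULL);

    return in->mode == TRIGGER_EDGE ? in->latched : in->line;
}
```

Raising and then lowering the line on both kinds of input leaves only the
edge-triggered one pending, which would match an interrupt arriving after
the timer output has already been disabled.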

Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH for-4.5] xen/arm: Fix virtual timer on ARMv8 Model

2014-11-27 Thread Julien Grall
Hi Ian,

On 27/11/14 10:40, Ian Campbell wrote:
 On Tue, 2014-11-25 at 17:44 +, Julien Grall wrote:
 ARMv8 model may not disable correctly the timer interrupt when Xen
 
 correctly disable
 
 context switch to an idle vCPU. Therefore Xen may receive a spurious
 
 context switches and s/spurious/unexpected/ (since spurious has a
 specific meaning in the h/w which does not match what is happening here)
 
 timer interrupt. As the idle domain doesn't have vGIC, Xen will crash
 when trying to inject the interrupt with the following stack trace.

 (XEN)[00228388] _spin_lock_irqsave+0x28/0x94 (PC)
 (XEN)[00228380] _spin_lock_irqsave+0x20/0x94 (LR)
 (XEN)[00250510] vgic_vcpu_inject_irq+0x40/0x1b0
 (XEN)[0024bcd0] vtimer_interrupt+0x4c/0x54
 (XEN)[00247010] do_IRQ+0x1a4/0x220
 (XEN)[00244864] gic_interrupt+0x50/0xec
 (XEN)[0024fbac] do_trap_irq+0x20/0x2c
 (XEN)[00255240] hyp_irq+0x5c/0x60
 (XEN)[00241084] context_switch+0xb8/0xc4
 (XEN)[0022482c] schedule+0x684/0x6d0
 (XEN)[0022785c] __do_softirq+0xcc/0xe8
 (XEN)[002278d4] do_softirq+0x14/0x1c
 (XEN)[00240fac] idle_loop+0x134/0x154
 (XEN)[0024c160] start_secondary+0x14c/0x15c
 (XEN)[0001] 0001

 While we may receive a spurious virtual timer interrupt, it can be safely
 ignored for the time being. A proper fix needs to be found for Xen 4.6.

 Signed-off-by: Julien Grall julien.gr...@linaro.org
 
 Acked-by: Ian Campbell ian.campb...@citrix.com
 
 Although I wonder if we should log, perhaps rate limited or only once.

I don't think the printk is necessary: receiving this unexpected
interrupt is harmless in the sense that the guest will still work when
the vCPU runs again.

 Also, I've some grammar nits (above and below) which I can fix on commit
 if there is no resend...

Thanks.

Regards,

-- 
Julien Grall



[Xen-devel] [PATCH for-4.5] xen/arm: Fix virtual timer on ARMv8 Model

2014-11-25 Thread Julien Grall
ARMv8 model may not disable correctly the timer interrupt when Xen
context switch to an idle vCPU. Therefore Xen may receive a spurious
timer interrupt. As the idle domain doesn't have vGIC, Xen will crash
when trying to inject the interrupt with the following stack trace.

(XEN)[00228388] _spin_lock_irqsave+0x28/0x94 (PC)
(XEN)[00228380] _spin_lock_irqsave+0x20/0x94 (LR)
(XEN)[00250510] vgic_vcpu_inject_irq+0x40/0x1b0
(XEN)[0024bcd0] vtimer_interrupt+0x4c/0x54
(XEN)[00247010] do_IRQ+0x1a4/0x220
(XEN)[00244864] gic_interrupt+0x50/0xec
(XEN)[0024fbac] do_trap_irq+0x20/0x2c
(XEN)[00255240] hyp_irq+0x5c/0x60
(XEN)[00241084] context_switch+0xb8/0xc4
(XEN)[0022482c] schedule+0x684/0x6d0
(XEN)[0022785c] __do_softirq+0xcc/0xe8
(XEN)[002278d4] do_softirq+0x14/0x1c
(XEN)[00240fac] idle_loop+0x134/0x154
(XEN)[0024c160] start_secondary+0x14c/0x15c
(XEN)[0001] 0001

While we may receive a spurious virtual timer interrupt, it can be safely
ignored for the time being. A proper fix needs to be found for Xen 4.6.

Signed-off-by: Julien Grall julien.gr...@linaro.org

---

This patch is a bug fix candidate for Xen 4.5. Any ARMv8 model may
randomly crash when running Xen.

This patch doesn't inject the virtual timer interrupt if the current VCPU
is the idle one. Entering this function with the idle VCPU is already
a bug in itself. For now, I think this patch is the safest way to resolve
the problem.

Meanwhile, I'm investigating with ARM to see whether the bug comes from
Xen or the model.
---
 xen/arch/arm/time.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index a6436f1..83c74cb 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -169,6 +169,14 @@ static void timer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
 
 static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
 {
+    /*
+     * ARMv8 model may not disable correctly the timer interrupt when
+     * Xen context switch to an idle vCPU. Therefore Xen may receive
+     * timer interrupt.
+     */
+    if ( is_idle_vcpu(current) )
+        return;
+
     current->arch.virt_timer.ctl = READ_SYSREG32(CNTV_CTL_EL0);
     WRITE_SYSREG32(current->arch.virt_timer.ctl | CNTx_CTL_MASK, CNTV_CTL_EL0);
     vgic_vcpu_inject_irq(current, current->arch.virt_timer.irq);
-- 
2.1.3

