Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Avi Kivity

On 09/13/2011 10:21 PM, Don Zickus wrote:

Or are you saying an NMI in an idle system will have the same %rip thus
falsely detecting a back-to-back NMI?




That's easy to avoid - insert an instruction zeroing the last nmi_rip 
somewhere before or after hlt.  It's always okay to execute such an 
instruction (outside the nmi handler itself), since nmi_rip is meant to 
detect a no instructions executed condition.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Don Zickus
On Wed, Sep 14, 2011 at 10:00:07AM +0300, Avi Kivity wrote:
 On 09/13/2011 10:21 PM, Don Zickus wrote:
 Or are you saying an NMI in an idle system will have the same %rip thus
 falsely detecting a back-to-back NMI?
 
 
 
 That's easy to avoid - insert an instruction zeroing the last
 nmi_rip somewhere before or after hlt.  It's always okay to execute
 such an instruction (outside the nmi handler itself), since nmi_rip
 is meant to detect a no instructions executed condition.

Ah. Like a touch_nmi_watchdog() type of thing.  Interesting.  I'll poke
around the idle code.  Need to instrument a reproducer first.

Thanks,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Andi Kleen
On Wed, Sep 14, 2011 at 10:00:07AM +0300, Avi Kivity wrote:
 On 09/13/2011 10:21 PM, Don Zickus wrote:
 Or are you saying an NMI in an idle system will have the same %rip thus
 falsely detecting a back-to-back NMI?
 
 
 
 That's easy to avoid - insert an instruction zeroing the last nmi_rip 
 somewhere before or after hlt.  It's always okay to execute such an 
 instruction (outside the nmi handler itself), since nmi_rip is meant to 
 detect a no instructions executed condition.

At least for classic hlt there is no simple after hlt because it's all
interrupt handlers and exceptions and everything else that can interrupt
combined.

It may work with newer MWAIT.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Avi Kivity

On 09/14/2011 05:49 PM, Andi Kleen wrote:

On Wed, Sep 14, 2011 at 10:00:07AM +0300, Avi Kivity wrote:
  On 09/13/2011 10:21 PM, Don Zickus wrote:
  Or are you saying an NMI in an idle system will have the same %rip thus
  falsely detecting a back-to-back NMI?
  
  

  That's easy to avoid - insert an instruction zeroing the last nmi_rip
  somewhere before or after hlt.  It's always okay to execute such an
  instruction (outside the nmi handler itself), since nmi_rip is meant to
  detect a no instructions executed condition.

At least for classic hlt there is no simple after hlt because it's all
interrupt handlers and exceptions and everything else that can interrupt
combined.


If an NMI hits in an interrupt handler, or in the after hlt section 
before the write-to-last-nmi-rip, then we'll see that %rip has changed.  
If it hits after the write-to-last-nmi-rip instruction (or in the hlt 
itself), then we'll also see that %rip has changed, due to the effect of 
that instruction.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Andi Kleen
 If an NMI hits in an interrupt handler, or in the after hlt section 
 before the write-to-last-nmi-rip, then we'll see that %rip has changed.  
 If it hits after the write-to-last-nmi-rip instruction (or in the hlt 
 itself), then we'll also see that %rip has changed, due to the effect of 
 that instruction.

It won't handle multiple NMIs in halt. I assume that's reasonable common.

-Andi
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Avi Kivity

On 09/14/2011 08:28 PM, Andi Kleen wrote:

  If an NMI hits in an interrupt handler, or in the after hlt section
  before the write-to-last-nmi-rip, then we'll see that %rip has changed.
  If it hits after the write-to-last-nmi-rip instruction (or in the hlt
  itself), then we'll also see that %rip has changed, due to the effect of
  that instruction.

It won't handle multiple NMIs in halt. I assume that's reasonable common.



Why not?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Andi Kleen
On Wed, Sep 14, 2011 at 10:26:21PM +0300, Avi Kivity wrote:
 On 09/14/2011 08:28 PM, Andi Kleen wrote:
   If an NMI hits in an interrupt handler, or in the after hlt section
   before the write-to-last-nmi-rip, then we'll see that %rip has changed.
   If it hits after the write-to-last-nmi-rip instruction (or in the hlt
   itself), then we'll also see that %rip has changed, due to the effect of
   that instruction.
 
 It won't handle multiple NMIs in halt. I assume that's reasonable common.
 
 
 Why not?

They all have the same original RIPs and there is no way to distingush
them.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-14 Thread Avi Kivity

On 09/14/2011 10:34 PM, Andi Kleen wrote:

On Wed, Sep 14, 2011 at 10:26:21PM +0300, Avi Kivity wrote:
  On 09/14/2011 08:28 PM, Andi Kleen wrote:
 If an NMI hits in an interrupt handler, or in the after hlt section
 before the write-to-last-nmi-rip, then we'll see that %rip has changed.
 If it hits after the write-to-last-nmi-rip instruction (or in the hlt
 itself), then we'll also see that %rip has changed, due to the effect of
 that instruction.
  
  It won't handle multiple NMIs in halt. I assume that's reasonable common.
  

  Why not?

They all have the same original RIPs and there is no way to distingush
them.



That's how we detect multiple NMIs.

1. First NMI is posted
2. NMI handler starts
3. 2nd NMI posted, queued
4. First NMI source handled
5. IRET
6. Queued NMI hits the core
7. back-to-back NMI detected (same rip)
8. Second (and third...) NMI source handled
9. Execution continues.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-13 Thread Don Zickus
On Wed, Sep 07, 2011 at 08:09:37PM +0300, Avi Kivity wrote:
 But then the downside
 here is we accidentally handle an NMI that was latched.  This would cause
 a 'Dazed on confused' message as that NMI was already handled by the
 previous NMI.
 
 We are working on an algorithm to detect this condition and flag it
 (nothing complicated).  But it may never be perfect.
 
 On the other hand, what else are we going to do with an edge-triggered
 shared interrupt line?
 
 
 How about, during NMI, save %rip to a per-cpu variable.  Handle just
 one cause.  If, on the next NMI, we hit the same %rip, assume
 back-to-back NMI has occured and now handle all causes.

So I got around to implementing this and it seems to work great.  The back
to back NMIs are detected properly using the %rip and that info is passed to
the NMI notifier.  That info is used to determine if only the first
handler to report 'handled' is executed or _all_ the handlers are
executed.

I think all the 'unknown' NMIs I generated with various perf runs have
disappeared.  I'll post a new version of my nmi notifier rewrite soon.

Thanks for the great suggestion Avi!

Cheers,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-13 Thread Andi Kleen
 So I got around to implementing this and it seems to work great.  The back
 to back NMIs are detected properly using the %rip and that info is passed to
 the NMI notifier.  That info is used to determine if only the first
 handler to report 'handled' is executed or _all_ the handlers are
 executed.
 
 I think all the 'unknown' NMIs I generated with various perf runs have
 disappeared.  I'll post a new version of my nmi notifier rewrite soon.

This will fail when the system is idle.

-Andi

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-13 Thread Don Zickus
On Tue, Sep 13, 2011 at 09:03:20PM +0200, Andi Kleen wrote:
  So I got around to implementing this and it seems to work great.  The back
  to back NMIs are detected properly using the %rip and that info is passed to
  the NMI notifier.  That info is used to determine if only the first
  handler to report 'handled' is executed or _all_ the handlers are
  executed.
  
  I think all the 'unknown' NMIs I generated with various perf runs have
  disappeared.  I'll post a new version of my nmi notifier rewrite soon.
 
 This will fail when the system is idle.

Oh one other thing I forgot to mention is that an NMI handler has to
return a value greater than 1, meaning that it handle multiple NMI events
during this NMI to even enable the 'NMI swallowing' algorithm.

Cheers,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-13 Thread Don Zickus
On Tue, Sep 13, 2011 at 09:03:20PM +0200, Andi Kleen wrote:
  So I got around to implementing this and it seems to work great.  The back
  to back NMIs are detected properly using the %rip and that info is passed to
  the NMI notifier.  That info is used to determine if only the first
  handler to report 'handled' is executed or _all_ the handlers are
  executed.
  
  I think all the 'unknown' NMIs I generated with various perf runs have
  disappeared.  I'll post a new version of my nmi notifier rewrite soon.
 
 This will fail when the system is idle.

Heh.  I don't think I explained what I was doing properly.

I had a bunch of perf runs going simultaneously on my Core2quad.  Probably
generated around 40,000 NMIs in a few minutes.  I think around a 1000 or
so detected back-to-back NMIs.

With the current NMI detection algorithm in perf to determine back-to-back
NMIs, I can usually use the above scenario to generate an 'unknown' NMI.
Actually numerous ones before the box freezes. It is a false positive.

With my current code and Avi's idea, all those disappeared as they are now
'swallowed' by the algorithm.  That seemed like a positive.

However, being cautious, I decided to instrument lkdtm to inject NMIs to
force unknown NMI conditions
(apic-send_IPI_mask(cpumask_of(smp_processor_id(), NMI_VECTOR)))

I tried to inject the NMIs while my trace_printk buffer was flooding my
screen with 'back-to-back NMIs detected'.  I did this 4 or 5 times and
every single one of them were detected as an 'unknown' NMI.  So this was
good too, in the sense it was not swallowing 'real' unknown NMIs.

Does that make sense?

Or are you saying an NMI in an idle system will have the same %rip thus
falsely detecting a back-to-back NMI?

Cheers,
Don

 
 -Andi
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-13 Thread Andi Kleen
 Or are you saying an NMI in an idle system will have the same %rip thus
 falsely detecting a back-to-back NMI?

Yup.

Another problem is very long running instructions, like WBINVD and some others.
If there's a high frequency NMI it may well hit multiple times in a single
 instance.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-13 Thread Andi Kleen
On Tue, Sep 13, 2011 at 04:53:18PM -0400, Don Zickus wrote:
 On Tue, Sep 13, 2011 at 09:58:38PM +0200, Andi Kleen wrote:
   Or are you saying an NMI in an idle system will have the same %rip thus
   falsely detecting a back-to-back NMI?
  
  Yup.
 
 Hmm.  That sucks.  Is there another register that can be used in
 conjunction to seperate it, like sp or something?  Or we can we assume

Not that I know of.

 than an idle cpu isn't doing much for local NMI IPIs and that the only
 NMIs that would interrupt it would be external ones?

There can be still activity on the idle system, e.g. SMM or some 
Hypervisor in the background. If you profile events those might trigger 
samples, but they will be all accounted to the MWAIT.

  Another problem is very long running instructions, like WBINVD and some 
  others.
  If there's a high frequency NMI it may well hit multiple times in a single
   instance.
 
 I thought NMIs happen on instruction boundaries, maybe not.

Yes, but there may be multiple queued up when the instruction has finished
executing. So you get multiple at the boundary.

 Honestly, our current implementation would probably be tripped up with
 those examples too, so I don't think I am making things worse (assuming
 the only high frequency NMI is coming from perf).

I wish perf/oprofile would just stop using NMIs.  The interrupt off regions
are getting smaller and smaller and they can be profiled in a limited way
using PEBS anyways. Or maybe make it a knob with default to off.

-Andi

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-11 Thread Avi Kivity

On 09/08/2011 08:29 PM, Jeremy Fitzhardinge wrote:

  I don't think it's that expensive, especially compared to the
  double-context-switch and vmexit of the spinner going to sleep.  On
  AMD we do have to take an extra vmexit (on IRET) though.

Fair enough - so if the vcpu blocks itself, it ends up being rescheduled
in the NMI handler, which then returns to the lock slowpath.  And if its
a normal hlt, then you can also take interrupts if they're enabled while
spinning.


Yes.  To be clear, just execute 'hlt' and inherit the interrupt enable 
flag from the environment.



And if you get nested NMIs (since you can get multiple spurious kicks,
or from other NMI sources), then one NMI will get latched and any others
will get dropped?


While we're in the NMI handler, any further NMIs will be collapsed and 
queued (so one NMI can be in service and just one other queued behind 
it).  We can detect this condition by checking %rip on stack.




  Well we could have a specialized sleep/wakeup hypercall pair like Xen,
  but I'd like to avoid it if at all possible.

Yeah, that's something that just falls out of the existing event channel
machinery, so it isn't something that I specifically added.  But it does
mean that you simply end up with a hypercall returning on kick, with no
real complexities.


It also has to return on interrupt, MNI, INIT etc.  No real 
complexities is a meaningless phrase on x86, though it is fertile 
ground for math puns.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-08 Thread Avi Kivity

On 09/07/2011 10:09 PM, Jeremy Fitzhardinge wrote:

On 09/07/2011 10:41 AM, Avi Kivity wrote:
  Hm, I'm interested to know what you're thinking in more detail.  Can you
  leave an NMI pending before you block in the same way you can with
  sti;halt with normal interrupts?


  Nope.  But you can do

 if (regs-rip in critical section)
 regs-rip = after_halt;

  and effectively emulate it.  The critical section is something like

  critical_section_start:
  if (woken_up)
  goto critical_section_end;
  hlt
  critical_section_end:

Hm.  It's a pity you have to deliver an actual interrupt to implement
the kick though.


I don't think it's that expensive, especially compared to the 
double-context-switch and vmexit of the spinner going to sleep.  On AMD 
we do have to take an extra vmexit (on IRET) though.




  I was thinking you might want to do something with monitor/mwait to
  implement the blocking/kick ops. (Handwave)


  monitor/mwait are incredibly expensive to virtualize since they
  require write-protecting a page, IPIs flying everywhere and flushing
  tlbs, not to mention my lovely hugepages being broken up mercilessly.

Or what about a futex-like hypercall?



Well we could have a specialized sleep/wakeup hypercall pair like Xen, 
but I'd like to avoid it if at all possible.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-08 Thread Jeremy Fitzhardinge
On 09/08/2011 12:51 AM, Avi Kivity wrote:
 On 09/07/2011 10:09 PM, Jeremy Fitzhardinge wrote:
 On 09/07/2011 10:41 AM, Avi Kivity wrote:
   Hm, I'm interested to know what you're thinking in more detail. 
 Can you
   leave an NMI pending before you block in the same way you can with
   sti;halt with normal interrupts?
 
 
   Nope.  But you can do
 
  if (regs-rip in critical section)
  regs-rip = after_halt;
 
   and effectively emulate it.  The critical section is something like
 
   critical_section_start:
   if (woken_up)
   goto critical_section_end;
   hlt
   critical_section_end:

 Hm.  It's a pity you have to deliver an actual interrupt to implement
 the kick though.

 I don't think it's that expensive, especially compared to the
 double-context-switch and vmexit of the spinner going to sleep.  On
 AMD we do have to take an extra vmexit (on IRET) though.

Fair enough - so if the vcpu blocks itself, it ends up being rescheduled
in the NMI handler, which then returns to the lock slowpath.  And if its
a normal hlt, then you can also take interrupts if they're enabled while
spinning.

And if you get nested NMIs (since you can get multiple spurious kicks,
or from other NMI sources), then one NMI will get latched and any others
will get dropped?

 Well we could have a specialized sleep/wakeup hypercall pair like Xen,
 but I'd like to avoid it if at all possible.

Yeah, that's something that just falls out of the existing event channel
machinery, so it isn't something that I specifically added.  But it does
mean that you simply end up with a hypercall returning on kick, with no
real complexities.

J
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Avi Kivity

On 09/07/2011 06:56 PM, Don Zickus wrote:


  And hope that no other NMI was generated while we're handling this
  one.  It's a little... fragile?

No.  If another NMI is generated while we are processing the current one
it should get latched.  Upon completion of the current one, the cpu should
jump right back into the nmi exception routine again.  The only downside
is when multiple NMIs come in during the processing of the current one.
Only one can be latched, so the others get dropped.


Ah, yes, I remember now.


But we are addressing
that.



May I ask how?  Detecting a back-to-back NMI?

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Avi Kivity

On 09/07/2011 04:44 PM, Don Zickus wrote:


  Is there a way to tell whether an NMI was internally or externally
  generated?

  I don't think so, especially as two or more NMIs can be coalesced.
  So any NMI received on this first cpu has to check the NMI reason
  port?

Well we cheat and execute all the nmi handlers first.  If they come back
as handled, we skip the check for the external NMI.


And hope that no other NMI was generated while we're handling this one.  
It's a little... fragile?



But you are right, other than checking the reason port, there isn't a way
to determine if an NMI is internally or externally generated.


Ouch.




  
 But on the other hand, I don't really care if you can say that this path
 will never be called in a virtual machine.
  
  Does virtual machines support hot remove of cpus?  Probably not
  considering bare-metal barely supports it.
  

  They do.

But vcpus probably don't have the notion of a bsp cpu, so perhaps virtual
machines can get away with it easier?  (I don't know enough about the hot
cpu remove code to really explain it, just enough to know it can cause
problems and people are trying to address it).



The concept of a bsp exists in exactly the same way as on real hardware.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Don Zickus
On Wed, Sep 07, 2011 at 06:11:14PM +0300, Avi Kivity wrote:
 On 09/07/2011 04:44 PM, Don Zickus wrote:
 
   Is there a way to tell whether an NMI was internally or externally
   generated?
 
   I don't think so, especially as two or more NMIs can be coalesced.
   So any NMI received on this first cpu has to check the NMI reason
   port?
 
 Well we cheat and execute all the nmi handlers first.  If they come back
 as handled, we skip the check for the external NMI.
 
 And hope that no other NMI was generated while we're handling this
 one.  It's a little... fragile?

No.  If another NMI is generated while we are processing the current one
it should get latched.  Upon completion of the current one, the cpu should
jump right back into the nmi exception routine again.  The only downside
is when multiple NMIs come in during the processing of the current one.
Only one can be latched, so the others get dropped.  But we are addressing
that.

Cheers,
Don

 
 But you are right, other than checking the reason port, there isn't a way
 to determine if an NMI is internally or externally generated.
 
 Ouch.
 
 
 
   
  But on the other hand, I don't really care if you can say that this 
  path
  will never be called in a virtual machine.
   
   Does virtual machines support hot remove of cpus?  Probably not
   considering bare-metal barely supports it.
   
 
   They do.
 
 But vcpus probably don't have the notion of a bsp cpu, so perhaps virtual
 machines can get away with it easier?  (I don't know enough about the hot
 cpu remove code to really explain it, just enough to know it can cause
 problems and people are trying to address it).
 
 
 The concept of a bsp exists in exactly the same way as on real hardware.
 
 -- 
 error compiling committee.c: too many arguments to function
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Don Zickus
On Wed, Sep 07, 2011 at 07:13:58AM +0300, Avi Kivity wrote:
 On 09/06/2011 09:27 PM, Don Zickus wrote:
 On Tue, Sep 06, 2011 at 11:07:26AM -0700, Jeremy Fitzhardinge wrote:
 But, erm, does that even make sense?  I'm assuming the NMI reason port
 tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
 there's only a single reason port, then doesn't that mean that either 
  1)
 they all got the NMI for the same reason, or 2) having a single port 
  is
 inherently racy?  How does the locking actually work there?
 The reason port is for an external/system NMI.  All the IPI-NMI don't 
  need
 to access this register to process their handlers, ie perf.  I think in
 general the IOAPIC is configured to deliver the external NMI to one 
  cpu,
 usually the bsp cpu.  However, there has been a slow movement to free 
  the
 bsp cpu from exceptions like this to allow one to eventually hot-swap 
  the
 bsp cpu.  The spin locks in that code were an attempt to be more 
  abstract
 about who really gets the external NMI.  Of course SGI's box is setup 
  to
 deliver an external NMI to all cpus to dump the stack when the system
 isn't behaving.
   
 This is a very low usage NMI (in fact almost all cases lead to loud
 console messages).
   
 Hope that clears up some of the confusion.
 
   Hm, not really.
 
   What does it mean if two CPUs go down that path?  Should one do some NMI
   processing while the other waits around for it to finish, and then do
   some NMI processing on its own?
 
 Well the time the second one gets to the external NMI it should have been
 cleared by the first cpu, which would of course lead to the second cpu
 causing a 'Dazed and confused' message.  But on most x86 machines only one
 cpu should be routed the external NMI.  Though there is an SGI box that is
 designed to send an external NMI to all of its cpus.
 
 Is there a way to tell whether an NMI was internally or externally
 generated?
 
 I don't think so, especially as two or more NMIs can be coalesced.
 So any NMI received on this first cpu has to check the NMI reason
 port?

Well we cheat and execute all the nmi handlers first.  If they come back
as handled, we skip the check for the external NMI.

But you are right, other than checking the reason port, there isn't a way
to determine if an NMI is internally or externally generated.

 
 
   But on the other hand, I don't really care if you can say that this path
   will never be called in a virtual machine.
 
 Does virtual machines support hot remove of cpus?  Probably not
 considering bare-metal barely supports it.
 
 
 They do.

But vcpus probably don't have the notion of a bsp cpu, so perhaps virtual
machines can get away with it easier?  (I don't know enough about the hot
cpu remove code to really explain it, just enough to know it can cause
problems and people are trying to address it).

Cheers,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Don Zickus
On Wed, Sep 07, 2011 at 07:25:24PM +0300, Avi Kivity wrote:
 On 09/07/2011 06:56 PM, Don Zickus wrote:
 
   And hope that no other NMI was generated while we're handling this
   one.  It's a little... fragile?
 
 No.  If another NMI is generated while we are processing the current one
 it should get latched.  Upon completion of the current one, the cpu should
 jump right back into the nmi exception routine again.  The only downside
 is when multiple NMIs come in during the processing of the current one.
 Only one can be latched, so the others get dropped.
 
 Ah, yes, I remember now.
 
 But we are addressing
 that.
 
 
 May I ask how?  Detecting a back-to-back NMI?

Pretty boring actually.  Currently we execute an NMI handler until one of
them returns handled.  Then we stop.  This may cause us to miss an NMI in
the case of multiple NMIs at once.  Now we are changing it to execute
_all_ the handlers to make sure we didn't miss one.  But then the downside
here is we accidentally handle an NMI that was latched.  This would cause
a 'Dazed on confused' message as that NMI was already handled by the
previous NMI.

We are working on an algorithm to detect this condition and flag it
(nothing complicated).  But it may never be perfect.

On the other hand, what else are we going to do with an edge-triggered
shared interrupt line?

Cheers,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Avi Kivity

On 09/07/2011 07:52 PM, Don Zickus wrote:


  May I ask how?  Detecting a back-to-back NMI?

Pretty boring actually.  Currently we execute an NMI handler until one of
them returns handled.  Then we stop.  This may cause us to miss an NMI in
the case of multiple NMIs at once.  Now we are changing it to execute
_all_ the handlers to make sure we didn't miss one.


That's going to be pretty bad for kvm - those handlers become a lot more 
expensive since they involve reading MSRs.  Even worse if we start using 
NMIs as a wakeup for pv spinlocks as provided by this patchset.



But then the downside
here is we accidentally handle an NMI that was latched.  This would cause
a 'Dazed on confused' message as that NMI was already handled by the
previous NMI.

We are working on an algorithm to detect this condition and flag it
(nothing complicated).  But it may never be perfect.

On the other hand, what else are we going to do with an edge-triggered
shared interrupt line?



How about, during NMI, save %rip to a per-cpu variable.  Handle just one 
cause.  If, on the next NMI, we hit the same %rip, assume back-to-back 
NMI has occured and now handle all causes.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Jeremy Fitzhardinge
On 09/07/2011 10:09 AM, Avi Kivity wrote:
 On 09/07/2011 07:52 PM, Don Zickus wrote:
 
   May I ask how?  Detecting a back-to-back NMI?

 Pretty boring actually.  Currently we execute an NMI handler until
 one of
 them returns handled.  Then we stop.  This may cause us to miss an
 NMI in
 the case of multiple NMIs at once.  Now we are changing it to execute
 _all_ the handlers to make sure we didn't miss one.

 That's going to be pretty bad for kvm - those handlers become a lot
 more expensive since they involve reading MSRs.

How often are you going to get NMIs in a kvm guest?

   Even worse if we start using NMIs as a wakeup for pv spinlocks as
 provided by this patchset.

Hm, I'm interested to know what you're thinking in more detail.  Can you
leave an NMI pending before you block in the same way you can with
sti;halt with normal interrupts?

I was thinking you might want to do something with monitor/mwait to
implement the blocking/kick ops. (Handwave)

J
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Don Zickus
On Wed, Sep 07, 2011 at 08:09:37PM +0300, Avi Kivity wrote:
 On 09/07/2011 07:52 PM, Don Zickus wrote:
 
   May I ask how?  Detecting a back-to-back NMI?
 
 Pretty boring actually.  Currently we execute an NMI handler until one of
 them returns handled.  Then we stop.  This may cause us to miss an NMI in
 the case of multiple NMIs at once.  Now we are changing it to execute
 _all_ the handlers to make sure we didn't miss one.
 
 That's going to be pretty bad for kvm - those handlers become a lot
 more expensive since they involve reading MSRs.  Even worse if we
 start using NMIs as a wakeup for pv spinlocks as provided by this
 patchset.

Oh.

 
 But then the downside
 here is we accidentally handle an NMI that was latched.  This would cause
 a 'Dazed on confused' message as that NMI was already handled by the
 previous NMI.
 
 We are working on an algorithm to detect this condition and flag it
 (nothing complicated).  But it may never be perfect.
 
 On the other hand, what else are we going to do with an edge-triggered
 shared interrupt line?
 
 
 How about, during NMI, save %rip to a per-cpu variable.  Handle just
 one cause.  If, on the next NMI, we hit the same %rip, assume
 back-to-back NMI has occured and now handle all causes.

I had a similar idea a couple of months ago while debugging a continuous
flow of back-to-back NMIs from a stress-test perf application and I
couldn't get it to work.  But let me try it again, because it does make
sense as an optimization.

Thanks,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Avi Kivity

On 09/07/2011 08:17 PM, Jeremy Fitzhardinge wrote:

On 09/07/2011 10:09 AM, Avi Kivity wrote:
  On 09/07/2011 07:52 PM, Don Zickus wrote:
  
 May I ask how?  Detecting a back-to-back NMI?

  Pretty boring actually.  Currently we execute an NMI handler until
  one of
  them returns handled.  Then we stop.  This may cause us to miss an
  NMI in
  the case of multiple NMIs at once.  Now we are changing it to execute
  _all_ the handlers to make sure we didn't miss one.

  That's going to be pretty bad for kvm - those handlers become a lot
  more expensive since they involve reading MSRs.

How often are you going to get NMIs in a kvm guest?


We'll soon have the perf-based watchdog firing every 60s worth of 
instructions or so.  But if we implement your new kick pvop using NMI 
then it can be _very_ often.




Even worse if we start using NMIs as a wakeup for pv spinlocks as
  provided by this patchset.

Hm, I'm interested to know what you're thinking in more detail.  Can you
leave an NMI pending before you block in the same way you can with
sti;halt with normal interrupts?


Nope.  But you can do

   if (regs-rip in critical section)
   regs-rip = after_halt;

and effectively emulate it.  The critical section is something like

critical_section_start:
if (woken_up)
goto critical_section_end;
hlt
critical_section_end:



I was thinking you might want to do something with monitor/mwait to
implement the blocking/kick ops. (Handwave)



monitor/mwait are incredibly expensive to virtualize since they require 
write-protecting a page, IPIs flying everywhere and flushing tlbs, not 
to mention my lovely hugepages being broken up mercilessly.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Avi Kivity

On 09/07/2011 08:21 PM, Don Zickus wrote:


  How about, during NMI, save %rip to a per-cpu variable.  Handle just
  one cause.  If, on the next NMI, we hit the same %rip, assume
  back-to-back NMI has occured and now handle all causes.

I had a similar idea a couple of months ago while debugging a continuous
flow of back-to-back NMIs from a stress-test perf application and I
couldn't get it to work.  But let me try it again, because it does make
sense as an optimization.




Great, thanks.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-07 Thread Jeremy Fitzhardinge
On 09/07/2011 10:41 AM, Avi Kivity wrote:
 Hm, I'm interested to know what you're thinking in more detail.  Can you
 leave an NMI pending before you block in the same way you can with
 sti;halt with normal interrupts?


 Nope.  But you can do

if (regs-rip in critical section)
regs-rip = after_halt;

 and effectively emulate it.  The critical section is something like

 critical_section_start:
 if (woken_up)
 goto critical_section_end;
 hlt
 critical_section_end:

Hm.  It's a pity you have to deliver an actual interrupt to implement
the kick though.


 I was thinking you might want to do something with monitor/mwait to
 implement the blocking/kick ops. (Handwave)


 monitor/mwait are incredibly expensive to virtualize since they
 require write-protecting a page, IPIs flying everywhere and flushing
 tlbs, not to mention my lovely hugepages being broken up mercilessly.

Or what about a futex-like hypercall?

J

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-06 Thread Don Zickus
On Fri, Sep 02, 2011 at 02:50:53PM -0700, Jeremy Fitzhardinge wrote:
 On 09/02/2011 01:47 PM, Peter Zijlstra wrote:
  On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote:
  I know that its generally considered bad form, but there's at least one
  spinlock that's only taken from NMI context and thus hasn't got any
  deadlock potential.
  Which one? 
  arch/x86/kernel/traps.c:nmi_reason_lock
 
  It serializes NMI access to the NMI reason port across CPUs.
 
 Ah, OK.  Well, that will never happen in a PV Xen guest.  But PV
 ticketlocks are equally applicable to an HVM Xen domain (and KVM guest),
 so I guess there's at least some chance there could be a virtual
 emulated NMI.  Maybe?  Does qemu do that kind of thing?
 
 But, erm, does that even make sense?  I'm assuming the NMI reason port
 tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
 there's only a single reason port, then doesn't that mean that either 1)
 they all got the NMI for the same reason, or 2) having a single port is
 inherently racy?  How does the locking actually work there?

The reason port is for an external/system NMI.  All the IPI-NMI don't need
to access this register to process their handlers, ie perf.  I think in
general the IOAPIC is configured to deliver the external NMI to one cpu,
usually the bsp cpu.  However, there has been a slow movement to free the
bsp cpu from exceptions like this to allow one to eventually hot-swap the
bsp cpu.  The spin locks in that code were an attempt to be more abstract
about who really gets the external NMI.  Of course SGI's box is setup to
deliver an external NMI to all cpus to dump the stack when the system
isn't behaving.

This is a very low usage NMI (in fact almost all cases lead to loud
console messages).

Hope that clears up some of the confusion.

Cheers,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-06 Thread Jeremy Fitzhardinge
On 09/06/2011 08:14 AM, Don Zickus wrote:
 On Fri, Sep 02, 2011 at 02:50:53PM -0700, Jeremy Fitzhardinge wrote:
 On 09/02/2011 01:47 PM, Peter Zijlstra wrote:
 On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote:
 I know that its generally considered bad form, but there's at least one
 spinlock that's only taken from NMI context and thus hasn't got any
 deadlock potential.
 Which one? 
 arch/x86/kernel/traps.c:nmi_reason_lock

 It serializes NMI access to the NMI reason port across CPUs.
 Ah, OK.  Well, that will never happen in a PV Xen guest.  But PV
 ticketlocks are equally applicable to an HVM Xen domain (and KVM guest),
 so I guess there's at least some chance there could be a virtual
 emulated NMI.  Maybe?  Does qemu do that kind of thing?

 But, erm, does that even make sense?  I'm assuming the NMI reason port
 tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
 there's only a single reason port, then doesn't that mean that either 1)
 they all got the NMI for the same reason, or 2) having a single port is
 inherently racy?  How does the locking actually work there?
 The reason port is for an external/system NMI.  All the IPI-NMI don't need
 to access this register to process their handlers, ie perf.  I think in
 general the IOAPIC is configured to deliver the external NMI to one cpu,
 usually the bsp cpu.  However, there has been a slow movement to free the
 bsp cpu from exceptions like this to allow one to eventually hot-swap the
 bsp cpu.  The spin locks in that code were an attempt to be more abstract
 about who really gets the external NMI.  Of course SGI's box is setup to
 deliver an external NMI to all cpus to dump the stack when the system
 isn't behaving.

 This is a very low usage NMI (in fact almost all cases lead to loud
 console messages).

 Hope that clears up some of the confusion.

Hm, not really.

What does it mean if two CPUs go down that path?  Should one do some NMI
processing while the other waits around for it to finish, and then do
some NMI processing on its own?

It sounds like that could only happen if you reroute NMI from one CPU to
another while the first CPU is actually in the middle of processing an
NMI - in which case, shouldn't the code doing the re-routing be taking
the spinlock?

Or perhaps a spinlock isn't the right primitive to use at all?  Couldn't
the second CPU just set a flag/counter (using something like an atomic
add/cmpxchg/etc) to make the first CPU process the second NMI?

But on the other hand, I don't really care if you can say that this path
will never be called in a virtual machine.

J
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-06 Thread Don Zickus
On Tue, Sep 06, 2011 at 11:07:26AM -0700, Jeremy Fitzhardinge wrote:
  But, erm, does that even make sense?  I'm assuming the NMI reason port
  tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
  there's only a single reason port, then doesn't that mean that either 1)
  they all got the NMI for the same reason, or 2) having a single port is
  inherently racy?  How does the locking actually work there?
  The reason port is for an external/system NMI.  All the IPI-NMI don't need
  to access this register to process their handlers, ie perf.  I think in
  general the IOAPIC is configured to deliver the external NMI to one cpu,
  usually the bsp cpu.  However, there has been a slow movement to free the
  bsp cpu from exceptions like this to allow one to eventually hot-swap the
  bsp cpu.  The spin locks in that code were an attempt to be more abstract
  about who really gets the external NMI.  Of course SGI's box is setup to
  deliver an external NMI to all cpus to dump the stack when the system
  isn't behaving.
 
  This is a very low usage NMI (in fact almost all cases lead to loud
  console messages).
 
  Hope that clears up some of the confusion.
 
 Hm, not really.
 
 What does it mean if two CPUs go down that path?  Should one do some NMI
 processing while the other waits around for it to finish, and then do
 some NMI processing on its own?

Well the time the second one gets to the external NMI it should have been
cleared by the first cpu, which would of course lead to the second cpu
causing a 'Dazed and confused' message.  But on most x86 machines only one
cpu should be routed the external NMI.  Though there is an SGI box that is
designed to send an external NMI to all of its cpus.

 
 It sounds like that could only happen if you reroute NMI from one CPU to
 another while the first CPU is actually in the middle of processing an
 NMI - in which case, shouldn't the code doing the re-routing be taking
 the spinlock?

Perhaps, but like I said it is a slow transition because most people don't
have the hardware to test this (nor does it work due to other
limitations).

 
 Or perhaps a spinlock isn't the right primitive to use at all?  Couldn't
 the second CPU just set a flag/counter (using something like an atomic
 add/cmpxchg/etc) to make the first CPU process the second NMI?

Might be a smarter approach.  Like I said it is hard to test without
functioning hardware. :-(

 
 But on the other hand, I don't really care if you can say that this path
 will never be called in a virtual machine.

Does virtual machines support hot remove of cpus?  Probably not
considering bare-metal barely supports it.

Cheers,
Don
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-06 Thread Jeremy Fitzhardinge
On 09/06/2011 11:27 AM, Don Zickus wrote:
 But on the other hand, I don't really care if you can say that this path
 will never be called in a virtual machine.
 Does virtual machines support hot remove of cpus?  Probably not
 considering bare-metal barely supports it.

The only reason you'd want to is to add/remove VCPUs as a mechanism of
resource control, so if you were removing a VCPU it wouldn't matter much
which one you choose.  In other words, there's no reason you'd ever need
to remove the BSP in favour of one of the other CPUs.

Anyway, I'm not going to lose any sleep over this issue.

J
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-06 Thread Avi Kivity

On 09/06/2011 09:27 PM, Don Zickus wrote:

On Tue, Sep 06, 2011 at 11:07:26AM -0700, Jeremy Fitzhardinge wrote:
But, erm, does that even make sense?  I'm assuming the NMI reason port
tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
there's only a single reason port, then doesn't that mean that either 1)
they all got the NMI for the same reason, or 2) having a single port is
inherently racy?  How does the locking actually work there?
The reason port is for an external/system NMI.  All the IPI-NMI don't need
to access this register to process their handlers, ie perf.  I think in
general the IOAPIC is configured to deliver the external NMI to one cpu,
usually the bsp cpu.  However, there has been a slow movement to free the
bsp cpu from exceptions like this to allow one to eventually hot-swap the
bsp cpu.  The spin locks in that code were an attempt to be more abstract
about who really gets the external NMI.  Of course SGI's box is setup to
deliver an external NMI to all cpus to dump the stack when the system
isn't behaving.
  
This is a very low usage NMI (in fact almost all cases lead to loud
console messages).
  
Hope that clears up some of the confusion.

  Hm, not really.

  What does it mean if two CPUs go down that path?  Should one do some NMI
  processing while the other waits around for it to finish, and then do
  some NMI processing on its own?

Well the time the second one gets to the external NMI it should have been
cleared by the first cpu, which would of course lead to the second cpu
causing a 'Dazed and confused' message.  But on most x86 machines only one
cpu should be routed the external NMI.  Though there is an SGI box that is
designed to send an external NMI to all of its cpus.


Is there a way to tell whether an NMI was internally or externally 
generated?


I don't think so, especially as two or more NMIs can be coalesced.  So 
any NMI received on this first cpu has to check the NMI reason port?




  But on the other hand, I don't really care if you can say that this path
  will never be called in a virtual machine.

Does virtual machines support hot remove of cpus?  Probably not
considering bare-metal barely supports it.



They do.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-05 Thread Stefano Stabellini
CC'ing Keir.

On Fri, 2 Sep 2011, Jeremy Fitzhardinge wrote:
 On 09/02/2011 01:47 PM, Peter Zijlstra wrote:
  On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote:
  I know that its generally considered bad form, but there's at least one
  spinlock that's only taken from NMI context and thus hasn't got any
  deadlock potential.
  Which one? 
  arch/x86/kernel/traps.c:nmi_reason_lock
 
  It serializes NMI access to the NMI reason port across CPUs.
 
 Ah, OK.  Well, that will never happen in a PV Xen guest.  But PV
 ticketlocks are equally applicable to an HVM Xen domain (and KVM guest),
 so I guess there's at least some chance there could be a virtual
 emulated NMI.  Maybe?  Does qemu do that kind of thing?

Xen knows how to inject NMIs to HVM guests, even though I am not sure
if it is actually done in practice at the moment.

However digging into the implementation details, it looks like virtual
NMIs are not injected if blocking-by-STI (or blocking-by-MOV-SS), so we
should be fine, even though I don't know if you actually want to rely on
this:

/*
 * We can only inject an NMI if no blocking by MOV SS (also, depending on
 * implementation, if no blocking by STI). If pin-based 'virtual NMIs'
 * control is specified then the NMI-blocking interruptibility flag is
 * also checked. The 'virtual NMI pending' control (available only in
 * conjunction with 'virtual NMIs') causes a VM exit when all these checks
 * succeed. It will exit immediately after VM entry if the checks succeed
 * at that point.
 * 
 * Because a processor may or may not check blocking-by-STI when injecting
 * a virtual NMI, it will be necessary to convert that to block-by-MOV-SS
 * before specifying the 'virtual NMI pending' control. Otherwise we could
 * enter an infinite loop where we check blocking-by-STI in software and
 * thus delay delivery of a virtual NMI, but the processor causes immediate
 * VM exit because it does not check blocking-by-STI.
 */
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-05 Thread Avi Kivity

On 09/03/2011 12:50 AM, Jeremy Fitzhardinge wrote:

On 09/02/2011 01:47 PM, Peter Zijlstra wrote:
  On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote:
  I know that its generally considered bad form, but there's at least one
  spinlock that's only taken from NMI context and thus hasn't got any
  deadlock potential.
  Which one?
  arch/x86/kernel/traps.c:nmi_reason_lock

  It serializes NMI access to the NMI reason port across CPUs.

Ah, OK.  Well, that will never happen in a PV Xen guest.  But PV
ticketlocks are equally applicable to an HVM Xen domain (and KVM guest),
so I guess there's at least some chance there could be a virtual
emulated NMI.  Maybe?  Does qemu do that kind of thing?


kvm does.  In fact, I want to use 'hlt' for blocking vcpus in the 
slowpath and an NMI IPI for waking them up.  That's not going to work in 
an NMI, but I guess I can replace the 'hlt' with a 'pause' if we're in 
an NMI, and just spin there.


btw, doing a race-free NMI wakeup is going to be interesting - we'll 
have to look at %rip and see if is just before our hlt, since there's no 
magic sti; hlt sequence we can use.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Peter Zijlstra
On Thu, 2011-09-01 at 17:55 -0700, Jeremy Fitzhardinge wrote:
 From: Jeremy Fitzhardinge jeremy.fitzhardi...@citrix.com
 
 We need to make sure interrupts are disabled while we're relying on the
 contents of the per-cpu lock_waiting values, otherwise an interrupt
 handler could come in, try to take some other lock, block, and overwrite
 our values.

Would this make it illegal to take a spinlock from NMI context?

I know that its generally considered bad form, but there's at least one
spinlock that's only taken from NMI context and thus hasn't got any
deadlock potential.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Jeremy Fitzhardinge
On 09/02/2011 07:47 AM, Peter Zijlstra wrote:
 On Thu, 2011-09-01 at 17:55 -0700, Jeremy Fitzhardinge wrote:
 From: Jeremy Fitzhardinge jeremy.fitzhardi...@citrix.com

 We need to make sure interrupts are disabled while we're relying on the
 contents of the per-cpu lock_waiting values, otherwise an interrupt
 handler could come in, try to take some other lock, block, and overwrite
 our values.
 Would this make it illegal to take a spinlock from NMI context?

That would be problematic.  But a Xen domain wouldn't be getting NMIs -
at least not standard x86 ones - so that's moot.

 I know that its generally considered bad form, but there's at least one
 spinlock that's only taken from NMI context and thus hasn't got any
 deadlock potential.

Which one?

J
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Peter Zijlstra
On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote:
 
  I know that its generally considered bad form, but there's at least one
  spinlock that's only taken from NMI context and thus hasn't got any
  deadlock potential.
 
 Which one? 

arch/x86/kernel/traps.c:nmi_reason_lock

It serializes NMI access to the NMI reason port across CPUs.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Jeremy Fitzhardinge
On 09/02/2011 01:47 PM, Peter Zijlstra wrote:
 On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote:
 I know that its generally considered bad form, but there's at least one
 spinlock that's only taken from NMI context and thus hasn't got any
 deadlock potential.
 Which one? 
 arch/x86/kernel/traps.c:nmi_reason_lock

 It serializes NMI access to the NMI reason port across CPUs.

Ah, OK.  Well, that will never happen in a PV Xen guest.  But PV
ticketlocks are equally applicable to an HVM Xen domain (and KVM guest),
so I guess there's at least some chance there could be a virtual
emulated NMI.  Maybe?  Does qemu do that kind of thing?

But, erm, does that even make sense?  I'm assuming the NMI reason port
tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
there's only a single reason port, then doesn't that mean that either 1)
they all got the NMI for the same reason, or 2) having a single port is
inherently racy?  How does the locking actually work there?

J
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Peter Zijlstra
On Fri, 2011-09-02 at 14:50 -0700, Jeremy Fitzhardinge wrote:
 On 09/02/2011 01:47 PM, Peter Zijlstra wrote:
  On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote:
  I know that its generally considered bad form, but there's at least one
  spinlock that's only taken from NMI context and thus hasn't got any
  deadlock potential.
  Which one? 
  arch/x86/kernel/traps.c:nmi_reason_lock
 
  It serializes NMI access to the NMI reason port across CPUs.
 
 Ah, OK.  Well, that will never happen in a PV Xen guest.  But PV
 ticketlocks are equally applicable to an HVM Xen domain (and KVM guest),
 so I guess there's at least some chance there could be a virtual
 emulated NMI.  Maybe?  Does qemu do that kind of thing?

Afaik qemu/kvm can do the whole NMI thing, yes.

 But, erm, does that even make sense?  I'm assuming the NMI reason port
 tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
 there's only a single reason port, then doesn't that mean that either 1)
 they all got the NMI for the same reason, or 2) having a single port is
 inherently racy?  How does the locking actually work there?

I really wouldn't know, the whole NMI thing is a bit of a trainwreck
architecturally. Maybe the x86 maintainers or Linus knows more on this
aspect of it.


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Andi Kleen

 But, erm, does that even make sense?  I'm assuming the NMI reason port
 tells the CPU why it got an NMI.  If multiple CPUs can get NMIs and
 there's only a single reason port, then doesn't that mean that either 1)
 they all got the NMI for the same reason, or 2) having a single port is
 inherently racy?  How does the locking actually work there?

All the code to determine NMI reasons is inherently racy,
and each new NMI user makes it worse.

-Andi

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-01 Thread Jeremy Fitzhardinge
From: Jeremy Fitzhardinge jeremy.fitzhardi...@citrix.com

We need to make sure interrupts are disabled while we're relying on the
contents of the per-cpu lock_waiting values, otherwise an interrupt
handler could come in, try to take some other lock, block, and overwrite
our values.

Signed-off-by: Jeremy Fitzhardinge jeremy.fitzhardi...@citrix.com
---
 arch/x86/xen/spinlock.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 4c46144..2ed5d05 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -98,6 +98,7 @@ static void xen_lock_spinning(struct arch_spinlock *lock, 
unsigned want)
struct xen_lock_waiting *w = __get_cpu_var(lock_waiting);
int cpu = smp_processor_id();
u64 start;
+   unsigned long flags;
 
/* If kicker interrupts not initialized yet, just spin */
if (irq == -1)
@@ -105,6 +106,10 @@ static void xen_lock_spinning(struct arch_spinlock *lock, 
unsigned want)
 
start = spin_time_start();
 
+   /* Make sure interrupts are disabled to ensure that these
+  per-cpu values are not overwritten. */
+   local_irq_save(flags);
+
w-want = want;
w-lock = lock;
 
@@ -139,6 +144,9 @@ static void xen_lock_spinning(struct arch_spinlock *lock, 
unsigned want)
 out:
cpumask_clear_cpu(cpu, waiting_cpus);
w-lock = NULL;
+
+   local_irq_restore(flags);
+
spin_time_accum_blocked(start);
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_lock_spinning);
-- 
1.7.6

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html