Re: [PATCH 1/6] x86, nmi: Implement delayed irq_work mechanism to handle lost NMIs

Peter Zijlstra Wed, 21 May 2014 12:39:37 -0700

On Wed, May 21, 2014 at 03:02:25PM -0400, Don Zickus wrote:
> On Wed, May 21, 2014 at 07:51:49PM +0200, Peter Zijlstra wrote:
> > On Wed, May 21, 2014 at 12:45:25PM -0400, Don Zickus wrote:
> > > > > +     /*
> > > > > +      * Can't use send_IPI_self here because it will
> > > > > +      * send an NMI in IRQ context which is not what
> > > > > +      * we want.  Create a cpumask for local cpu and
> > > > > +      * force an IPI the normal way (not the shortcut).
> > > > > +      */
> > > > > +     bitmap_zero(nmi_mask, NR_CPUS);
> > > > > +     mask = to_cpumask(nmi_mask);
> > > > > +     cpu_set(smp_processor_id(), *mask);
> > > > > +
> > > > > +     __this_cpu_xchg(nmi_delayed_work_pending, true);
> > > > 
> > > > Why is this xchg and not __this_cpu_write() ?
> > > > 
> > > > > +     apic->send_IPI_mask(to_cpumask(nmi_mask), NMI_VECTOR);
> > > > 
> > > > What's wrong with apic->send_IPI_self(NMI_VECTOR); ?
> > > 
> > > I tried to explain that in my comment above.  IPI_self uses the shortcut
> > > method to send IPIs which means the NMI_VECTOR will be delivered in IRQ
> > > context _not_ NMI context. :-(  This is why I do the whole silly dance.
> > 
> > I'm still not getting it, probably because I don't know how these APICs
> > really work, but the way I read both the comment and your explanation
> > here is that we get an NMI nested in the IRQ context we called it from,
> > which is pretty much exactly what we want.
> 
> Um, ok.  I think my concern with that is an NMI nested in IRQ context
> could be interrupted by a real NMI. I believe that would cause nmi_enter()
> to barf among other bad things in the nmi code.


Ohh, you mean the NMI handler will run as a regular interrupt? Yes, that
would be bad.

> > > So both my problems center around what guarantees does irq_work have to
> > > stay on the same cpu?
> > 
> > Well, none as you used a global irq_work, so all cpus will now contend
> > on it on every NMI trying to queue it :-(
> 
> Yes, I was stuck between using a per-cpu implementation in which every dummy
> NMI grabs the spin lock in the nmi handlers, or a global lock.  I tried
> the global lock.
> 
> I thought the irq_work lock seemed less contended because it was only read
> once before being acted upon (for a cacheline separate from actual nmi work).
> 
> Whereas a spin lock in the nmi handlers seems to keep reading the lock
> until it owns it thus slowing down useful work for the handler that owns
> the lock (because of the cache contention).
> 
> I could be wrong though.

Well, pretty much every NMI will call irq_work_queue(), which calls
irq_work_claim(), which does an unconditional cmpxchg (locked rmw) on the
global cacheline.

Which *hurts*.
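
A rough userspace sketch of that contention point, for illustration only.
It models the claim step as a compare-and-swap on a flags word and
compares one shared work item against one item per CPU. Every name here
(model_work, model_claim, MODEL_NR_CPUS, the 64-byte alignment) is made
up for the sketch; this is not the kernel's irq_work code, and the "CPUs"
are just loop iterations, so it shows which cacheline each claim touches,
not the actual cost.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4		/* assumed CPU count for the sketch */
#define MODEL_PENDING 1u

/* One work item per cacheline (64 bytes is an assumption). */
struct model_work {
	_Alignas(64) atomic_uint flags;
};

static struct model_work global_work;			/* shared by all CPUs */
static struct model_work percpu_work[MODEL_NR_CPUS];	/* one per CPU */

/* Claim the work item: a locked read-modify-write on w->flags. */
static bool model_claim(struct model_work *w)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong(&w->flags, &expected,
					      MODEL_PENDING);
}

/* Mark the work item as run, so it can be claimed again. */
static void model_done(struct model_work *w)
{
	atomic_store(&w->flags, 0);
}

/* Every simulated CPU does its cmpxchg on the same cacheline. */
int claims_on_global(void)
{
	int cpu, won = 0;

	model_done(&global_work);
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		won += model_claim(&global_work);
	return won;
}

/* Each simulated CPU does its cmpxchg on its own cacheline. */
int claims_on_percpu(void)
{
	int cpu, won = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_done(&percpu_work[cpu]);
		won += model_claim(&percpu_work[cpu]);
	}
	return won;
}
```

With four simulated CPUs, claims_on_global() lets only one claim succeed
while all four still do a locked rmw on the same line; claims_on_percpu()
lets all four succeed, each on a line nobody else writes.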


Will try to reply to the rest later.
