On Sat, Feb 23, 2013 at 02:05:13PM -0300, Marcelo Tosatti wrote:
> On Sat, Feb 23, 2013 at 05:31:44PM +0200, Gleb Natapov wrote:
> > On Sat, Feb 23, 2013 at 11:48:54AM -0300, Marcelo Tosatti wrote:
> > > > > > 1. orig_irr = read irr from vapic page
> > > > > > 2. if (orig_irr != 0)
> > > > > > 3.  return false;
> > > > > > 4. kvm_make_request(KVM_REQ_EVENT)
> > > > > > 5. bool injected = !test_and_set_bit(PIR)
> > > > > > 6. if (vcpu->guest_mode && injected)
> > > > > > 7.  if (test_and_set_bit(PIR notification bit))
> > > > > > 8.          send PIR IPI
> > > > > > 9. return injected
> > > > > 
> > > > > Consider the following case:
> > > > > vcpu 0                      |                vcpu1
> > > > > send intr to vcpu1
> > > > > check irr
> > > > >                                             receive a posted intr
> > > > >                                                                       
> > > > >         pir->irr(pir is cleared, irr is set)
> > > > > injected=test_and_set_bit=true
> > > > > pir=set
> > > > > 
> > > > > Then both irr and pir have the interrupt pending, they may merge to 
> > > > > one later, but injected reported as true. Wrong.
> > > > > 
> > > > Marcelo and I discussed the lockless logic that should be used here on
> > > > the previous patch submission. All that is left for you is to implement
> > > > it. We worked hard to make the irq injection path lockless; we are not
> > > > going to add a lock now.
> > > 
> > > He is right, the scheme is still flawed (because of concurrent injectors
> > > along with HW in VMX non-root). I'd say let's add a spinlock and think
> > > about a lockless scheme in the meantime.
> > The logic proposed was (from that thread):
> >  apic_accept_interrupt() {
> >   if (PIR || IRR)
> >     return coalesced;
> >   else
> >     set PIR;
> >  }
> > 
> > Which should map to something like:
> > if (test_and_set_bit(PIR))
> >     return coalesced;
> 
> HW transfers PIR to IRR, here. Say due to PIR IPI sent 
> due to setting of a different vector.
> 
Hmm, yes. I hadn't thought about a different vector :(

> > if (irr on vapic page is set)
> >         return coalesced;
> 
> > 
> > I do not see how the race above can happen this way. Others can, though,
> > if the vcpu is outside the guest. Maybe we should deliver the interrupt
> > differently depending on whether the vcpu is in the guest or not.
> 
> The problem is with 3 contexts: two injectors and one vcpu in guest
> mode. Earlier on that thread you mentioned
> 
> "The point is that we need to check PIR and IRR atomically and this is
> impossible."
> 
> That would be one way to fix it.
> 
I do not think it fixes it. There is no guarantee that the IPI will be
processed by the remote cpu while the sending cpu is still in the locked
section, so the same race may happen regardless. As you say above, there
are 3 contexts, but only two use locks.

> > I'd rather think about proper way to do lockless injection before
> > committing anything. The case where we care about correct injection
> > status is rare, but we always care about scalability and since we
> > violate the spec by reading vapic page while vcpu is in non-root
> > operation anyway the result may be incorrect with or without the lock.
> > Our use case was not in the HW designers' minds when they designed this thing
> > obviously :(
> 
> Zhang, can you comment on whether reading vapic page with CPU in
> VMX-non root accessing it is safe?
> 
> Gleb, yes, a comment mentioning the race (instead of the spinlock) and an
> explanation of why it's believed to be benign (given how the injection
> return value is interpreted) could also work. It's ugly, though... Murphy
> is around.
The race above is not benign. It will report the interrupt as coalesced
while in reality it is injected. This may cause too many interrupts to be
injected. If this happens rarely enough, ntp may be able to fix the
resulting time drift.
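
To make the window concrete, here is a minimal user-space sketch of the
check-PIR-then-IRR sequence. It is a toy model only, not kernel code: the
toy_* names, the single 32-bit PIR/IRR words and the hw_sync_in_window flag
are all assumptions for illustration, and toy_hw_sync() merely stands in
for the hardware moving PIR into IRR after a notification for a different
vector.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical): one 32-bit word each for PIR and IRR. */
struct toy_vcpu {
    _Atomic uint32_t pir;
    _Atomic uint32_t irr;
};

static void toy_vcpu_init(struct toy_vcpu *v)
{
    atomic_init(&v->pir, 0);
    atomic_init(&v->irr, 0);
}

/* Atomically set a bit and return its previous value. */
static bool toy_test_and_set(_Atomic uint32_t *word, unsigned int bit)
{
    uint32_t mask = UINT32_C(1) << bit;

    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Stand-in for the hardware moving all pending PIR bits into IRR. */
static void toy_hw_sync(struct toy_vcpu *v)
{
    uint32_t pending = atomic_exchange(&v->pir, 0);

    atomic_fetch_or(&v->irr, pending);
}

/*
 * Returns true if the interrupt is reported as injected, false if it is
 * reported as coalesced. hw_sync_in_window forces the PIR->IRR transfer
 * to land between the two checks, which is the interleaving in question.
 */
static bool toy_accept(struct toy_vcpu *v, unsigned int vec,
                       bool hw_sync_in_window)
{
    assert(vec < 32);

    if (toy_test_and_set(&v->pir, vec))
        return false;           /* already pending in PIR: coalesced */

    if (hw_sync_in_window)
        toy_hw_sync(v);         /* our own bit moves from PIR to IRR */

    if (atomic_load(&v->irr) & (UINT32_C(1) << vec))
        return false;           /* sees its own bit, reports coalesced */

    return true;
}
```

On a freshly initialised toy_vcpu, toy_accept(&v, 5, false) reports
injected and a repeat of the same call reports coalesced, as intended.
With hw_sync_in_window set on a fresh toy_vcpu, the very first call
reports coalesced even though bit 5 ends up set in IRR, which is the
misreport described above.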

> 
> OTOH spinlock is not the end of the world, can figure out something later 
> (we've tried without success so far).
It serializes all injections into a vcpu. I no longer believe that we are
safe even with the lock, for the reason I mention above. We could use the
pir->on bit as a lock, but that only emphasises how ridiculous serializing
injections becomes.

--
                        Gleb.