On Sun, Jul 11, 2010 at 02:03:34PM -0600, Alex Williamson wrote:
> On Sun, 2010-07-11 at 22:23 +0300, Michael S. Tsirkin wrote:
> > On Sun, Jul 11, 2010 at 01:21:18PM -0600, Alex Williamson wrote:
> > > On Sun, 2010-07-11 at 21:54 +0300, Michael S. Tsirkin wrote:
> > > > On Sun, Jul 11, 2010 at 09:30:59PM +0300, Avi Kivity wrote:
> > > > > On 07/11/2010 09:26 PM, Alex Williamson wrote:
> > > > > >On Sun, 2010-07-11 at 21:14 +0300, Avi Kivity wrote:
> > > > > >>On 07/11/2010 09:09 PM, Alex Williamson wrote:
> > > > > >>>For device assignment, we need to know when the VM writes an end
> > > > > > >>>of interrupt to the APIC, which allows us to de-assert the interrupt
> > > > > >>>line and clear the DisINTx bit. Add a new wrapper for ioapic
> > > > > >>>generated interrupts with a callback on eoi and create an interface
> > > > > >>>for drivers to be notified on eoi.
> > > > > >>>
> > > > > > >>You aren't going to get this with kvm's in-kernel irqchip, so we need a
> > > > > > >>new interface there.
> > > > > >Registering an eventfd for the eoi seems like a reasonable
> > > > > >alternative.
> > > > >
> > > > > I'm worried about that racing (with what?)
> > > >
> > > > With device asserting the interrupt?
> > > > Need to make sure that all possible scenarios work well:
> > > >
> > > > device asserts interrupt
> > > > driver clears interrupt
> > > > device asserts interrupt
> > > > eoi
> > > >
> > > > device asserts interrupt
> > > > driver clears interrupt
> > > > eoi
> > > > device asserts interrupt
> > > >
> > > > etc
> > > >
> > > > Not that I see issues, these are things we need to check.
> > >
> > > I think those are all protected by host and qemu vfio drivers managing
> > > DisINTx. The way I understand it to work now is:
> > >
> > > device asserts interrupt
> > > interrupt lands in host vfio driver
> > > host vfio sets DisINTx on the device
> > > host vfio sends eventfd
> > > eventfd lands in qemu vfio, does a qemu_set_irq
> > > ... guest processes
> > > guest writes eoi to apic, lands back in qemu vfio driver
> > > qemu vfio deasserts qemu interrupt
> > > qemu vfio clears DisINTx
> > >
> > > So I don't think there's a race as long as ordering is sane for toggling
> > > DisINTx. Thanks,
> > >
> > > Alex
> > >
> >
> > What about threaded interrupts? I think (correct me if I am wrong)
> > that they work like this:
> >
> > device asserts interrupt
> > guest disables interrupt
>
> Is this the guest manipulating DisINTx itself? I suppose it could be a
> device dependent disable as well.
It can manipulate it, so we need to virtualize it, but that's a
separate issue.
> > eoi
> > guest enables interrupt
> > driver clears interrupt
>
> These two are hopefully reversed or else the driver is expecting to
> clear and potentially reassert interrupts anyway.
Yes. Sorry.
> > device asserts interrupt
> >
> > If so, your code will clear DisINTx immediately which
> > will always get us another host interrupt:
> > performance will be hurt. I am also not sure
> > we'll not lose interrupts.
>
> Level interrupts are lossy afaik, if it gets cleared but an interrupt
> condition still exists, it should be reasserted.
Yes but I mean we won't interrupt the guest. So it will stay disabled
forever.
> > It seems we need to track interrupt disable/enable as well, and only
> > clear DisINTx after eoi with interrupts enabled. Not sure what is the
> > interface for this.
>
> If a driver uses device dependent code to disable interrupts, there's no
> issue, we'll clear DisINTx, but the device still won't generate an
> interrupt until the dependent code is re-enabled by the guest (assuming
> there's no cross talk between DisINTx and device dependent components).
>
> For the case that a guest driver disables via DisINTx, it seems easy to
> trap and track that. So we get:
>
> device asserts interrupt
> guest disables interrupt
> (trapped, qemu-vfio sets intx.guest_disabled = 1)
> eoi
> (qemu-vfio deasserts qemu interrupts, but because of above doesn't
> clear DisINTx)
> guest enables interrupt
> (allowed to pass through, intx.guest_disabled = 0)
> driver clears interrupt
> device asserts interrupt
>
> I've already got an intx.pending bit, so I think this just changes the eoi to:
>
>     vdev->intx.pending = 0;
>     qemu_set_irq(vdev->pdev.irq[vdev->intx.pin], 0);
>     if (!vdev->intx.guest_disabled) {
>         vfio_unmask_intx(vdev);
>     }
>
> Writing the command register DisINTx bit then just gets some kind of:
>
>     if ((cmd & PCI_COMMAND_INTX_DISABLE) && intx.pending) {
>         intx.guest_disabled = 1;
>         cmd &= ~PCI_COMMAND_INTX_DISABLE;
>     } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && intx.guest_disabled) {
>         intx.guest_disabled = 0;
>     }
>     ... allow write
>
> That work? Thanks,
>
> Alex
No, I mean guest OS disables the specific interrupt with
disable_irq.
--
MST
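
For reference, the DisINTx tracking proposed in the quoted text can be
modelled as a small state machine. This is only a sketch under stated
assumptions: struct intx_model and the model_* functions are hypothetical
names made up for illustration, not qemu-vfio code, and the guard that
refuses to unmask while an interrupt is still pending is an added
assumption. It covers only the case where the guest toggles DisINTx in the
command register. A guest that masks the line with disable_irq never
touches that register, so this model would not notice it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state discussed in the thread. */
struct intx_model {
    bool pending;        /* interrupt forwarded to guest, no eoi seen yet */
    bool guest_disabled; /* guest set DisINTx while an interrupt was pending */
    bool host_masked;    /* DisINTx currently set on the physical device */
    bool line_asserted;  /* virtual interrupt line as seen by the guest */
};

/* Device asserted INTx: host masks it and the guest line is raised. */
static void model_host_irq(struct intx_model *s)
{
    assert(s != NULL);
    s->host_masked = true;
    s->pending = true;
    s->line_asserted = true;
}

/* Guest wrote eoi: lower the line, unmask only if the guest allows it. */
static void model_eoi(struct intx_model *s)
{
    assert(s != NULL);
    s->pending = false;
    s->line_asserted = false;
    if (!s->guest_disabled) {
        s->host_masked = false;
    }
}

/* Guest wrote the DisINTx bit of the command register (trapped). */
static void model_guest_set_disintx(struct intx_model *s, bool disable)
{
    assert(s != NULL);
    if (disable && s->pending) {
        /* Remember the request; the device is already masked by the host. */
        s->guest_disabled = true;
    } else if (!disable && s->guest_disabled) {
        s->guest_disabled = false;
        /* Assumption of this sketch: stay masked until eoi if still pending. */
        if (!s->pending) {
            s->host_masked = false;
        }
    }
}
```

Walking the quoted sequence through this model (assert, guest disable, eoi,
guest enable) leaves the device masked across the eoi and unmasks it only
when the guest re-enables INTx.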