Re: [Qemu-devel] [RFC v3 05/10] vfio: Add initial IRQ support in platform device

2014-06-26 Thread Eric Auger
On 06/25/2014 11:40 PM, Alex Williamson wrote:
 On Wed, 2014-06-25 at 23:28 +0200, Alexander Graf wrote:
 On 02.06.14 09:49, Eric Auger wrote:
 This patch brings initial support for device IRQ assignment to a
 KVM guest. The code is inspired by the PCI INTx code.

 General principle of IRQ handling:

 When a physical IRQ occurs, the VFIO driver signals an eventfd that was
 registered by the QEMU VFIO platform device. The eventfd handler
 (vfio_intp_interrupt) injects the IRQ through QEMU/KVM and also
 disables the MMIO region fast path (where MMIO regions are mapped as
 RAM). The purpose is to trap the guest's reset of the IRQ status
 register. The physical interrupt is unmasked on the first read/write
 in any MMIO region. It was masked in the VFIO driver at the instant it
 signaled the eventfd.

 This doesn't sound like a very promising generic scheme to me. I can 
 easily see devices requiring 2 or 3 or more accesses until they're 
 pulling down the IRQ line. During that time interrupts will keep firing, 
 queue up in the irqfd and get at us as spurious interrupts.

 Can't we handle it like PCI where we require devices to not share an 
 interrupt line? Then we can just wait until the EOI in the interrupt 
 controller.
Hi Alex,

Actually, I transposed what was done for PCI INTx. Admittedly, the virtual
IRQ completion instant is not precise, but as Alex says later on, irqfd
should be used whenever possible, for both precision and performance.
Given the performance of this legacy solution for IRQ-intensive IP, I
would discourage using that mode anyway. This is why I did not plan to
invest more in this mode.
 
 QEMU's interrupt abstraction makes this really difficult and something
 that's not generally necessary outside of device assignment.  I spent a
 long time trying to figure out how we'd do it for PCI before I came up
 with this super generic hack that works surprisingly well.  Yes, we may
 get additional spurious interrupts, but from a host perspective they're
 rate limited by the guest poking hardware, so there's a feedback loop.
 Also note that assuming this is the same approach we take for PCI, this
 mode is only used for the non-KVM accelerated path.
Yes, this is again exactly the same approach as for PCI. We now have full
irqfd + resamplefd support.

Best Regards

Eric
  When we have a KVM
 irqchip that supports a resampling irqfd then we can get an eventfd
 signal back at the point when we should unmask the interrupt on the
 host.  Creating a cross-architecture QEMU interface to give you a
 callback when the architecture's notion of a resampling event occurs is
 not a trivial undertaking.  Thanks,
 
 Alex
 




Re: [Qemu-devel] [RFC v3 05/10] vfio: Add initial IRQ support in platform device

2014-06-25 Thread Alexander Graf


On 02.06.14 09:49, Eric Auger wrote:

This patch brings initial support for device IRQ assignment to a
KVM guest. The code is inspired by the PCI INTx code.

General principle of IRQ handling:

When a physical IRQ occurs, the VFIO driver signals an eventfd that was
registered by the QEMU VFIO platform device. The eventfd handler
(vfio_intp_interrupt) injects the IRQ through QEMU/KVM and also
disables the MMIO region fast path (where MMIO regions are mapped as
RAM). The purpose is to trap the guest's reset of the IRQ status
register. The physical interrupt is unmasked on the first read/write
in any MMIO region. It was masked in the VFIO driver at the instant it
signaled the eventfd.


This doesn't sound like a very promising generic scheme to me. I can 
easily see devices requiring 2 or 3 or more accesses until they're 
pulling down the IRQ line. During that time interrupts will keep firing, 
queue up in the irqfd and get at us as spurious interrupts.


Can't we handle it like PCI where we require devices to not share an 
interrupt line? Then we can just wait until the EOI in the interrupt 
controller.



Alex



Only a single IRQ can be forwarded to the guest at a time, i.e. before a
new virtual IRQ is injected, the previously active one must have
completed.

When no IRQ is pending anymore, the fast path can be restored. This is
done on mmap_timer scheduling.
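
For readers following the thread, the flow described above (inject one IRQ at a time, queue the rest, restore the fast path once nothing is pending) can be modelled roughly as below. This is only a toy sketch and not the code from the patch: the IrqModel struct and the on_physical_irq, on_mmio_access and on_mmap_timer helpers are hypothetical names, and the fixed-size FIFO is an assumption made to keep the example short.

```c
/* Toy model of the non-irqfd IRQ path; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef struct {
    bool   mmap_enabled;         /* true: MMIO fast path (regions mapped as RAM) */
    int    active_irq;           /* IRQ currently injected in the guest, -1 if none */
    int    pending[MAX_PENDING]; /* FIFO of IRQs waiting for the active one */
    size_t head, count;
    int    injected;             /* number of virtual injections, for illustration */
} IrqModel;

static void model_init(IrqModel *m)
{
    m->mmap_enabled = true;
    m->active_irq = -1;
    m->head = 0;
    m->count = 0;
    m->injected = 0;
}

static void inject(IrqModel *m, int irq)
{
    m->active_irq = irq;
    m->injected++;
    m->mmap_enabled = false;     /* slow path: guest MMIO accesses now trap */
}

/* Eventfd handler. The physical IRQ was already masked by the host driver
 * when it signaled the eventfd. Returns false if the toy queue is full. */
static bool on_physical_irq(IrqModel *m, int irq)
{
    if (m->active_irq < 0) {
        inject(m, irq);
        return true;
    }
    if (m->count == MAX_PENDING) {
        return false;
    }
    m->pending[(m->head + m->count) % MAX_PENDING] = irq;
    m->count++;
    return true;
}

/* First trapped MMIO access: treated as completion of the active IRQ.
 * Returns the IRQ to unmask on the host, or -1 if none was active. */
static int on_mmio_access(IrqModel *m)
{
    int done = m->active_irq;
    if (done < 0) {
        return -1;
    }
    m->active_irq = -1;
    if (m->count > 0) {
        int next = m->pending[m->head];
        m->head = (m->head + 1) % MAX_PENDING;
        m->count--;
        inject(m, next);         /* next queued IRQ becomes the active one */
    }
    return done;
}

/* Timer callback: restore the fast path once nothing is active or pending. */
static void on_mmap_timer(IrqModel *m)
{
    if (m->active_irq < 0 && m->count == 0) {
        m->mmap_enabled = true;
    }
}
```

Note that the model completes an IRQ on the very first trapped access, which is exactly the imprecision discussed in this thread: a device that needs several accesses to deassert its line would be unmasked too early.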

irqfd support will be added in a subsequent patch. irqfd brings a
framework where the eventfd is handled on the kernel side instead of in
user space, as is currently done, which improves performance.

Although the code is prepared to support multiple IRQs, this has not
been tested at this stage.

Tested on Calxeda Midway xgmac which can be directly assigned to one
guest (unfortunately only the main IRQ is exercised). A KVM patch is
required to invalidate stage2 entries on RAM memory region destruction
(https://patches.linaro.org/27691/). Without that patch, the slow/fast
path switch cannot work.

Changes v2 -> v3:

- Move mmap_timer and mmap_timeout into the new VFIODevice struct as
   part of the PCI/platform factorization.
- multiple IRQ handling (a pending IRQ queue is added) - not tested -
- create vfio_mmap_set_enabled as in PCI code
- name of irq changed in virt

Signed-off-by: Eric Auger eric.au...@linaro.org





Re: [Qemu-devel] [RFC v3 05/10] vfio: Add initial IRQ support in platform device

2014-06-25 Thread Alex Williamson
On Wed, 2014-06-25 at 23:28 +0200, Alexander Graf wrote:
 On 02.06.14 09:49, Eric Auger wrote:
  This patch brings initial support for device IRQ assignment to a
  KVM guest. The code is inspired by the PCI INTx code.
 
  General principle of IRQ handling:
 
  When a physical IRQ occurs, the VFIO driver signals an eventfd that was
  registered by the QEMU VFIO platform device. The eventfd handler
  (vfio_intp_interrupt) injects the IRQ through QEMU/KVM and also
  disables the MMIO region fast path (where MMIO regions are mapped as
  RAM). The purpose is to trap the guest's reset of the IRQ status
  register. The physical interrupt is unmasked on the first read/write
  in any MMIO region. It was masked in the VFIO driver at the instant it
  signaled the eventfd.
 
 This doesn't sound like a very promising generic scheme to me. I can 
 easily see devices requiring 2 or 3 or more accesses until they're 
 pulling down the IRQ line. During that time interrupts will keep firing, 
 queue up in the irqfd and get at us as spurious interrupts.
 
 Can't we handle it like PCI where we require devices to not share an 
 interrupt line? Then we can just wait until the EOI in the interrupt 
 controller.

QEMU's interrupt abstraction makes this really difficult and something
that's not generally necessary outside of device assignment.  I spent a
long time trying to figure out how we'd do it for PCI before I came up
with this super generic hack that works surprisingly well.  Yes, we may
get additional spurious interrupts, but from a host perspective they're
rate limited by the guest poking hardware, so there's a feedback loop.
Also note that assuming this is the same approach we take for PCI, this
mode is only used for the non-KVM accelerated path.  When we have a KVM
irqchip that supports a resampling irqfd then we can get an eventfd
signal back at the point when we should unmask the interrupt on the
host.  Creating a cross-architecture QEMU interface to give you a
callback when the architecture's notion of a resampling event occurs is
not a trivial undertaking.  Thanks,

Alex
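
To make the trigger/resample handshake mentioned above more concrete, here is a rough single-process sketch using plain Linux eventfds. It is only an illustration under stated assumptions: in a real setup the irqchip side is played by KVM rather than by user code, and the ResampleModel struct and every helper name below are hypothetical. Only eventfd(), read(), write() and close() are real calls.

```c
/* Trigger/resample handshake simulated with two eventfds (Linux only).
 * Helper names are hypothetical; both sides run in one process here. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

typedef struct {
    int  trigger_fd;   /* host side signals: "the line is asserted" */
    int  resample_fd;  /* irqchip side signals: "EOI seen, re-check the line" */
    bool host_masked;  /* physical IRQ currently masked on the host */
} ResampleModel;

static int resample_model_init(ResampleModel *r)
{
    r->trigger_fd = eventfd(0, EFD_NONBLOCK);
    r->resample_fd = eventfd(0, EFD_NONBLOCK);
    r->host_masked = false;
    return (r->trigger_fd >= 0 && r->resample_fd >= 0) ? 0 : -1;
}

/* Add 1 to the eventfd counter. Returns 0 on success, -1 on failure. */
static int signal_fd(int fd)
{
    uint64_t one = 1;
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Read and reset the counter. Returns 0 if the fd was not signaled. */
static uint64_t drain_fd(int fd)
{
    uint64_t count = 0;
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;  /* non-blocking read failed: nothing pending */
    }
    return count;
}

/* Host side: the physical IRQ fires, is masked, and the trigger is signaled. */
static void host_irq_fires(ResampleModel *r)
{
    r->host_masked = true;
    signal_fd(r->trigger_fd);
}

/* Irqchip side: consume the trigger; true if an interrupt was delivered. */
static bool guest_takes_irq(ResampleModel *r)
{
    return drain_fd(r->trigger_fd) > 0;
}

/* Irqchip side: the guest EOI is reported through the resample fd. */
static void guest_eoi(ResampleModel *r)
{
    signal_fd(r->resample_fd);
}

/* Host side: unmask only once the resample fd has been signaled. */
static bool host_poll_resample(ResampleModel *r)
{
    if (drain_fd(r->resample_fd) > 0) {
        r->host_masked = false;
        return true;
    }
    return false;
}
```

The point of the sketch is the ordering: the host keeps the interrupt masked until the resample signal arrives, instead of guessing completion from the first MMIO access.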