Re: [Qemu-devel] [RFC v3 05/10] vfio: Add initial IRQ support in platform device
On 06/25/2014 11:40 PM, Alex Williamson wrote:
> On Wed, 2014-06-25 at 23:28 +0200, Alexander Graf wrote:
>> On 02.06.14 09:49, Eric Auger wrote:
>>> This patch brings initial support for device IRQ assignment to a KVM guest. Code is inspired by PCI INTx code.
>>>
>>> General principle of IRQ handling: when a physical IRQ occurs, VFIO driver signals an eventfd that was registered by the QEMU VFIO platform device. The eventfd handler (vfio_intp_interrupt) injects the IRQ through QEMU/KVM and also disables MMIO region fast path (where MMIO regions are mapped as RAM). The purpose is to trap the guest's reset of the IRQ status register. The physical interrupt is unmasked on the first read/write in any MMIO region. It was masked in the VFIO driver at the instant it signaled the eventfd.
>>
>> This doesn't sound like a very promising generic scheme to me. I can easily see devices requiring 2 or 3 or more accesses until they're pulling down the IRQ line. During that time interrupts will keep firing, queue up in the irqfd and get at us as spurious interrupts.
>>
>> Can't we handle it like PCI where we require devices to not share an interrupt line? Then we can just wait until the EOI in the interrupt controller.

Hi Alex,

Actually I transposed what was done for PCI INTx. For sure the virtual IRQ completion instant is not precise, but as Alex says later on, irqfd should be used whenever possible, for both precision and performance. Given the performance of this legacy solution for IRQ-intensive IP, I would discourage using that mode anyway. This is why I did not plan to invest more in this mode.

> QEMU's interrupt abstraction makes this really difficult and something that's not generally necessary outside of device assignment. I spent a long time trying to figure out how we'd do it for PCI before I came up with this super generic hack that works surprisingly well.
>
> Yes, we may get additional spurious interrupts, but from a host perspective they're rate limited by the guest poking hardware, so there's a feedback loop. Also note that assuming this is the same approach we take for PCI, this mode is only used for the non-KVM accelerated path.

Yes, this is again exactly the same approach as for PCI. We now have full irqfd + resamplefd support.

Best Regards

Eric

> When we have a KVM irqchip that supports a resampling irqfd then we can get an eventfd signal back at the point when we should unmask the interrupt on the host. Creating a cross-architecture QEMU interface to give you a callback when the architecture's notion of a resampling event occurs is not a trivial undertaking. Thanks,
>
> Alex
Re: [Qemu-devel] [RFC v3 05/10] vfio: Add initial IRQ support in platform device
On 02.06.14 09:49, Eric Auger wrote:
> This patch brings initial support for device IRQ assignment to a KVM guest. Code is inspired by PCI INTx code.
>
> General principle of IRQ handling: when a physical IRQ occurs, VFIO driver signals an eventfd that was registered by the QEMU VFIO platform device. The eventfd handler (vfio_intp_interrupt) injects the IRQ through QEMU/KVM and also disables MMIO region fast path (where MMIO regions are mapped as RAM). The purpose is to trap the guest's reset of the IRQ status register. The physical interrupt is unmasked on the first read/write in any MMIO region. It was masked in the VFIO driver at the instant it signaled the eventfd.

This doesn't sound like a very promising generic scheme to me. I can easily see devices requiring 2 or 3 or more accesses until they're pulling down the IRQ line. During that time interrupts will keep firing, queue up in the irqfd and get at us as spurious interrupts.

Can't we handle it like PCI where we require devices to not share an interrupt line? Then we can just wait until the EOI in the interrupt controller.

Alex

> A single IRQ can be forwarded to the guest at a time, i.e. before a new virtual IRQ is injected, the previously active one must have completed.
>
> When no IRQ is pending anymore, fast path can be restored. This is done on mmap_timer scheduling.
>
> irqfd support will be added in a subsequent patch. irqfd brings a framework where the eventfd is handled on the kernel side instead of in user space as currently done, hence improving the performance.
>
> Although the code is prepared to support multiple IRQs, this is not tested at this stage.
>
> Tested on Calxeda Midway xgmac which can be directly assigned to one guest (unfortunately only the main IRQ is exercised). A KVM patch is required to invalidate stage2 entries on RAM memory region destruction (https://patches.linaro.org/27691/). Without that patch, slow/fast path switch cannot work.
>
> change v2 - v3:
> - Move mmap_timer and mmap_timeout into the new VFIODevice struct as PCI/platform factorization.
> - multiple IRQ handling (a pending IRQ queue is added) - not tested -
> - create vfio_mmap_set_enabled as in PCI code
> - name of irq changed in virt
>
> Signed-off-by: Eric Auger eric.au...@linaro.org
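
The slow/fast path switching and the pending IRQ queue in the patch description can be pictured with a small state machine. What follows is only a toy Python model written for illustration, not the QEMU C code from the patch: the class name, method names and fields are all hypothetical, and timing is ignored. It assumes that the first trapped MMIO access counts as completion of the active virtual IRQ, as the description states.

```python
from collections import deque


class LegacyIrqPath:
    """Toy model (hypothetical names) of the eventfd-based, non-irqfd path."""

    def __init__(self):
        self.mmap_enabled = True   # fast path: MMIO regions mapped as RAM
        self.active = None         # virtual IRQ currently injected in the guest
        self.pending = deque()     # IRQs waiting for the active one to complete
        self.host_masked = set()   # physical IRQs masked by the VFIO driver
        self.injected = []         # log of virtual IRQs raised, in order

    def _inject(self, irq):
        self.active = irq
        self.injected.append(irq)
        self.mmap_enabled = False  # slow path: guest MMIO accesses now trap

    def on_eventfd(self, irq):
        """A physical IRQ fired; the driver masked it and signaled the eventfd."""
        self.host_masked.add(irq)
        if self.active is None:
            self._inject(irq)
        else:
            self.pending.append(irq)  # only one IRQ is forwarded at a time

    def on_mmio_access(self):
        """First trapped access: unmask the physical IRQ, forward the next one."""
        if self.active is None:
            return
        self.host_masked.discard(self.active)
        self.active = None
        if self.pending:
            self._inject(self.pending.popleft())

    def on_mmap_timer(self):
        """Timer callback: restore the fast path once nothing is in flight."""
        if self.active is None and not self.pending:
            self.mmap_enabled = True
        return self.mmap_enabled


path = LegacyIrqPath()
path.on_eventfd(5)            # injected immediately, fast path disabled
path.on_eventfd(7)            # queued behind IRQ 5
path.on_mmio_access()         # completes 5, injects 7
path.on_mmio_access()         # completes 7
print(path.injected, path.on_mmap_timer())
```

In this model the timer only re-enables the mmap fast path when no IRQ is active or queued, which mirrors "when no IRQ is pending anymore, fast path can be restored".
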
Re: [Qemu-devel] [RFC v3 05/10] vfio: Add initial IRQ support in platform device
On Wed, 2014-06-25 at 23:28 +0200, Alexander Graf wrote:
> On 02.06.14 09:49, Eric Auger wrote:
>> This patch brings initial support for device IRQ assignment to a KVM guest. Code is inspired by PCI INTx code.
>>
>> General principle of IRQ handling: when a physical IRQ occurs, VFIO driver signals an eventfd that was registered by the QEMU VFIO platform device. The eventfd handler (vfio_intp_interrupt) injects the IRQ through QEMU/KVM and also disables MMIO region fast path (where MMIO regions are mapped as RAM). The purpose is to trap the guest's reset of the IRQ status register. The physical interrupt is unmasked on the first read/write in any MMIO region. It was masked in the VFIO driver at the instant it signaled the eventfd.
>
> This doesn't sound like a very promising generic scheme to me. I can easily see devices requiring 2 or 3 or more accesses until they're pulling down the IRQ line. During that time interrupts will keep firing, queue up in the irqfd and get at us as spurious interrupts.
>
> Can't we handle it like PCI where we require devices to not share an interrupt line? Then we can just wait until the EOI in the interrupt controller.

QEMU's interrupt abstraction makes this really difficult and something that's not generally necessary outside of device assignment. I spent a long time trying to figure out how we'd do it for PCI before I came up with this super generic hack that works surprisingly well.

Yes, we may get additional spurious interrupts, but from a host perspective they're rate limited by the guest poking hardware, so there's a feedback loop. Also note that assuming this is the same approach we take for PCI, this mode is only used for the non-KVM accelerated path.

When we have a KVM irqchip that supports a resampling irqfd then we can get an eventfd signal back at the point when we should unmask the interrupt on the host. Creating a cross-architecture QEMU interface to give you a callback when the architecture's notion of a resampling event occurs is not a trivial undertaking. Thanks,

Alex
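
To make the spurious-interrupt concern and the resampling alternative concrete, here is a deliberately simplified Python sketch, not QEMU or kernel code. It assumes a level-triggered device that keeps its line asserted until the guest handler has made a given number of register accesses, and it compares two host unmask policies: unmask on the first (and every subsequent) trapped MMIO access, versus unmask only when the guest signals EOI, which is roughly the point a resampling irqfd would report. The function name and the policy strings are hypothetical.

```python
def count_host_interrupts(accesses_to_clear, unmask_on):
    """Count how often the host IRQ fires for ONE device event (toy model).

    accesses_to_clear: guest register accesses needed before the device
                       deasserts its level-triggered line (assumption).
    unmask_on:         "first_access" (legacy path) or "eoi" (resample-like).
    """
    if unmask_on not in ("first_access", "eoi"):
        raise ValueError("unknown unmask policy: %r" % (unmask_on,))

    line_asserted = True
    masked = False
    fired = 0

    def maybe_fire():
        # The host IRQ fires only if the line is high and not masked;
        # the driver masks it again as soon as it signals the eventfd.
        nonlocal masked, fired
        if line_asserted and not masked:
            masked = True
            fired += 1

    maybe_fire()  # the genuine interrupt

    for access in range(1, accesses_to_clear + 1):
        if access == accesses_to_clear:
            line_asserted = False  # device finally pulls the line down
        if unmask_on == "first_access":
            masked = False         # each trapped access unmasks the host IRQ
            maybe_fire()           # still asserted => spurious interrupt

    if unmask_on == "eoi":
        masked = False             # unmask once, after the handler is done
        maybe_fire()

    return fired


for n in (1, 3):
    print(n, count_host_interrupts(n, "first_access"),
          count_host_interrupts(n, "eoi"))
```

Under these assumptions the access-based policy fires once per guest access until the line drops, so the extra interrupts are bounded by how often the guest pokes the hardware, while the EOI-based policy fires once. Real behaviour depends on timing and on the device, which this sketch does not capture.
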