Re: [Qemu-devel] [PATCH v5 00/18] IOMMU: Enable interrupt remapping for Intel IOMMU

Jan Kiszka Mon, 02 May 2016 22:43:14 -0700

On 2016-05-03 07:30, Peter Xu wrote:
> On Tue, May 03, 2016 at 06:38:28AM +0200, Jan Kiszka wrote:
>> On 2016-05-03 05:22, Peter Xu wrote:
>>> On Fri, Apr 29, 2016 at 09:52:14PM +0200, Radim Krčmář wrote:
>>>> 2016-04-28 17:18+0800, Peter Xu:
>>>>> On Thu, Apr 28, 2016 at 09:19:28AM +0200, Jan Kiszka wrote:
>>>>>> Instead of fiddling with irq routes for the IOAPIC - where we don't need
>>>>>> it -, I would suggest to do the following: Send IOAPIC events via
>>>>>> kvm_irqchip_send_msi to the kernel. Only irqfd users (vhost, vfio)
>>>>>> should use the pattern you are now applying to the IOAPIC: establish
>>>>>> static routes as an irqfd is set up, and that route should be translated
>>>>>> by the iommu first, register an IEC notifier to update any affected
>>>>>> route when the iommu translation changes.
>>>>>
>>>>> Yes, maybe that's the right thing to do. Or say, when split irqchip,
>>>>> IOAPIC can avoid using GSI routes any more. If with that, I should
>>>>> also remove lots of things, like: IEC notifiers for IOAPIC, and all
>>>>> things related to msi route sync-up in IOAPIC codes with KVM (so I
>>>>> suppose we will save 24 gsi route entries for KVM, which sounds
>>>>> good).
>>>>
>>>> Sadly, we can't get rid of those GSI routes.  KVM uses them to decide
>>>> whether it should forward EOI to userspace.  And QEMU also has to remap
>>>> them, because KVM uses dest_id for that decision.
>>>
>>> Thanks to point out.  Actually I have started to test the new patch
>>> and did encountered issue when using split irqchip mode. The above
>>> explains. :)
>>>
>>> So I think I will keep this part of codes un-touched in this
>>> series, and I can move forward with IR changes.
>>>
>>> However, I am still wondering whether we can avoid this in KVM when
>>> when using split irqchip mode? E.g., what if we do not do this check
>>> in KVM and let all EOI be passed to userspace? Like:
>>>
>>> ---
>>>
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 36591fa..c0a6e51 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -936,10 +936,6 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic 
>>> *apic, int vector)
>>>  {
>>>         int trigger_mode;
>>>
>>> -       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
>>> -       if (!kvm_ioapic_handles_vector(apic, vector))
>>> -               return;
>>> -
>>>         /* Request a KVM exit to inform the userspace IOAPIC. */
>>>         if (irqchip_split(apic->vcpu->kvm)) {
>>>                 apic->vcpu->arch.pending_ioapic_eoi = vector;
>>> @@ -947,6 +943,10 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic 
>>> *apic, int vector)
>>>                 return;
>>>         }
>>>
>>> +       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
>>> +       if (!kvm_ioapic_handles_vector(apic, vector))
>>> +               return;
>>> +
>>>         if (apic_test_vector(vector, apic->regs + APIC_TMR))
>>>                 trigger_mode = IOAPIC_LEVEL_TRIG;
>>>         else
>>>
>>> ---
>>>
>>> I believe this will brings more user-space context switches, I just
>>> do not know how this will impact the system (only for curiosity :).
>>
>> Yes, this doesn't look good from the performance POV. Given that most
>> EOIs of the APICs will not trigger a message to an IOAPIC and userspace
>> exits are expensive, that should be measurable.
>>
>> But you should optimize route updating a bit: I noticed real delays
>> (almost a second) when reprogramming all IOAPIC pins during Jailhouse
>> handover. That's because of the heavyweight synchronize_srcu we have on
>> each route change. Even if that is an extreme case, you should try to
>> reduce route updates to only those that were actually affected by an
>> IRTE change.
> 
> Yes, this solution will possibly need one more IOMMU API besides
> int_remap(), for IOAPIC to know whether specific IOAPIC entry needs
> to be updated (IOAPIC needs to know the index information of each
> IOAPIC entry, and see whether it is matched with IEC notified index
> and mask, in ioapic_iec_notifier()).
> 
> Another problem that may cause this issue is that: now we are
> committing route information every time when updating route entries
> (please check kvm_update_routing_entry()). IMHO this is not
> necessary - we will need 25 times of committing for each IEC
> invalication, while actually we only need one at the end.
> 
> So... We'll have another solution for this: how about not committing
> for each update, and just do one commit after all changes settled?
> This seems to be much cleaner, and easier comparing to the above
> proposal.


If you see a number of invalidation requests queued in a row, you may
also process them atomically from the guest POV, which includes a single
route update at the end, before the guest gets told that everything was
done.

Jan

> 
> (These two solutions actually solves different problems, and either
>  of them should help in reducing the 1 sec delay mentioned
>  above. For now, I'd prefer first choose the 2nd solution to do
>  explicit route commit, and add the 1st one into my todo list)
> 
> Thanks,
> 
> -- peterx
>

Re: [Qemu-devel] [PATCH v5 00/18] IOMMU: Enable interrupt remapping for Intel IOMMU

Reply via email to