Re: [Qemu-devel] [PATCH v5 00/18] IOMMU: Enable interrupt remapping for Intel IOMMU

Jan Kiszka Mon, 02 May 2016 23:11:31 -0700

On 2016-05-03 08:00, Peter Xu wrote:
> On Tue, May 03, 2016 at 07:40:50AM +0200, Jan Kiszka wrote:
>> On 2016-05-03 07:30, Peter Xu wrote:
>>> On Tue, May 03, 2016 at 06:38:28AM +0200, Jan Kiszka wrote:
>>>> On 2016-05-03 05:22, Peter Xu wrote:
>>>>> On Fri, Apr 29, 2016 at 09:52:14PM +0200, Radim Krčmář wrote:
>>>>>> 2016-04-28 17:18+0800, Peter Xu:
>>>>>>> On Thu, Apr 28, 2016 at 09:19:28AM +0200, Jan Kiszka wrote:
>>>>>>>> Instead of fiddling with irq routes for the IOAPIC - where we don't 
>>>>>>>> need
>>>>>>>> it -, I would suggest to do the following: Send IOAPIC events via
>>>>>>>> kvm_irqchip_send_msi to the kernel. Only irqfd users (vhost, vfio)
>>>>>>>> should use the pattern you are now applying to the IOAPIC: establish
>>>>>>>> static routes as an irqfd is set up, and that route should be 
>>>>>>>> translated
>>>>>>>> by the iommu first, register an IEC notifier to update any affected
>>>>>>>> route when the iommu translation changes.
>>>>>>>
>>>>>>> Yes, maybe that's the right thing to do. Or say, when split irqchip,
>>>>>>> IOAPIC can avoid using GSI routes any more. If with that, I should
>>>>>>> also remove lots of things, like: IEC notifiers for IOAPIC, and all
>>>>>>> things related to msi route sync-up in IOAPIC codes with KVM (so I
>>>>>>> suppose we will save 24 gsi route entries for KVM, which sounds
>>>>>>> good).
>>>>>>
>>>>>> Sadly, we can't get rid of those GSI routes.  KVM uses them to decide
>>>>>> whether it should forward EOI to userspace.  And QEMU also has to remap
>>>>>> them, because KVM uses dest_id for that decision.
>>>>>
>>>>> Thanks to point out.  Actually I have started to test the new patch
>>>>> and did encountered issue when using split irqchip mode. The above
>>>>> explains. :)
>>>>>
>>>>> So I think I will keep this part of codes un-touched in this
>>>>> series, and I can move forward with IR changes.
>>>>>
>>>>> However, I am still wondering whether we can avoid this in KVM when
>>>>> when using split irqchip mode? E.g., what if we do not do this check
>>>>> in KVM and let all EOI be passed to userspace? Like:
>>>>>
>>>>> ---
>>>>>
>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>> index 36591fa..c0a6e51 100644
>>>>> --- a/arch/x86/kvm/lapic.c
>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>> @@ -936,10 +936,6 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic 
>>>>> *apic, int vector)
>>>>>  {
>>>>>         int trigger_mode;
>>>>>
>>>>> -       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
>>>>> -       if (!kvm_ioapic_handles_vector(apic, vector))
>>>>> -               return;
>>>>> -
>>>>>         /* Request a KVM exit to inform the userspace IOAPIC. */
>>>>>         if (irqchip_split(apic->vcpu->kvm)) {
>>>>>                 apic->vcpu->arch.pending_ioapic_eoi = vector;
>>>>> @@ -947,6 +943,10 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic 
>>>>> *apic, int vector)
>>>>>                 return;
>>>>>         }
>>>>>
>>>>> +       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
>>>>> +       if (!kvm_ioapic_handles_vector(apic, vector))
>>>>> +               return;
>>>>> +
>>>>>         if (apic_test_vector(vector, apic->regs + APIC_TMR))
>>>>>                 trigger_mode = IOAPIC_LEVEL_TRIG;
>>>>>         else
>>>>>
>>>>> ---
>>>>>
>>>>> I believe this will brings more user-space context switches, I just
>>>>> do not know how this will impact the system (only for curiosity :).
>>>>
>>>> Yes, this doesn't look good from the performance POV. Given that most
>>>> EOIs of the APICs will not trigger a message to an IOAPIC and userspace
>>>> exits are expensive, that should be measurable.
>>>>
>>>> But you should optimize route updating a bit: I noticed real delays
>>>> (almost a second) when reprogramming all IOAPIC pins during Jailhouse
>>>> handover. That's because of the heavyweight synchronize_srcu we have on
>>>> each route change. Even if that is an extreme case, you should try to
>>>> reduce route updates to only those that were actually affected by an
>>>> IRTE change.
>>>
>>> Yes, this solution will possibly need one more IOMMU API besides
>>> int_remap(), for IOAPIC to know whether specific IOAPIC entry needs
>>> to be updated (IOAPIC needs to know the index information of each
>>> IOAPIC entry, and see whether it is matched with IEC notified index
>>> and mask, in ioapic_iec_notifier()).
>>>
>>> Another problem that may cause this issue is that: now we are
>>> committing route information every time when updating route entries
>>> (please check kvm_update_routing_entry()). IMHO this is not
>>> necessary - we will need 25 times of committing for each IEC
>>> invalication, while actually we only need one at the end.
>>>
>>> So... We'll have another solution for this: how about not committing
>>> for each update, and just do one commit after all changes settled?
>>> This seems to be much cleaner, and easier comparing to the above
>>> proposal.
>>
>> If you see a number of invalidation requests queued in a row, you may
>> also process them atomically from the guest POV, which includes a single
>> route update at the end, before the guest gets told that everything was
>> done.
> 
> Yes, this would be better as a 3rd step after the above two. If you
> would not mind, I'll noting this down as well with the 1st proposal
> (considering that this will need more code changes, not only in QI
> logic, also in IEC notification path), and will only contain the 2nd
> change (explicit route commit) in v6, to partially solve the problem
> for now.


Ack.

Jan

Re: [Qemu-devel] [PATCH v5 00/18] IOMMU: Enable interrupt remapping for Intel IOMMU

Reply via email to