Ben-Ami Yassour wrote:
> On Fri, 2008-06-20 at 14:23 +0800, Han, Weidong wrote:
>> Ben-Ami Yassour1 wrote:
>>> "Han, Weidong" <[EMAIL PROTECTED]> wrote on 19/06/2008 17:18:00:
>>>
>>>> Ben-Ami Yassour wrote:
>>>>> On Thu, 2008-06-19 at 16:59 +0800, Han, Weidong wrote:
>>>>>> [EMAIL PROTECTED] wrote:
>>>>>>> From: Ben-Ami Yassour <[EMAIL PROTECTED]>
>>>>>>>
>>>>>>> When changing the VT-d context mapping, according to the spec,
>>>>>>> it is required to first set the context to not present, flush
>>>>>>> and only then apply the new context.
>>>>>>>
>>>>>>> Signed-off-by: Ben-Ami Yassour <[EMAIL PROTECTED]>
>>>>>>> ---
>>>>>>> drivers/pci/intel-iommu.c | 17 +++++++++++++++++
>>>>>>> 1 files changed, 17 insertions(+), 0 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/pci/intel-iommu.c
>>>>>>> b/drivers/pci/intel-iommu.c index 930874f..dcdfa97 100644 ---
>>>>>>> a/drivers/pci/intel-iommu.c +++ b/drivers/pci/intel-iommu.c
>>>>>>> @@ -56,6 +56,7 @@
>>>>>>>
>>>>>>>
>>>>>>> static void flush_unmaps_timeout(unsigned long data);
>>>>>>> +static void detach_domain_for_dev(struct dmar_domain *domain,
>>>>>>> u8 bus, u8 devfn);
>>>>>>>
>>>>>>> DEFINE_TIMER(unmap_timer, flush_unmaps_timeout, 0, 0);
>>>>>>>
>>>>>>> @@ -1264,7 +1265,23 @@ static int
>>>>>>> domain_context_mapping_one(struct dmar_domain *domain, if
>>>>>>> (!context) return -ENOMEM;
>>>>>>> spin_lock_irqsave(&iommu->lock, flags);
>>>>>>> +
>>>>>>> + if (context_present(*context) &&
>>>>>>> + (context_domain_id(*context) == domain->id) &&
>>>>>>> + (context_address_width(*context) == domain->agaw) &&
>>>>>>> + (context_address_root(*context) ==
>>>>>>> virt_to_phys(domain->pgd)) && +
>>>>>>> (context_translation_type(*context) == CONTEXT_TT_MULTI_LEVEL)
>>>>>>> && + (!context_fault_disable(*context))) {
>>>>>>> + spin_unlock_irqrestore(&iommu->lock, flags); +
>>>>>>> return 0; + }
>>>>>>
>>>>>> Only need to check context_present(*context) according to VT-d
>>>>>> spec, which says "software must not modify fields other than the
>>>>>> Present (P) field of currently present root-entries or
>>>>>> context-entries."
>>>>>>
>>>>>> Randy (Weidong)
>>>>>
>>>>> The logic here is that, if no change is made to the context then
>>>>> just return ok (0). Otherwise, according to the spec, we need to
>>>>> first invalidate the context, flush it, and only then apply the
>>>>> changes to the context.
>>>>>
>>>>> The other option is to make sure that before every call to this
>>>>> function the context is invalidated, but disabling it inside the
>>>>> function seems safer. do you agree with that?
>>>>>
>>>>
>>>> After a device can be assigned to guest with VT-d, it needs a
>>>> context unmap function. When a device is assigned to a guest, map
>>>> context for it, while when it is detached from a guest, should
>>>> unmap its context. With the context unmap function, I think we
>>>> needn't to implement your logic in domain_context_mapping_one(),
>>>> instead just check its present. What's your opinion?
>>>
>>> Sure, that's fine, this is the other option I mentioned.
>>> But we need to add the context unmap function.
>>> Something like:
>>> diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
>>> index be775cd..4b96fbb 100644
>>> --- a/arch/x86/kvm/vtd.c
>>> +++ b/arch/x86/kvm/vtd.c
>>> @@ -109,11 +109,17 @@ found:
>>> kvm_iommu_unmap_memslots(kvm);
>>> return -EFAULT;
>>> }
>>> +
>>> + kvm_intel_iommu_detach_dev(kvm->arch.domain,
>>> + pdev->bus->number, pdev->devfn);
>>> + kvm_intel_iommu_context_mapping(kvm->arch.domain, pdev);
>>> return 0; }
>>> EXPORT_SYMBOL_GPL(kvm_iommu_map_guest);
>>>
>>> Agree?
>>>
>>
>> I don't think it's necessary to add kvm_intel_iommu_detach_dev()
>> here. If a device can be assigned to a guest, it should not be used
>> by other guest (assuming no hotplug support). And also
>> kvm_intel_iommu_detach_dev() is already called in
>> kvm_iommu_unmap_guest(). Normally context won't be present here.
>> Otherwise, there should be some wrong. I attach a patch to change it
>> back to original kernel VT-d code. I think it's correct and clean.
>
> The problem is with the following scenario:
> 1. load the host NIC driver
> 2. unload the host NIC driver
> 3. start kvm with passthrough for that NIC
>
> In this case the context is not cleaned, and we get VT-d failures.
> I agree with changing the VT-d driver code back to original.
> But we do need:
>
> diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
> index be775cd..4b96fbb 100644
> --- a/arch/x86/kvm/vtd.c
> +++ b/arch/x86/kvm/vtd.c
> @@ -109,11 +109,17 @@ found:
> kvm_iommu_unmap_memslots(kvm);
> return -EFAULT;
> }
> +
> + kvm_intel_iommu_detach_dev(kvm->arch.domain,
> + pdev->bus->number, pdev->devfn);
> +
> kvm_intel_iommu_context_mapping(kvm->arch.domain, pdev);
> return 0;
> }
> EXPORT_SYMBOL_GPL(kvm_iommu_map_guest);
>
> Otherwise kvm_intel_iommu_context_mapping will exit on the
> context_present(*context) check.
> This means that the context is still going to point to the iova of
> the host NIC.
> If you do not agree, please explain what code path clears the context
> written for the host NIC driver in this scenario (note that
> empirically I see that my assumption is correct).
>
Yes, in this load/unload case, kvm_intel_iommu_detach_dev() is necessary,
because the device will be mapped in host VT-d page table when its driver is
loaded, but it is not unmapped when the driver is unloaded.
Randy (Weidong)