On Fri, 2008-06-20 at 14:23 +0800, Han, Weidong wrote:
> Ben-Ami Yassour1 wrote:
> > "Han, Weidong" <[EMAIL PROTECTED]> wrote on 19/06/2008 17:18:00:
> >
> >> Ben-Ami Yassour wrote:
> >>> On Thu, 2008-06-19 at 16:59 +0800, Han, Weidong wrote:
> >>>> [EMAIL PROTECTED] wrote:
> >>>>> From: Ben-Ami Yassour <[EMAIL PROTECTED]>
> >>>>>
> >>>>> When changing the VT-d context mapping, according to the spec, it
> >>>>> is required to first set the context to not present, flush and
> >>>>> only then apply the new context.
> >>>>>
> >>>>> Signed-off-by: Ben-Ami Yassour <[EMAIL PROTECTED]>
> >>>>> ---
> >>>>> drivers/pci/intel-iommu.c | 17 +++++++++++++++++
> >>>>> 1 files changed, 17 insertions(+), 0 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> >>>>> index 930874f..dcdfa97 100644 --- a/drivers/pci/intel-iommu.c
> >>>>> +++ b/drivers/pci/intel-iommu.c
> >>>>> @@ -56,6 +56,7 @@
> >>>>>
> >>>>>
> >>>>> static void flush_unmaps_timeout(unsigned long data);
> >>>>> +static void detach_domain_for_dev(struct dmar_domain *domain, u8
> >>>>> bus, u8 devfn);
> >>>>>
> >>>>> DEFINE_TIMER(unmap_timer, flush_unmaps_timeout, 0, 0);
> >>>>>
> >>>>> @@ -1264,7 +1265,23 @@ static int
> >>>>> domain_context_mapping_one(struct dmar_domain *domain, if
> >>>>> (!context) return -ENOMEM;
> >>>>> spin_lock_irqsave(&iommu->lock, flags);
> >>>>> +
> >>>>> + if (context_present(*context) &&
> >>>>> + (context_domain_id(*context) == domain->id) &&
> >>>>> + (context_address_width(*context) == domain->agaw) &&
> >>>>> + (context_address_root(*context) ==
> >>>>> virt_to_phys(domain->pgd)) && +
> >>>>> (context_translation_type(*context) == CONTEXT_TT_MULTI_LEVEL) &&
> >>>>> + (!context_fault_disable(*context))) {
> >>>>> + spin_unlock_irqrestore(&iommu->lock, flags); + return
> >>>>> 0; + }
> >>>>
> >>>> Only need to check context_present(*context) according to VT-d
> >>>> spec, which says "software must not modify fields other than the
> >>>> Present (P) field of currently present root-entries or
> >>>> context-entries."
> >>>>
> >>>> Randy (Weidong)
> >>>
> >>> The logic here is that, if no change is made to the context then
> >>> just return ok (0). Otherwise, according to the spec, we need to
> >>> first invalidate the context, flush it, and only then apply the
> >>> changes to the context.
> >>>
> >>> The other option is to make sure that before every call to this
> >>> function the context is invalidated, but disabling it inside the
> >>> function seems safer. do you agree with that?
> >>>
> >>
> >> After a device can be assigned to guest with VT-d, it needs a
> >> context unmap function. When a device is assigned to a guest, map
> >> context for it, while when it is detached from a guest, should unmap
> >> its context. With the context unmap function, I think we needn't to
> >> implement your logic in domain_context_mapping_one(), instead just
> >> check its present. What's your opinion?
> >
> > Sure, that's fine, this is the other option I mentioned.
> > But we need to add the context unmap function.
> > Something like:
> > diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
> > index be775cd..4b96fbb 100644
> > --- a/arch/x86/kvm/vtd.c
> > +++ b/arch/x86/kvm/vtd.c
> > @@ -109,11 +109,17 @@ found:
> > kvm_iommu_unmap_memslots(kvm);
> > return -EFAULT;
> > }
> > +
> > + kvm_intel_iommu_detach_dev(kvm->arch.domain,
> > + pdev->bus->number, pdev->devfn);
> > +
> > kvm_intel_iommu_context_mapping(kvm->arch.domain, pdev);
> > return 0;
> > }
> > EXPORT_SYMBOL_GPL(kvm_iommu_map_guest);
> >
> > Agree?
> >
>
> I don't think it's necessary to add kvm_intel_iommu_detach_dev() here.
> If a device can be assigned to a guest, it should not be used by other
> guest (assuming no hotplug support). And also
> kvm_intel_iommu_detach_dev() is already called in
> kvm_iommu_unmap_guest(). Normally context won't be present here.
> Otherwise, there should be some wrong. I attach a patch to change it
> back to original kernel VT-d code. I think it's correct and clean.
The problem is with the following scenario:
1. load the host NIC driver
2. unload the host NIC driver
3. start kvm with passthrough for that NIC
In this case the context is not cleaned, and we get VT-d failures.
I agree with changing the VT-d driver code back to original.
But we do need:
diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
index be775cd..4b96fbb 100644
--- a/arch/x86/kvm/vtd.c
+++ b/arch/x86/kvm/vtd.c
@@ -109,11 +109,17 @@ found:
kvm_iommu_unmap_memslots(kvm);
return -EFAULT;
}
+
+ kvm_intel_iommu_detach_dev(kvm->arch.domain,
+ pdev->bus->number, pdev->devfn);
+
kvm_intel_iommu_context_mapping(kvm->arch.domain, pdev);
return 0;
}
EXPORT_SYMBOL_GPL(kvm_iommu_map_guest);
Otherwise kvm_intel_iommu_context_mapping will exit on the
context_present(*context) check.
This means that the context is still going to point to the iova of the host NIC.
If you do not agree, please explain what code path clears the context written
for the host NIC driver in this scenario
(note that empirically I see that my assumption is correct).
Thanks,
Ben
>
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index 930874f..04517a2 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -1264,8 +1264,11 @@ static int domain_context_mapping_one(struct
> dmar_domain *domain,
> if (!context)
> return -ENOMEM;
> spin_lock_irqsave(&iommu->lock, flags);
> + if (context_present(*context)) {
> + spin_unlock_irqrestore(&iommu->lock, flags);
> + return 0;
> + }
>
> - context_clear_entry(*context);
> context_set_domain_id(*context, domain->id);
> context_set_address_width(*context, domain->agaw);
> context_set_address_root(*context, virt_to_phys(domain->pgd));
>
>
> Randy (Weidong)
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html