Re: [PATCH v12 4/8] iommu/vt-d: Add bind guest PASID support

2020-04-29 Thread Jacob Pan
On Wed, 29 Apr 2020 16:12:01 +0200
Auger Eric  wrote:

> >> in last review Eric raised the open about what about binding the
> >> same PASID to the same pdev multiple times. We discussed that
> >> should be disallowed. Here can you check whether aux_domain is
> >> enabled on pdev to restrict multiple-binding only for
> >> sub-devices?  
> > Why aux_domain is sufficient? A pdev could have aux_domain enabled
> > but still bind pdev many times more than its mdevs.
> > 
> > Either we allow multiple bind or not.  
> 
> I tried to figure out whether binding the same PASID to the same pdev
> was meaningful. I understood it is not. If this case can be detected
> at VFIO level I am fine as well.
I will remove the multiple bind support for now. Reintroduce it when we
enable mdev.

Thanks,

Jacob
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v12 4/8] iommu/vt-d: Add bind guest PASID support

2020-04-29 Thread Auger Eric
Hi,

On 4/27/20 10:34 PM, Jacob Pan wrote:
> On Fri, 24 Apr 2020 10:47:45 +
> "Tian, Kevin"  wrote:
> 
>>> From: Jacob Pan 
>>> Sent: Wednesday, April 22, 2020 2:53 AM
>>>
>>> When supporting guest SVA with emulated IOMMU, the guest PASID
>>> table is shadowed in VMM. Updates to guest vIOMMU PASID table
>>> will result in PASID cache flush which will be passed down to
>>> the host as bind guest PASID calls.  
>>
>> Above description is not accurate. Guest PASID table updates don't
>> 'result in' PASID cache flush automatically. What about:
>> --
>> The guest needs to invalidate the PASID cache for any update to
>> guest PASID table. Those invalidation requests are intercepted
>> by the VMM and passed down to the host as binding guest PASID
>> calls.
>> --
> It is good to add more details, thanks.
> 
>>>
>>> For the SL page tables, it will be harvested from device's
>>> default domain (request w/o PASID), or aux domain in case of
>>> mediated device.
>>>
>>> .-.  .---.
>>> |   vIOMMU|  | Guest process CR3, FL only|
>>> | |  '---'
>>> ./
>>> | PASID Entry |--- PASID cache flush -
>>> '-'   |
>>> | |   V
>>> | |CR3 in GPA
>>> '-'
>>> Guest
>>> --| Shadow |--|
>>>   vv  v
>>> Host
>>> .-.  .--.
>>> |   pIOMMU|  | Bind FL for GVA-GPA  |
>>> | |  '--'
>>> ./  |
>>> | PASID Entry | V (Nested xlate)
>>> '\.--.
>>> | |   |SL for GPA-HPA, default domain|
>>> | |   '--'
>>> '-'
>>> Where:
>>>  - FL = First level/stage one page tables
>>>  - SL = Second level/stage two page tables
>>>
>>> Signed-off-by: Jacob Pan 
>>> Signed-off-by: Liu, Yi L 
>>> ---
>>>  drivers/iommu/intel-iommu.c |   4 +
>>>  drivers/iommu/intel-svm.c   | 204
>>> 
>>>  include/linux/intel-iommu.h |   8 +-
>>>  include/linux/intel-svm.h   |  17 
>>>  4 files changed, 232 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/intel-iommu.c
>>> b/drivers/iommu/intel-iommu.c index 9c01e391a931..8862d6b0ef21
>>> 100644 --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -6179,6 +6179,10 @@ const struct iommu_ops intel_iommu_ops = {
>>> .dev_disable_feat   = intel_iommu_dev_disable_feat,
>>> .is_attach_deferred =
>>> intel_iommu_is_attach_deferred, .pgsize_bitmap  =
>>> INTEL_IOMMU_PGSIZES, +#ifdef CONFIG_INTEL_IOMMU_SVM
>>> +   .sva_bind_gpasid= intel_svm_bind_gpasid,
>>> +   .sva_unbind_gpasid  = intel_svm_unbind_gpasid,
>>> +#endif
>>>  };
>>>
>>>  static void quirk_iommu_igfx(struct pci_dev *dev)
>>> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
>>> index 2998418f0a38..69b2070b843d 100644
>>> --- a/drivers/iommu/intel-svm.c
>>> +++ b/drivers/iommu/intel-svm.c
>>> @@ -226,6 +226,210 @@ static LIST_HEAD(global_svm_list);
>>> list_for_each_entry((sdev), &(svm)->devs, list) \
>>> if ((d) != (sdev)->dev) {} else
>>>
>>> +static inline void intel_svm_free_if_empty(struct intel_svm *svm,
>>> u64 pasid) +{
>>> +   if (list_empty(&svm->devs)) {
>>> +   ioasid_set_data(pasid, NULL);
>>> +   kfree(svm);
>>> +   }
>>> +}  
>>
>> Do we really need a function form instead of putting the 4 lines
>> directly after the 'out' label?
>>
> it is more readable and good for code sharing.
That's my fault: I suggested to add this helper because I noticed this
was repeated several times in the code. But adding a new goto label to
handle that job is identical.


> 
>>> +
>>> +int intel_svm_bind_gpasid(struct iommu_domain *domain, struct
>>> device *dev,
>>> + struct iommu_gpasid_bind_data *data)
>>> +{
>>> +   struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
>>> +   struct dmar_domain *dmar_domain;
>>> +   struct intel_svm_dev *sdev;
>>> +   struct intel_svm *svm;
>>> +   int ret = 0;
>>> +
>>> +   if (WARN_ON(!iommu) || !data)
>>> +   return -EINVAL;  
>>
>> well, why not checking !dev together?
> This is kernel API, unlike iommu and data caller fills in dev directly.
> 
>>
>>> +
>>> +   if (data->version != IOMMU_GPASID_BIND_VERSION_1 ||
>>> +   data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
>>> +   return -EINVAL;
>>> +
>>> +   if (dev_is_pci(dev)) {
>>> +   /* VT-d supports devices with full 20 bit PASIDs
>>> only */
>>> +   if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX)
>>> +   return -EINVAL;
>>> +   } else {
>>> +   return -ENOTSUPP;
>>> +   }
>>> +
>>> +   /*
>>> +* We

Re: [PATCH v12 4/8] iommu/vt-d: Add bind guest PASID support

2020-04-27 Thread Jacob Pan
On Fri, 24 Apr 2020 10:47:45 +
"Tian, Kevin"  wrote:

> > From: Jacob Pan 
> > Sent: Wednesday, April 22, 2020 2:53 AM
> > 
> > When supporting guest SVA with emulated IOMMU, the guest PASID
> > table is shadowed in VMM. Updates to guest vIOMMU PASID table
> > will result in PASID cache flush which will be passed down to
> > the host as bind guest PASID calls.  
> 
> Above description is not accurate. Guest PASID table updates don't
> 'result in' PASID cache flush automatically. What about:
> --
> The guest needs to invalidate the PASID cache for any update to
> guest PASID table. Those invalidation requests are intercepted
> by the VMM and passed down to the host as binding guest PASID
> calls.
> --
It is good to add more details, thanks.

> > 
> > For the SL page tables, it will be harvested from device's
> > default domain (request w/o PASID), or aux domain in case of
> > mediated device.
> > 
> > .-.  .---.
> > |   vIOMMU|  | Guest process CR3, FL only|
> > | |  '---'
> > ./
> > | PASID Entry |--- PASID cache flush -
> > '-'   |
> > | |   V
> > | |CR3 in GPA
> > '-'
> > Guest
> > --| Shadow |--|
> >   vv  v
> > Host
> > .-.  .--.
> > |   pIOMMU|  | Bind FL for GVA-GPA  |
> > | |  '--'
> > ./  |
> > | PASID Entry | V (Nested xlate)
> > '\.--.
> > | |   |SL for GPA-HPA, default domain|
> > | |   '--'
> > '-'
> > Where:
> >  - FL = First level/stage one page tables
> >  - SL = Second level/stage two page tables
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  drivers/iommu/intel-iommu.c |   4 +
> >  drivers/iommu/intel-svm.c   | 204
> > 
> >  include/linux/intel-iommu.h |   8 +-
> >  include/linux/intel-svm.h   |  17 
> >  4 files changed, 232 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c
> > b/drivers/iommu/intel-iommu.c index 9c01e391a931..8862d6b0ef21
> > 100644 --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -6179,6 +6179,10 @@ const struct iommu_ops intel_iommu_ops = {
> > .dev_disable_feat   = intel_iommu_dev_disable_feat,
> > .is_attach_deferred =
> > intel_iommu_is_attach_deferred, .pgsize_bitmap  =
> > INTEL_IOMMU_PGSIZES, +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +   .sva_bind_gpasid= intel_svm_bind_gpasid,
> > +   .sva_unbind_gpasid  = intel_svm_unbind_gpasid,
> > +#endif
> >  };
> > 
> >  static void quirk_iommu_igfx(struct pci_dev *dev)
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index 2998418f0a38..69b2070b843d 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -226,6 +226,210 @@ static LIST_HEAD(global_svm_list);
> > list_for_each_entry((sdev), &(svm)->devs, list) \
> > if ((d) != (sdev)->dev) {} else
> > 
> > +static inline void intel_svm_free_if_empty(struct intel_svm *svm,
> > u64 pasid) +{
> > +   if (list_empty(&svm->devs)) {
> > +   ioasid_set_data(pasid, NULL);
> > +   kfree(svm);
> > +   }
> > +}  
> 
> Do we really need a function form instead of putting the 4 lines
> directly after the 'out' label?
> 
it is more readable and good for code sharing.

> > +
> > +int intel_svm_bind_gpasid(struct iommu_domain *domain, struct
> > device *dev,
> > + struct iommu_gpasid_bind_data *data)
> > +{
> > +   struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> > +   struct dmar_domain *dmar_domain;
> > +   struct intel_svm_dev *sdev;
> > +   struct intel_svm *svm;
> > +   int ret = 0;
> > +
> > +   if (WARN_ON(!iommu) || !data)
> > +   return -EINVAL;  
> 
> well, why not checking !dev together?
This is kernel API, unlike iommu and data caller fills in dev directly.

> 
> > +
> > +   if (data->version != IOMMU_GPASID_BIND_VERSION_1 ||
> > +   data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
> > +   return -EINVAL;
> > +
> > +   if (dev_is_pci(dev)) {
> > +   /* VT-d supports devices with full 20 bit PASIDs
> > only */
> > +   if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX)
> > +   return -EINVAL;
> > +   } else {
> > +   return -ENOTSUPP;
> > +   }
> > +
> > +   /*
> > +* We only check host PASID range, we have no knowledge to
> > check
> > +* guest PASID range.
> > +*/
> > +   if (data->hpasid <= 0 || data->hpasid >= PASID_MAX)
> > +   return -EINVAL;
> > +
> > +   dmar_domain = to_dmar_domai

RE: [PATCH v12 4/8] iommu/vt-d: Add bind guest PASID support

2020-04-24 Thread Tian, Kevin
> From: Jacob Pan 
> Sent: Wednesday, April 22, 2020 2:53 AM
> 
> When supporting guest SVA with emulated IOMMU, the guest PASID
> table is shadowed in VMM. Updates to guest vIOMMU PASID table
> will result in PASID cache flush which will be passed down to
> the host as bind guest PASID calls.

Above description is not accurate. Guest PASID table updates don't
'result in' PASID cache flush automatically. What about:
--
The guest needs to invalidate the PASID cache for any update to
guest PASID table. Those invalidation requests are intercepted
by the VMM and passed down to the host as binding guest PASID
calls.
--
> 
> For the SL page tables, it will be harvested from device's
> default domain (request w/o PASID), or aux domain in case of
> mediated device.
> 
> .-.  .---.
> |   vIOMMU|  | Guest process CR3, FL only|
> | |  '---'
> ./
> | PASID Entry |--- PASID cache flush -
> '-'   |
> | |   V
> | |CR3 in GPA
> '-'
> Guest
> --| Shadow |--|
>   vv  v
> Host
> .-.  .--.
> |   pIOMMU|  | Bind FL for GVA-GPA  |
> | |  '--'
> ./  |
> | PASID Entry | V (Nested xlate)
> '\.--.
> | |   |SL for GPA-HPA, default domain|
> | |   '--'
> '-'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu, Yi L 
> ---
>  drivers/iommu/intel-iommu.c |   4 +
>  drivers/iommu/intel-svm.c   | 204
> 
>  include/linux/intel-iommu.h |   8 +-
>  include/linux/intel-svm.h   |  17 
>  4 files changed, 232 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 9c01e391a931..8862d6b0ef21 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -6179,6 +6179,10 @@ const struct iommu_ops intel_iommu_ops = {
>   .dev_disable_feat   = intel_iommu_dev_disable_feat,
>   .is_attach_deferred = intel_iommu_is_attach_deferred,
>   .pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> + .sva_bind_gpasid= intel_svm_bind_gpasid,
> + .sva_unbind_gpasid  = intel_svm_unbind_gpasid,
> +#endif
>  };
> 
>  static void quirk_iommu_igfx(struct pci_dev *dev)
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 2998418f0a38..69b2070b843d 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -226,6 +226,210 @@ static LIST_HEAD(global_svm_list);
>   list_for_each_entry((sdev), &(svm)->devs, list) \
>   if ((d) != (sdev)->dev) {} else
> 
> +static inline void intel_svm_free_if_empty(struct intel_svm *svm, u64 pasid)
> +{
> + if (list_empty(&svm->devs)) {
> + ioasid_set_data(pasid, NULL);
> + kfree(svm);
> + }
> +}

Do we really need a function form instead of putting the 4 lines directly 
after the 'out' label?

> +
> +int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device
> *dev,
> +   struct iommu_gpasid_bind_data *data)
> +{
> + struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> + struct dmar_domain *dmar_domain;
> + struct intel_svm_dev *sdev;
> + struct intel_svm *svm;
> + int ret = 0;
> +
> + if (WARN_ON(!iommu) || !data)
> + return -EINVAL;

well, why not checking !dev together?

> +
> + if (data->version != IOMMU_GPASID_BIND_VERSION_1 ||
> + data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
> + return -EINVAL;
> +
> + if (dev_is_pci(dev)) {
> + /* VT-d supports devices with full 20 bit PASIDs only */
> + if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX)
> + return -EINVAL;
> + } else {
> + return -ENOTSUPP;
> + }
> +
> + /*
> +  * We only check host PASID range, we have no knowledge to check
> +  * guest PASID range.
> +  */
> + if (data->hpasid <= 0 || data->hpasid >= PASID_MAX)
> + return -EINVAL;
> +
> + dmar_domain = to_dmar_domain(domain);
> +
> + mutex_lock(&pasid_mutex);
> + svm = ioasid_find(NULL, data->hpasid, NULL);
> + if (IS_ERR(svm)) {
> + ret = PTR_ERR(svm);
> + goto out;
> + }
> +
> + if (svm) {
> + /*
> +  * If we found svm for the PASID, there must be at
> +  * least one device bond, otherwise svm should be freed.
> +  */
> +