RE: [PATCH v3 02/14] iommu: Report domain nesting info

2020-06-26 Thread Liu, Yi L
Hi Robin,

> From: Robin Murphy 
> Sent: Saturday, June 27, 2020 12:05 AM
> 
> On 2020-06-26 08:47, Jean-Philippe Brucker wrote:
> > On Wed, Jun 24, 2020 at 01:55:15AM -0700, Liu Yi L wrote:
> >> IOMMUs that support nesting translation needs report the capability
> >> info to userspace, e.g. the format of first level/stage paging structures.
> >>
> >> This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can
> >> get nesting info after setting DOMAIN_ATTR_NESTING.
> >>
> >> v2 -> v3:
> >> *) remvoe cap/ecap_mask in iommu_nesting_info.
> >> *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> >> *) return an empty iommu_nesting_info for SMMU drivers per Jean'
> >> suggestion.
> >>
> >> Cc: Kevin Tian 
> >> CC: Jacob Pan 
> >> Cc: Alex Williamson 
> >> Cc: Eric Auger 
> >> Cc: Jean-Philippe Brucker 
> >> Cc: Joerg Roedel 
> >> Cc: Lu Baolu 
> >> Signed-off-by: Liu Yi L 
> >> Signed-off-by: Jacob Pan 
> >> ---
> >>   drivers/iommu/arm-smmu-v3.c | 29 --
> >>   drivers/iommu/arm-smmu.c| 29 --
> >
> > Looks reasonable to me. Please move the SMMU changes to a separate
> > patch and Cc the SMMU maintainers:
> 
> Cheers Jean, I'll admit I've been skipping over a lot of these patches lately 
> :)
> 
> A couple of comments below...
> 
> >
> > Cc: Will Deacon 
> > Cc: Robin Murphy 
> >
> > Thanks,
> > Jean
> >
> >>   include/uapi/linux/iommu.h  | 59
> +
> >>   3 files changed, 113 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/iommu/arm-smmu-v3.c
> >> b/drivers/iommu/arm-smmu-v3.c index f578677..0c45d4d 100644
> >> --- a/drivers/iommu/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm-smmu-v3.c
> >> @@ -3019,6 +3019,32 @@ static struct iommu_group
> *arm_smmu_device_group(struct device *dev)
> >>return group;
> >>   }
> >>
> >> +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain
> *smmu_domain,
> >> +  void *data)
> >> +{
> >> +  struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
> >> +  u32 size;
> >> +
> >> +  if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> >> +  return -ENODEV;
> >> +
> >> +  size = sizeof(struct iommu_nesting_info);
> >> +
> >> +  /*
> >> +   * if provided buffer size is not equal to the size, should
> >> +   * return 0 and also the expected buffer size to caller.
> >> +   */
> >> +  if (info->size != size) {
> >> +  info->size = size;
> >> +  return 0;
> >> +  }
> >> +
> >> +  /* report an empty iommu_nesting_info for now */
> >> +  memset(info, 0x0, size);
> >> +  info->size = size;
> >> +  return 0;
> >> +}
> >> +
> >>   static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> >>enum iommu_attr attr, void *data)
> >>   {
> >> @@ -3028,8 +3054,7 @@ static int arm_smmu_domain_get_attr(struct
> iommu_domain *domain,
> >>case IOMMU_DOMAIN_UNMANAGED:
> >>switch (attr) {
> >>case DOMAIN_ATTR_NESTING:
> >> -  *(int *)data = (smmu_domain->stage ==
> ARM_SMMU_DOMAIN_NESTED);
> >> -  return 0;
> >> +  return arm_smmu_domain_nesting_info(smmu_domain,
> data);
> >>default:
> >>return -ENODEV;
> >>}
> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >> index 243bc4c..908607d 100644
> >> --- a/drivers/iommu/arm-smmu.c
> >> +++ b/drivers/iommu/arm-smmu.c
> >> @@ -1506,6 +1506,32 @@ static struct iommu_group
> *arm_smmu_device_group(struct device *dev)
> >>return group;
> >>   }
> >>
> >> +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain
> *smmu_domain,
> >> +  void *data)
> >> +{
> >> +  struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
> >> +  u32 size;
> >> +
> >> +  if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> >> +  return -ENODEV;
> >> +
> >> +  size = sizeof(struct iommu_nesting_info);
> >> +
> >> +  /*
> >> +   * if provided buffer size is not equal to the size, should
> >> +   * return 0 and also the expected buffer size to caller.
> >> +   */
> >> +  if (info->size != size) {
> >> +  info->size = size;
> >> +  return 0;
> >> +  }
> >> +
> >> +  /* report an empty iommu_nesting_info for now */
> >> +  memset(info, 0x0, size);
> >> +  info->size = size;
> >> +  return 0;
> >> +}
> >> +
> >>   static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> >>enum iommu_attr attr, void *data)
> >>   {
> >> @@ -1515,8 +1541,7 @@ static int arm_smmu_domain_get_attr(struct
> iommu_domain *domain,
> >>case IOMMU_DOMAIN_UNMANAGED:
> >>switch (attr) {
> >>case DOMAIN_ATTR_NESTING:
> >> -  *(int *)data = (smmu_domain->stage ==
> ARM_SMMU_DOMAIN_NESTED);
> >> -  return 0;
> >> +  return arm_smmu_domain_nesting_

RE: [PATCH v3 02/14] iommu: Report domain nesting info

2020-06-26 Thread Liu, Yi L
> From: Jean-Philippe Brucker 
> Sent: Friday, June 26, 2020 3:48 PM
> 
> On Wed, Jun 24, 2020 at 01:55:15AM -0700, Liu Yi L wrote:
> > IOMMUs that support nesting translation needs report the capability
> > info to userspace, e.g. the format of first level/stage paging structures.
> >
> > This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
> > nesting info after setting DOMAIN_ATTR_NESTING.
> >
> > v2 -> v3:
> > *) remvoe cap/ecap_mask in iommu_nesting_info.
> > *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> > *) return an empty iommu_nesting_info for SMMU drivers per Jean'
> >suggestion.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/arm-smmu-v3.c | 29 --
> >  drivers/iommu/arm-smmu.c| 29 --
> 
> Looks reasonable to me. Please move the SMMU changes to a separate patch
> and Cc the SMMU maintainers:
> 
> Cc: Will Deacon 
> Cc: Robin Murphy 

got you. will do it.

Regards,
Yi Liu

> Thanks,
> Jean
> 
> >  include/uapi/linux/iommu.h  | 59
> > +
> >  3 files changed, 113 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > index f578677..0c45d4d 100644
> > --- a/drivers/iommu/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm-smmu-v3.c
> > @@ -3019,6 +3019,32 @@ static struct iommu_group
> *arm_smmu_device_group(struct device *dev)
> > return group;
> >  }
> >
> > +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain
> *smmu_domain,
> > +   void *data)
> > +{
> > +   struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
> > +   u32 size;
> > +
> > +   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> > +   return -ENODEV;
> > +
> > +   size = sizeof(struct iommu_nesting_info);
> > +
> > +   /*
> > +* if provided buffer size is not equal to the size, should
> > +* return 0 and also the expected buffer size to caller.
> > +*/
> > +   if (info->size != size) {
> > +   info->size = size;
> > +   return 0;
> > +   }
> > +
> > +   /* report an empty iommu_nesting_info for now */
> > +   memset(info, 0x0, size);
> > +   info->size = size;
> > +   return 0;
> > +}
> > +
> >  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > enum iommu_attr attr, void *data)  { @@ -
> 3028,8 +3054,7 @@
> > static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > case IOMMU_DOMAIN_UNMANAGED:
> > switch (attr) {
> > case DOMAIN_ATTR_NESTING:
> > -   *(int *)data = (smmu_domain->stage ==
> ARM_SMMU_DOMAIN_NESTED);
> > -   return 0;
> > +   return arm_smmu_domain_nesting_info(smmu_domain,
> data);
> > default:
> > return -ENODEV;
> > }
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index
> > 243bc4c..908607d 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -1506,6 +1506,32 @@ static struct iommu_group
> *arm_smmu_device_group(struct device *dev)
> > return group;
> >  }
> >
> > +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain
> *smmu_domain,
> > +   void *data)
> > +{
> > +   struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
> > +   u32 size;
> > +
> > +   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> > +   return -ENODEV;
> > +
> > +   size = sizeof(struct iommu_nesting_info);
> > +
> > +   /*
> > +* if provided buffer size is not equal to the size, should
> > +* return 0 and also the expected buffer size to caller.
> > +*/
> > +   if (info->size != size) {
> > +   info->size = size;
> > +   return 0;
> > +   }
> > +
> > +   /* report an empty iommu_nesting_info for now */
> > +   memset(info, 0x0, size);
> > +   info->size = size;
> > +   return 0;
> > +}
> > +
> >  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > enum iommu_attr attr, void *data)  { @@ -
> 1515,8 +1541,7 @@
> > static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > case IOMMU_DOMAIN_UNMANAGED:
> > switch (attr) {
> > case DOMAIN_ATTR_NESTING:
> > -   *(int *)data = (smmu_domain->stage ==
> ARM_SMMU_DOMAIN_NESTED);
> > -   return 0;
> > +   return arm_smmu_domain_nesting_info(smmu_domain,
> data);
> > default:
> > return -ENODEV;
> > }
> > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > index 1afc661..898c99a 100644
> > --- a/include/uapi/lin

Re: [PATCH 2/2] pci: Add parameter to disable attaching untrusted devices

2020-06-26 Thread Greg Kroah-Hartman
On Fri, Jun 26, 2020 at 11:53:34AM -0700, Rajat Jain wrote:
> a) I think what was decided was introducing a device core "location"
> property that can be exposed to userspace to help it to decide whether
> or not to attach a driver to a device. Yes, that is still the plan.

Great, but this patch ignores that and starts to add policy :(

> (Mild sidenote: userspace may not need to distinguish between internal
> and external devices if it can assume that no internal PCI devices
> will show up after "echo 0 > /sys/bus/pci/drivers_autoprobe". But
> nevertheless...)

It can not assume that.

> b) Note that even with (a) in place, we still need a parameter that
> can ensure that drivers are not bound to external devices at boot,
> *before* userspace gets a chance to disable "drivers_autoprobe".

Why do you think you need that?  I kind of doubt you really want this,
but ick, if you really do, make it a policy decision that you bake into
the kernel as a build option, so that no one else has to use it :)

> https://lkml.org/lkml/2020/6/15/1453

Ick, please use lore.kernel.org, we don't control lkml.org and it's not
all that reliable.

> Is it OK to add such a parameter in device core?

You don't have internal/external/wherever in the driver core yet, so
don't start adding policy before you get that...

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] vfio/type1: Update group->domain after aux attach and detach

2020-06-26 Thread Lu Baolu
Update group->domain whenever an aux-domain is attached to or detached
from a mediated device. Without this change, iommu_get_domain_for_dev()
will be broken for mdev devices.

Fixes: 7bd50f0cd2fd5 ("vfio/type1: Add domain at(de)taching group helpers")
Signed-off-by: Lu Baolu 
---
 drivers/vfio/vfio_iommu_type1.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5e556ac9102a..e0d8802ce0c9 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1634,10 +1634,28 @@ static int vfio_mdev_attach_domain(struct device *dev, 
void *data)
 
iommu_device = vfio_mdev_get_iommu_device(dev);
if (iommu_device) {
-   if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
-   return iommu_aux_attach_device(domain, iommu_device);
-   else
+   if (iommu_dev_feature_enabled(iommu_device,
+ IOMMU_DEV_FEAT_AUX)) {
+   struct iommu_group *group = iommu_group_get(dev);
+   int ret;
+
+   if (!group)
+   return -EINVAL;
+
+   if (iommu_group_get_domain(group)) {
+   iommu_group_put(group);
+   return -EBUSY;
+   }
+
+   ret = iommu_aux_attach_device(domain, iommu_device);
+   if (!ret)
+   iommu_group_set_domain(group, domain);
+
+   iommu_group_put(group);
+   return ret;
+   } else {
return iommu_attach_device(domain, iommu_device);
+   }
}
 
return -EINVAL;
@@ -1650,10 +1668,19 @@ static int vfio_mdev_detach_domain(struct device *dev, 
void *data)
 
iommu_device = vfio_mdev_get_iommu_device(dev);
if (iommu_device) {
-   if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
+   if (iommu_dev_feature_enabled(iommu_device,
+ IOMMU_DEV_FEAT_AUX)) {
+   struct iommu_group *group;
+
iommu_aux_detach_device(domain, iommu_device);
-   else
+   group = iommu_group_get(dev);
+   if (group) {
+   iommu_group_set_domain(group, NULL);
+   iommu_group_put(group);
+   }
+   } else {
iommu_detach_device(domain, iommu_device);
+   }
}
 
return 0;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] iommu: Add iommu_group_get/set_domain()

2020-06-26 Thread Lu Baolu
The hardware assistant vfio mediated device is a use case of iommu
aux-domain. The interactions between vfio/mdev and iommu during mdev
creation and passthr are:

- Create a group for mdev with iommu_group_alloc();
- Add the device to the group with
group = iommu_group_alloc();
if (IS_ERR(group))
return PTR_ERR(group);

ret = iommu_group_add_device(group, &mdev->dev);
if (!ret)
dev_info(&mdev->dev, "MDEV: group_id = %d\n",
 iommu_group_id(group));
- Allocate an aux-domain
iommu_domain_alloc()
- Attach the aux-domain to the physical device from which the mdev is
  created.
iommu_aux_attach_device()

In the whole process, an iommu group was allocated for the mdev and an
iommu domain was attached to the group, but the group->domain leaves
NULL. As the result, iommu_get_domain_for_dev() doesn't work anymore.

This adds iommu_group_get/set_domain() so that group->domain could be
managed whenever a domain is attached or detached through the aux-domain
api's.

Fixes: 7bd50f0cd2fd5 ("vfio/type1: Add domain at(de)taching group helpers")
Signed-off-by: Lu Baolu 
---
 drivers/iommu/iommu.c | 28 
 include/linux/iommu.h | 14 ++
 2 files changed, 42 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d43120eb1dc5..e2b665303d70 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -715,6 +715,34 @@ int iommu_group_set_name(struct iommu_group *group, const 
char *name)
 }
 EXPORT_SYMBOL_GPL(iommu_group_set_name);
 
+/**
+ * iommu_group_get_domain - get domain of a group
+ * @group: the group
+ *
+ * This is called to get the domain of a group.
+ */
+struct iommu_domain *iommu_group_get_domain(struct iommu_group *group)
+{
+   return group->domain;
+}
+EXPORT_SYMBOL_GPL(iommu_group_get_domain);
+
+/**
+ * iommu_group_set_domain - set domain for a group
+ * @group: the group
+ * @domain: iommu domain
+ *
+ * This is called to set the domain for a group. In aux-domain case, a domain
+ * might attach or detach to an iommu group through the aux-domain apis, but
+ * the group->domain doesn't get a chance to be updated there.
+ */
+void iommu_group_set_domain(struct iommu_group *group,
+   struct iommu_domain *domain)
+{
+   group->domain = domain;
+}
+EXPORT_SYMBOL_GPL(iommu_group_set_domain);
+
 static int iommu_create_device_direct_mappings(struct iommu_group *group,
   struct device *dev)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5f0b7859d2eb..ff88d548a870 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -496,6 +496,9 @@ extern void iommu_group_set_iommudata(struct iommu_group 
*group,
  void *iommu_data,
  void (*release)(void *iommu_data));
 extern int iommu_group_set_name(struct iommu_group *group, const char *name);
+extern struct iommu_domain *iommu_group_get_domain(struct iommu_group *group);
+extern void iommu_group_set_domain(struct iommu_group *group,
+  struct iommu_domain *domain);
 extern int iommu_group_add_device(struct iommu_group *group,
  struct device *dev);
 extern void iommu_group_remove_device(struct device *dev);
@@ -840,6 +843,17 @@ static inline int iommu_group_set_name(struct iommu_group 
*group,
return -ENODEV;
 }
 
+static inline
+struct iommu_domain *iommu_group_get_domain(struct iommu_group *group)
+{
+   return NULL;
+}
+
+static inline void iommu_group_set_domain(struct iommu_group *group,
+ struct iommu_domain *domain)
+{
+}
+
 static inline int iommu_group_add_device(struct iommu_group *group,
 struct device *dev)
 {
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/5] irqchip: Allow QCOM_PDC to be loadable as a permanent module

2020-06-26 Thread John Stultz
On Fri, Jun 26, 2020 at 12:42 AM Stephen Boyd  wrote:
>
> Quoting John Stultz (2020-06-24 17:10:37)
> > diff --git a/drivers/irqchip/qcom-pdc.c b/drivers/irqchip/qcom-pdc.c
> > index 6ae9e1f0819d..3fee8b655da1 100644
> > --- a/drivers/irqchip/qcom-pdc.c
> > +++ b/drivers/irqchip/qcom-pdc.c
> > @@ -430,4 +432,33 @@ static int qcom_pdc_init(struct device_node *node, 
> > struct device_node *parent)
> > return ret;
> >  }
> >
> > +#ifdef MODULE
> > +static int qcom_pdc_probe(struct platform_device *pdev)
> > +{
> > +   struct device_node *np = pdev->dev.of_node;
> > +   struct device_node *parent = of_irq_find_parent(np);
> > +
> > +   return qcom_pdc_init(np, parent);
> > +}
> > +
> > +static const struct of_device_id qcom_pdc_match_table[] = {
> > +   { .compatible = "qcom,pdc" },
> > +   {}
> > +};
> > +MODULE_DEVICE_TABLE(of, qcom_pdc_match_table);
> > +
> > +static struct platform_driver qcom_pdc_driver = {
> > +   .probe = qcom_pdc_probe,
> > +   .driver = {
> > +   .name = "qcom-pdc",
> > +   .of_match_table = qcom_pdc_match_table,
> > +   .suppress_bind_attrs = true,
> > +   },
> > +};
> > +module_platform_driver(qcom_pdc_driver);
> > +#else
> >  IRQCHIP_DECLARE(qcom_pdc, "qcom,pdc", qcom_pdc_init);
>
> Is there any reason to use IRQCHIP_DECLARE if this can work as a
> platform device driver?
>

Hey! Thanks so much for the review!

Mostly it was done this way to minimize the change in the non-module
case. But if you'd rather avoid the #ifdefery I'll respin it without.

thanks
-john
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH net] xsk: remove cheap_dma optimization

2020-06-26 Thread Daniel Borkmann

On 6/26/20 3:43 PM, Björn Töpel wrote:

From: Björn Töpel 

When the AF_XDP buffer allocation API was introduced it had an
optimization, "cheap_dma". The idea was that when the umem was DMA
mapped, the pool also checked whether the mapping required a
synchronization (CPU to device, and vice versa). If not, it would be
marked as "cheap_dma" and the synchronization would be elided.

In [1] Christoph points out that the optimization above breaks the DMA
API abstraction, and should be removed. Further, Christoph points out
that optimizations like this should be done within the DMA mapping
core, and not elsewhere.

Unfortunately this has implications for the packet rate
performance. The AF_XDP rxdrop scenario shows a 9% decrease in packets
per second.

[1] https://lore.kernel.org/netdev/20200626074725.ga21...@lst.de/

Cc: Christoph Hellwig 
Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API")
Signed-off-by: Björn Töpel 


Given there is roughly a ~5 weeks window at max where this removal could
still be applied in the worst case, could we come up with a fix / proposal
first that moves this into the DMA mapping core? If there is something that
can be agreed upon by all parties, then we could avoid re-adding the 9%
slowdown. :/

Thanks,
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/5] docs: IOMMU user API

2020-06-26 Thread Alex Williamson
On Tue, 23 Jun 2020 10:03:53 -0700
Jacob Pan  wrote:

> IOMMU UAPI is newly introduced to support communications between guest
> virtual IOMMU and host IOMMU. There has been lots of discussions on how
> it should work with VFIO UAPI and userspace in general.
> 
> This document is indended to clarify the UAPI design and usage. The
> mechenics of how future extensions should be achieved are also covered
> in this documentation.
> 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> ---
>  Documentation/userspace-api/iommu.rst | 244 
> ++
>  1 file changed, 244 insertions(+)
>  create mode 100644 Documentation/userspace-api/iommu.rst
> 
> diff --git a/Documentation/userspace-api/iommu.rst 
> b/Documentation/userspace-api/iommu.rst
> new file mode 100644
> index ..f9e4ed90a413
> --- /dev/null
> +++ b/Documentation/userspace-api/iommu.rst
> @@ -0,0 +1,244 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. iommu:
> +
> +=
> +IOMMU Userspace API
> +=
> +
> +IOMMU UAPI is used for virtualization cases where communications are
> +needed between physical and virtual IOMMU drivers. For native
> +usage, IOMMU is a system device which does not need to communicate
> +with user space directly.
> +
> +The primary use cases are guest Shared Virtual Address (SVA) and
> +guest IO virtual address (IOVA), wherein a virtual IOMMU (vIOMMU) is
> +required to communicate with the physical IOMMU in the host.
> +
> +.. contents:: :local:
> +
> +Functionalities
> +===
> +Communications of user and kernel involve both directions. The
> +supported user-kernel APIs are as follows:
> +
> +1. Alloc/Free PASID
> +2. Bind/unbind guest PASID (e.g. Intel VT-d)
> +3. Bind/unbind guest PASID table (e.g. ARM sMMU)
> +4. Invalidate IOMMU caches
> +5. Service page requests
> +
> +Requirements
> +
> +The IOMMU UAPIs are generic and extensible to meet the following
> +requirements:
> +
> +1. Emulated and para-virtualised vIOMMUs
> +2. Multiple vendors (Intel VT-d, ARM sMMU, etc.)
> +3. Extensions to the UAPI shall not break existing user space
> +
> +Interfaces
> +==
> +Although the data structures defined in IOMMU UAPI are self-contained,
> +there is no user API functions introduced. Instead, IOMMU UAPI is
> +designed to work with existing user driver frameworks such as VFIO.
> +
> +Extension Rules & Precautions
> +-
> +When IOMMU UAPI gets extended, the data structures can *only* be
> +modified in two ways:
> +
> +1. Adding new fields by re-purposing the padding[] field. No size change.
> +2. Adding new union members at the end. May increase in size.
> +
> +No new fields can be added *after* the variable sized union in that it
> +will break backward compatibility when offset moves. In both cases, a
> +new flag must be accompanied with a new field such that the IOMMU
> +driver can process the data based on the new flag. Version field is
> +only reserved for the unlikely event of UAPI upgrade at its entirety.
> +
> +It's *always* the caller's responsibility to indicate the size of the
> +structure passed by setting argsz appropriately.
> +Though at the same time, argsz is user provided data which is not
> +trusted. The argsz field allows the user to indicate how much data
> +they're providing, it's still the kernel's responsibility to validate
> +whether it's correct and sufficient for the requested operation.
> +
> +Compatibility Checking
> +--
> +When IOMMU UAPI extension results in size increase, user such as VFIO
> +has to handle the following cases:
> +
> +1. User and kernel has exact size match
> +2. An older user with older kernel header (smaller UAPI size) running on a
> +   newer kernel (larger UAPI size)
> +3. A newer user with newer kernel header (larger UAPI size) running
> +   on an older kernel.
> +4. A malicious/misbehaving user pass illegal/invalid size but within
> +   range. The data may contain garbage.

What exactly does vfio need to do to handle these?

> +
> +Feature Checking
> +
> +While launching a guest with vIOMMU, it is important to ensure that host
> +can support the UAPI data structures to be used for vIOMMU-pIOMMU
> +communications. Without upfront compatibility checking, future faults
> +are difficult to report even in normal conditions. For example, TLB
> +invalidations should always succeed. There is no architectural way to
> +report back to the vIOMMU if the UAPI data is incompatible. If that
> +happens, in order to protect IOMMU iosolation guarantee, we have to
> +resort to not giving completion status in vIOMMU. This may result in
> +VM hang.
> +
> +For this reason the following IOMMU UAPIs cannot fail:
> +
> +1. Free PASID
> +2. Unbind guest PASID
> +3. Unbind guest PASID table (SMMU)
> +4. Cache invalidate
> +
> +User applications such as QEMU is expected to import kernel UAPI
> +headers. Backward comp

RE: [PATCH V4 0/3] iommu: Add support to change default domain of an iommu group

2020-06-26 Thread Prakhya, Sai Praneeth
Hi Joerg,

> -Original Message-
> From: Prakhya, Sai Praneeth 
> Sent: Thursday, June 4, 2020 6:32 PM
> To: iommu@lists.linux-foundation.org
> Cc: Prakhya, Sai Praneeth ; Christoph Hellwig
> ; Joerg Roedel ; Raj, Ashok
> ; Will Deacon ; Lu Baolu
> ; Mehta, Sohil ; Robin
> Murphy ; Jacob Pan 
> Subject: [PATCH V4 0/3] iommu: Add support to change default domain of an
> iommu group
> 
> Presently, the default domain of an iommu group is allocated during boot time
> and it cannot be changed later. So, the device would typically be either in
> identity (pass_through) mode or the device would be in DMA mode as long as
> the system is up and running. There is no way to change the default domain 
> type
> dynamically i.e. after booting, a device cannot switch between identity mode
> and DMA mode.
> 
> Assume a use case wherein the privileged user would want to use the device in
> pass-through mode when the device is used for host so that it would be high
> performing. Presently, this is not supported. Hence add support to change the
> default domain of an iommu group dynamically.
> 
> Support this by writing to a sysfs file, namely
> "/sys/kernel/iommu_groups//type".
> 
> Testing:
> 
> Tested by dynamically changing storage device (nvme) from 1. identity mode to
> DMA and making sure file transfer works 2. DMA mode to identity mode and
> making sure file transfer works Tested only for intel_iommu/vt-d. Would
> appreciate if someone could test on AMD and ARM based machines.
> 
> Based on iommu maintainer's 'next' branch.
> 
> Changes from V3:
> 
> 1. Made changes to commit message as suggested by Baolu.
> 2. Don't pass "prev_dom" and "dev" as parameters to
>iommu_change_dev_def_domain(). Instead get them from group.
> 3. Sanitize the logic to validate user default domain type request. The logic
>remains same but is implmented differently.
> 4. Push lot of error checking into iommu_change_dev_def_domain() from
>iommu_group_store_type().
> 5. iommu_change_dev_def_domain() takes/releases group mutex as needed.
> So, it
>shouldn't be called holding a group mutex.
> 6. Use pr_err_ratelimited() instead of pr_err() to avoid DOS attack.

Could you please review this patch set and let me know if you have any comments?

Regards,
Sai
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: the XSK buffer pool needs be to reverted

2020-06-26 Thread Jonathan Lemon
On Fri, Jun 26, 2020 at 09:47:25AM +0200, Christoph Hellwig wrote:
>
> Note that this is somewhat urgent, as various of the APIs that the code
> is abusing are slated to go away for Linux 5.9, so this addition comes
> at a really bad time.

Could you elaborate on what is upcoming here?


Also, on a semi-related note, are there limitations on how many pages
can be left mapped by the iommu?  Some of the page pool work involves
leaving the pages mapped instead of constantly mapping/unmapping them.

On a heavily loaded box with iommu enabled, it seems that quite often
there is contention on the iova_lock.  Are there known issues in this
area?
-- 
Jonathan
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH net] xsk: remove cheap_dma optimization

2020-06-26 Thread Jonathan Lemon
On Fri, Jun 26, 2020 at 03:43:58PM +0200, Björn Töpel wrote:
> From: Björn Töpel 
> 
> When the AF_XDP buffer allocation API was introduced it had an
> optimization, "cheap_dma". The idea was that when the umem was DMA
> mapped, the pool also checked whether the mapping required a
> synchronization (CPU to device, and vice versa). If not, it would be
> marked as "cheap_dma" and the synchronization would be elided.
> 
> In [1] Christoph points out that the optimization above breaks the DMA
> API abstraction, and should be removed. Further, Christoph points out
> that optimizations like this should be done within the DMA mapping
> core, and not elsewhere.
> 
> Unfortunately this has implications for the packet rate
> performance. The AF_XDP rxdrop scenario shows a 9% decrease in packets
> per second.
> 
> [1] https://lore.kernel.org/netdev/20200626074725.ga21...@lst.de/
> 
> Cc: Christoph Hellwig 
> Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API")
> Signed-off-by: Björn Töpel 

Acked-by: Jonathan Lemon 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2

2020-06-26 Thread Jordan Crouse
Support auxiliary domains for arm-smmu-v2 to initialize and support
multiple pagetables for a single SMMU context bank. Since the smmu-v2
hardware doesn't have any built in support for switching the pagetable
base it is left as an exercise to the caller to actually use the pagetable.

Aux domains are supported if split pagetable (TTBR1) support has been
enabled on the master domain.  Each auxiliary domain will reuse the
configuration of the master domain. By default the a domain with TTBR1
support will have the TTBR0 region disabled so the first attached aux
domain will enable the TTBR0 region in the hardware and conversely the
last domain to be detached will disable TTBR0 translations.  All subsequent
auxiliary domains create a pagetable but not touch the hardware.

The leaf driver will be able to query the physical address of the
pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
address with whatever means it has to switch the pagetable base.

Following is a pseudo code example of how a domain can be created

 /* Check to see if aux domains are supported */
 if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
 iommu = iommu_domain_alloc(...);

 if (iommu_aux_attach_device(domain, dev))
 return FAIL;

/* Save the base address of the pagetable for use by the driver
iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
 }

Then 'domain' can be used like any other iommu domain to map and
unmap iova addresses in the pagetable.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu.c | 219 ---
 drivers/iommu/arm-smmu.h |   1 +
 2 files changed, 204 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 060139452c54..ce6d654301bf 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -91,6 +91,7 @@ struct arm_smmu_cb {
u32 tcr[2];
u32 mair[2];
struct arm_smmu_cfg *cfg;
+   atomic_taux;
 };
 
 struct arm_smmu_master_cfg {
@@ -667,6 +668,86 @@ static void arm_smmu_write_context_bank(struct 
arm_smmu_device *smmu, int idx)
arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
 
+/*
+ * Update the context context bank to enable TTBR0. Assumes AARCH64 S1
+ * configuration.
+ */
+static void arm_smmu_context_set_ttbr0(struct arm_smmu_cb *cb,
+   struct io_pgtable_cfg *pgtbl_cfg)
+{
+   u32 tcr = cb->tcr[0];
+
+   /* Add the TCR configuration from the new pagetable config */
+   tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
+
+   /* Make sure that both TTBR0 and TTBR1 are enabled */
+   tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
+
+   /* Udate the TCR register */
+   cb->tcr[0] = tcr;
+
+   /* Program the new TTBR0 */
+   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+}
+
+/*
+ * Thus function assumes that the current model only allows aux domains for
+ * AARCH64 S1 configurations
+ */
+static int arm_smmu_aux_init_domain_context(struct iommu_domain *domain,
+   struct arm_smmu_device *smmu, struct arm_smmu_cfg *master)
+{
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct io_pgtable_ops *pgtbl_ops;
+   struct io_pgtable_cfg pgtbl_cfg;
+
+   mutex_lock(&smmu_domain->init_mutex);
+
+   /* Copy the configuration from the master */
+   memcpy(&smmu_domain->cfg, master, sizeof(smmu_domain->cfg));
+
+   smmu_domain->flush_ops = &arm_smmu_s1_tlb_ops;
+   smmu_domain->smmu = smmu;
+
+   pgtbl_cfg = (struct io_pgtable_cfg) {
+   .pgsize_bitmap = smmu->pgsize_bitmap,
+   .ias = smmu->va_size,
+   .oas = smmu->ipa_size,
+   .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
+   .tlb = smmu_domain->flush_ops,
+   .iommu_dev = smmu->dev,
+   .quirks = 0,
+   };
+
+   if (smmu_domain->non_strict)
+   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+
+   pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &pgtbl_cfg,
+   smmu_domain);
+   if (!pgtbl_ops) {
+   mutex_unlock(&smmu_domain->init_mutex);
+   return -ENOMEM;
+   }
+
+   domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+
+   domain->geometry.aperture_end = (1UL << smmu->va_size) - 1;
+   domain->geometry.force_aperture = true;
+
+   /* enable TTBR0 when the the first aux domain is attached */
+   if (atomic_inc_return(&smmu->cbs[master->cbndx].aux) == 1) {
+   arm_smmu_context_set_ttbr0(&smmu->cbs[master->cbndx],
+   &pgtbl_cfg);
+   arm_smmu_write_context_bank(smmu, master->cbndx);
+   }
+
+   smmu_domain->pgtbl_ops = pgtbl_ops;
+   mutex_unl

[PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables

2020-06-26 Thread Jordan Crouse


This is a new refresh of support for auxiliary domains for arm-smmu-v2
and per-instance pagetables for drm/msm. The big change here from past
efforts is that outside of creating a single aux-domain to enable TTBR0
all of the per-instance pagetables are created and managed exclusively
in drm/msm without involving the arm-smmu driver. This fits in with the
suggested model of letting the GPU hardware do what it needs and leave the
arm-smmu driver blissfully unaware.

Almost. In order to set up the io-pgtable properly in drm/msm we need to
query the pagetable configuration from the current active domain and we need to
rely on the iommu API to flush TLBs after a unmap. In the future we can optimize
this in the drm/msm driver to track the state of the TLBs but for now the big
hammer lets us get off the ground.

This series is built on the split pagetable support [1].

[1] https://patchwork.kernel.org/patch/11628543/

v2: Remove unneeded cruft in the a6xx page switch sequence

Jordan Crouse (6):
  iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  iommu/io-pgtable: Allow a pgtable implementation to skip TLB
operations
  iommu/arm-smmu: Add a domain attribute to pass the pagetable config
  drm/msm: Add support to create a local pagetable
  drm/msm: Add support for address space instances
  drm/msm/a6xx: Add support for per-instance pagetables

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  43 +
 drivers/gpu/drm/msm/msm_drv.c |  15 +-
 drivers/gpu/drm/msm/msm_drv.h |   4 +
 drivers/gpu/drm/msm/msm_gem_vma.c |   9 +
 drivers/gpu/drm/msm/msm_gpu.c |  17 ++
 drivers/gpu/drm/msm/msm_gpu.h |   5 +
 drivers/gpu/drm/msm/msm_gpummu.c  |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c   | 180 +++-
 drivers/gpu/drm/msm/msm_mmu.h |  16 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h  |   1 +
 drivers/iommu/arm-smmu.c  | 231 --
 drivers/iommu/arm-smmu.h  |   1 +
 include/linux/io-pgtable.h|  11 +-
 include/linux/iommu.h |   1 +
 14 files changed, 507 insertions(+), 29 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 5/6] drm/msm: Add support for address space instances

2020-06-26 Thread Jordan Crouse
Add support for allocating an address space instance. Targets that support
per-instance pagetables should implement their own function to allocate a
new instance. The default will return the existing generic address space.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/msm_drv.c | 15 +--
 drivers/gpu/drm/msm/msm_drv.h |  4 
 drivers/gpu/drm/msm/msm_gem_vma.c |  9 +
 drivers/gpu/drm/msm/msm_gpu.c | 17 +
 drivers/gpu/drm/msm/msm_gpu.h |  5 +
 5 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 6c57cc72d627..092c49552ddd 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -588,7 +588,7 @@ static int context_init(struct drm_device *dev, struct 
drm_file *file)
 
msm_submitqueue_init(dev, ctx);
 
-   ctx->aspace = priv->gpu ? priv->gpu->aspace : NULL;
+   ctx->aspace = msm_gpu_address_space_instance(priv->gpu);
file->driver_priv = ctx;
 
return 0;
@@ -607,6 +607,8 @@ static int msm_open(struct drm_device *dev, struct drm_file 
*file)
 static void context_close(struct msm_file_private *ctx)
 {
msm_submitqueue_close(ctx);
+
+   msm_gem_address_space_put(ctx->aspace);
kfree(ctx);
 }
 
@@ -771,18 +773,19 @@ static int msm_ioctl_gem_cpu_fini(struct drm_device *dev, 
void *data,
 }
 
 static int msm_ioctl_gem_info_iova(struct drm_device *dev,
-   struct drm_gem_object *obj, uint64_t *iova)
+   struct drm_file *file, struct drm_gem_object *obj,
+   uint64_t *iova)
 {
-   struct msm_drm_private *priv = dev->dev_private;
+   struct msm_file_private *ctx = file->driver_priv;
 
-   if (!priv->gpu)
+   if (!ctx->aspace)
return -EINVAL;
 
/*
 * Don't pin the memory here - just get an address so that userspace can
 * be productive
 */
-   return msm_gem_get_iova(obj, priv->gpu->aspace, iova);
+   return msm_gem_get_iova(obj, ctx->aspace, iova);
 }
 
 static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
@@ -821,7 +824,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void 
*data,
args->value = msm_gem_mmap_offset(obj);
break;
case MSM_INFO_GET_IOVA:
-   ret = msm_ioctl_gem_info_iova(dev, obj, &args->value);
+   ret = msm_ioctl_gem_info_iova(dev, file, obj, &args->value);
break;
case MSM_INFO_SET_NAME:
/* length check should leave room for terminating null: */
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index e2d6a6056418..983a8b7e5a74 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -249,6 +249,10 @@ int msm_gem_map_vma(struct msm_gem_address_space *aspace,
 void msm_gem_close_vma(struct msm_gem_address_space *aspace,
struct msm_gem_vma *vma);
 
+
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace);
+
 void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
 
 struct msm_gem_address_space *
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c 
b/drivers/gpu/drm/msm/msm_gem_vma.c
index 5f6a11211b64..29cc1305cf37 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -27,6 +27,15 @@ void msm_gem_address_space_put(struct msm_gem_address_space 
*aspace)
kref_put(&aspace->kref, msm_gem_address_space_destroy);
 }
 
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace)
+{
+   if (!IS_ERR_OR_NULL(aspace))
+   kref_get(&aspace->kref);
+
+   return aspace;
+}
+
 /* Actually unmap memory for the vma */
 void msm_gem_purge_vma(struct msm_gem_address_space *aspace,
struct msm_gem_vma *vma)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 86a138641477..0fa614430799 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -821,6 +821,23 @@ static int get_clocks(struct platform_device *pdev, struct 
msm_gpu *gpu)
return 0;
 }
 
+/* Return a new address space instance */
+struct msm_gem_address_space *
+msm_gpu_address_space_instance(struct msm_gpu *gpu)
+{
+   if (!gpu)
+   return NULL;
+
+   /*
+* If the GPU doesn't support instanced address spaces return the
+* default address space
+*/
+   if (!gpu->funcs->address_space_instance)
+   return msm_gem_address_space_get(gpu->aspace);
+
+   return gpu->funcs->address_space_instance(gpu);
+}
+
 int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
const char *name, struct msm_gpu_config *config)
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gp

[PATCH v2 4/6] drm/msm: Add support to create a local pagetable

2020-06-26 Thread Jordan Crouse
Add support to create a io-pgtable for use by targets that support
per-instance pagetables.  In order to support per-instance pagetables the
GPU SMMU device needs to have the qcom,adreno-smmu compatible string and
split pagetables and auxiliary domains need to be supported and enabled.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/msm_gpummu.c |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c  | 180 ++-
 drivers/gpu/drm/msm/msm_mmu.h|  16 ++-
 3 files changed, 195 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
index 310a31b05faa..aab121f4beb7 100644
--- a/drivers/gpu/drm/msm/msm_gpummu.c
+++ b/drivers/gpu/drm/msm/msm_gpummu.c
@@ -102,7 +102,7 @@ struct msm_mmu *msm_gpummu_new(struct device *dev, struct 
msm_gpu *gpu)
}
 
gpummu->gpu = gpu;
-   msm_mmu_init(&gpummu->base, dev, &funcs);
+   msm_mmu_init(&gpummu->base, dev, &funcs, MSM_MMU_GPUMMU);
 
return &gpummu->base;
 }
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 1b6635504069..f455c597f76d 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -4,15 +4,192 @@
  * Author: Rob Clark 
  */
 
+#include 
 #include "msm_drv.h"
 #include "msm_mmu.h"
 
 struct msm_iommu {
struct msm_mmu base;
struct iommu_domain *domain;
+   struct iommu_domain *aux_domain;
 };
+
 #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
 
+struct msm_iommu_pagetable {
+   struct msm_mmu base;
+   struct msm_mmu *parent;
+   struct io_pgtable_ops *pgtbl_ops;
+   phys_addr_t ttbr;
+   u32 asid;
+};
+
+static struct msm_iommu_pagetable *to_pagetable(struct msm_mmu *mmu)
+{
+   return container_of(mmu, struct msm_iommu_pagetable, base);
+}
+
+static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
+   size_t size)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+   size_t unmapped = 0;
+
+   /* Unmap the block one page at a time */
+   while (size) {
+   unmapped += ops->unmap(ops, iova, 4096, NULL);
+   iova += 4096;
+   size -= 4096;
+   }
+
+   iommu_flush_tlb_all(to_msm_iommu(pagetable->parent)->domain);
+
+   return (unmapped == size) ? 0 : -EINVAL;
+}
+
+static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
+   struct sg_table *sgt, size_t len, int prot)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+   struct scatterlist *sg;
+   size_t mapped = 0;
+   u64 addr = iova;
+   unsigned int i;
+
+   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   size_t size = sg->length;
+   phys_addr_t phys = sg_phys(sg);
+
+   /* Map the block one page at a time */
+   while (size) {
+   if (ops->map(ops, addr, phys, 4096, prot)) {
+   msm_iommu_pagetable_unmap(mmu, iova, mapped);
+   return -EINVAL;
+   }
+
+   phys += 4096;
+   addr += 4096;
+   size -= 4096;
+   mapped += 4096;
+   }
+   }
+
+   return 0;
+}
+
+static void msm_iommu_pagetable_destroy(struct msm_mmu *mmu)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+
+   free_io_pgtable_ops(pagetable->pgtbl_ops);
+   kfree(pagetable);
+}
+
+/*
+ * Given a parent device, create and return an aux domain. This will enable the
+ * TTBR0 region
+ */
+static struct iommu_domain *msm_iommu_get_aux_domain(struct msm_mmu *parent)
+{
+   struct msm_iommu *iommu = to_msm_iommu(parent);
+   struct iommu_domain *domain;
+   int ret;
+
+   if (iommu->aux_domain)
+   return iommu->aux_domain;
+
+   if (!iommu_dev_has_feature(parent->dev, IOMMU_DEV_FEAT_AUX))
+   return ERR_PTR(-ENODEV);
+
+   domain = iommu_domain_alloc(&platform_bus_type);
+   if (!domain)
+   return ERR_PTR(-ENODEV);
+
+   ret = iommu_aux_attach_device(domain, parent->dev);
+   if (ret) {
+   iommu_domain_free(domain);
+   return ERR_PTR(ret);
+   }
+
+   iommu->aux_domain = domain;
+   return domain;
+}
+
+int msm_iommu_pagetable_params(struct msm_mmu *mmu,
+   phys_addr_t *ttbr, int *asid)
+{
+   struct msm_iommu_pagetable *pagetable;
+
+   if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
+   return -EINVAL;
+
+   pagetable = to_pagetable(mmu);
+
+   if (ttbr)
+   *ttbr = pagetable->ttbr;
+
+   if (asid)
+   *asid = pagetable->asid;
+
+   return 0;
+}
+
+static const struct msm_mmu_funcs pagetable_funcs = {
+

[PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables

2020-06-26 Thread Jordan Crouse
Add support for using per-instance pagetables if all the dependencies are
available.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 43 +++
 drivers/gpu/drm/msm/msm_ringbuffer.h  |  1 +
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index aa53f47b7e8b..95ed2ceac121 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -79,6 +79,34 @@ static void get_stats_counter(struct msm_ringbuffer *ring, 
u32 counter,
OUT_RING(ring, upper_32_bits(iova));
 }
 
+static void a6xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer 
*ring,
+   struct msm_file_private *ctx)
+{
+   phys_addr_t ttbr;
+   u32 asid;
+
+   if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
+   return;
+
+   /* Execute the table update */
+   OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
+   OUT_RING(ring, lower_32_bits(ttbr));
+   OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
+   /* CONTEXTIDR is currently unused */
+   OUT_RING(ring, 0);
+   /* CONTEXTBANK is currently unused */
+   OUT_RING(ring, 0);
+
+   /*
+* Write the new TTBR0 to the memstore. This is good for debugging.
+*/
+   OUT_PKT7(ring, CP_MEM_WRITE, 4);
+   OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
+   OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
+   OUT_RING(ring, lower_32_bits(ttbr));
+   OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
+}
+
 static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
struct msm_file_private *ctx)
 {
@@ -89,6 +117,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
struct msm_ringbuffer *ring = submit->ring;
unsigned int i;
 
+   a6xx_set_pagetable(gpu, ring, ctx);
+
get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
rbmemptr_stats(ring, index, cpcycles_start));
 
@@ -872,6 +902,18 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
return (unsigned long)busy_time;
 }
 
+struct msm_gem_address_space *a6xx_address_space_instance(struct msm_gpu *gpu)
+{
+   struct msm_mmu *mmu;
+
+   mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
+   if (IS_ERR(mmu))
+   return msm_gem_address_space_get(gpu->aspace);
+
+   return msm_gem_address_space_create(mmu,
+   "gpu", 0x1ULL, 0x1ULL);
+}
+
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
@@ -895,6 +937,7 @@ static const struct adreno_gpu_funcs funcs = {
.gpu_state_put = a6xx_gpu_state_put,
 #endif
.create_address_space = adreno_iommu_create_address_space,
+   .address_space_instance = a6xx_address_space_instance,
},
.get_timestamp = a6xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h 
b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 7764373d0ed2..0987d6bf848c 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -31,6 +31,7 @@ struct msm_rbmemptrs {
volatile uint32_t fence;
 
volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
+   volatile u64 ttbr0;
 };
 
 struct msm_ringbuffer {
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 3/6] iommu/arm-smmu: Add a domain attribute to pass the pagetable config

2020-06-26 Thread Jordan Crouse
The Adreno GPU has the capacity to manage its own pagetables and switch
them dynamically from the hardware. Add a domain attribute for arm-smmu-v2
to get the default pagetable configuration so that the GPU driver can match
the format for its own pagetables.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu.c | 12 
 include/linux/iommu.h|  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index ce6d654301bf..4bd247dfd703 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1714,6 +1714,18 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case DOMAIN_ATTR_NESTING:
*(int *)data = (smmu_domain->stage == 
ARM_SMMU_DOMAIN_NESTED);
return 0;
+   case DOMAIN_ATTR_PGTABLE_CFG: {
+   struct io_pgtable *pgtable;
+   struct io_pgtable_cfg *dest = data;
+
+   if (!smmu_domain->pgtbl_ops)
+   return -ENODEV;
+
+   pgtable = 
io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+
+   memcpy(dest, &pgtable->cfg, sizeof(*dest));
+   return 0;
+   }
default:
return -ENODEV;
}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5f0b7859d2eb..2388117641f1 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -124,6 +124,7 @@ enum iommu_attr {
DOMAIN_ATTR_FSL_PAMUV1,
DOMAIN_ATTR_NESTING,/* two stages of translation */
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+   DOMAIN_ATTR_PGTABLE_CFG,
DOMAIN_ATTR_MAX,
 };
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations

2020-06-26 Thread Jordan Crouse
Allow a io-pgtable implementation to skip TLB operations by checking for
NULL pointers in the helper functions. It will be up to to the owner
of the io-pgtable instance to make sure that they independently handle
the TLB correctly.

Signed-off-by: Jordan Crouse 
---

 include/linux/io-pgtable.h | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 53d53c6c2be9..bbed1d3925ba 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -210,21 +210,24 @@ struct io_pgtable {
 
 static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
 {
-   iop->cfg.tlb->tlb_flush_all(iop->cookie);
+   if (iop->cfg.tlb)
+   iop->cfg.tlb->tlb_flush_all(iop->cookie);
 }
 
 static inline void
 io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
  size_t size, size_t granule)
 {
-   iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
+   if (iop->cfg.tlb)
+   iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
 }
 
 static inline void
 io_pgtable_tlb_flush_leaf(struct io_pgtable *iop, unsigned long iova,
  size_t size, size_t granule)
 {
-   iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
+   if (iop->cfg.tlb)
+   iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
 }
 
 static inline void
@@ -232,7 +235,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
struct iommu_iotlb_gather * gather, unsigned long iova,
size_t granule)
 {
-   if (iop->cfg.tlb->tlb_add_page)
+   if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
 }
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 6/7] drm/msm: Set the global virtual address range from the IOMMU domain

2020-06-26 Thread Jordan Crouse
Use the aperture settings from the IOMMU domain to set up the virtual
address range for the GPU. This allows us to transparently deal with
IOMMU side features (like split pagetables).

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 13 +++--
 drivers/gpu/drm/msm/msm_iommu.c |  7 +++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 5db06b590943..3e717c1ebb7f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -192,9 +192,18 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
struct iommu_domain *iommu = iommu_domain_alloc(&platform_bus_type);
struct msm_mmu *mmu = msm_iommu_new(&pdev->dev, iommu);
struct msm_gem_address_space *aspace;
+   u64 start, size;
 
-   aspace = msm_gem_address_space_create(mmu, "gpu", SZ_16M,
-   0x - SZ_16M);
+   /*
+* Use the aperture start or SZ_16M, whichever is greater. This will
+* ensure that we align with the allocated pagetable range while still
+* allowing room in the lower 32 bits for GMEM and whatnot
+*/
+   start = max_t(u64, SZ_16M, iommu->geometry.aperture_start);
+   size = iommu->geometry.aperture_end - start + 1;
+
+   aspace = msm_gem_address_space_create(mmu, "gpu",
+   start & GENMASK(48, 0), size);
 
if (IS_ERR(aspace) && !IS_ERR(mmu))
mmu->funcs->destroy(mmu);
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 3a381a9674c9..1b6635504069 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -36,6 +36,10 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
struct msm_iommu *iommu = to_msm_iommu(mmu);
size_t ret;
 
+   /* The arm-smmu driver expects the addresses to be sign extended */
+   if (iova & BIT_ULL(48))
+   iova |= GENMASK_ULL(63, 49);
+
ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
WARN_ON(!ret);
 
@@ -46,6 +50,9 @@ static int msm_iommu_unmap(struct msm_mmu *mmu, uint64_t 
iova, size_t len)
 {
struct msm_iommu *iommu = to_msm_iommu(mmu);
 
+   if (iova & BIT_ULL(48))
+   iova |= GENMASK_ULL(63, 49);
+
iommu_unmap(iommu->domain, iova, len);
 
return 0;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 4/7] iommu/arm-smmu: Add a pointer to the attached device to smmu_domain

2020-06-26 Thread Jordan Crouse
Add a link to the pointer to the struct device that is attached to a
domain. This makes it easy to get the pointer if it is needed in the
implementation specific code.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu.c | 6 --
 drivers/iommu/arm-smmu.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 048de2681670..060139452c54 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -668,7 +668,8 @@ static void arm_smmu_write_context_bank(struct 
arm_smmu_device *smmu, int idx)
 }
 
 static int arm_smmu_init_domain_context(struct iommu_domain *domain,
-   struct arm_smmu_device *smmu)
+   struct arm_smmu_device *smmu,
+   struct device *dev)
 {
int irq, start, ret = 0;
unsigned long ias, oas;
@@ -801,6 +802,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
cfg->asid = cfg->cbndx;
 
smmu_domain->smmu = smmu;
+   smmu_domain->dev = dev;
 
pgtbl_cfg = (struct io_pgtable_cfg) {
.pgsize_bitmap  = smmu->pgsize_bitmap,
@@ -1190,7 +1192,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
return ret;
 
/* Ensure that the domain is finalised */
-   ret = arm_smmu_init_domain_context(domain, smmu);
+   ret = arm_smmu_init_domain_context(domain, smmu, dev);
if (ret < 0)
goto rpm_put;
 
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 5f2de20e883b..d33cfe26b2f5 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -345,6 +345,7 @@ struct arm_smmu_domain {
struct mutexinit_mutex; /* Protects smmu pointer */
spinlock_t  cb_lock; /* Serialises ATS1* ops and 
TLB syncs */
struct iommu_domain domain;
+   struct device   *dev;   /* Device attached to this 
domain */
 };
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 3/7] dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU

2020-06-26 Thread Jordan Crouse
Every Qcom Adreno GPU has an embedded SMMU for its own use. These
devices depend on unique features such as split pagetables,
different stall/halt requirements and other settings. Identify them
with a compatible string so that they can be identified in the
arm-smmu implementation specific code.

Reviewed-by: Rob Herring 
Signed-off-by: Jordan Crouse 
---

 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index d7ceb4c34423..e52a1b146c97 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -38,6 +38,10 @@ properties:
   - qcom,sc7180-smmu-500
   - qcom,sdm845-smmu-500
   - const: arm,mmu-500
+  - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
+items:
+  - const: qcom,adreno-smmu
+  - const: qcom,smmu-v2
   - items:
   - const: arm,mmu-500
   - const: arm,smmu-v2
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 5/7] iommu/arm-smmu: Add implementation for the adreno GPU SMMU

2020-06-26 Thread Jordan Crouse
Add a special implementation for the SMMU attached to most Adreno GPU
target triggered from the qcom,adreno-gpu-smmu compatible string. When
selected the driver will attempt to enable split pagetables.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu-impl.c |  3 +++
 drivers/iommu/arm-smmu-qcom.c | 45 +--
 drivers/iommu/arm-smmu.h  |  1 +
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index a20e426d81ac..309675cf6699 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -176,5 +176,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
of_device_is_compatible(np, "qcom,sc7180-smmu-500"))
return qcom_smmu_impl_init(smmu);
 
+   if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
+   return qcom_adreno_smmu_impl_init(smmu);
+
return smmu;
 }
diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c
index cf01d0215a39..3248d44ec6d5 100644
--- a/drivers/iommu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm-smmu-qcom.c
@@ -12,6 +12,29 @@ struct qcom_smmu {
struct arm_smmu_device smmu;
 };
 
+static bool qcom_adreno_smmu_is_gpu_device(struct arm_smmu_domain *smmu_domain)
+{
+   return of_device_is_compatible(smmu_domain->dev->of_node, 
"qcom,adreno");
+}
+
+static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *pgtbl_cfg)
+{
+   /* TTBR1 is only for the GPU stream ID and not the GMU */
+   if (!qcom_adreno_smmu_is_gpu_device(smmu_domain))
+   return 0;
+   /*
+* All targets that use the qcom,adreno-smmu compatible string *should*
+* be AARCH64 stage 1 but double check because the arm-smmu code assumes
+* that is the case when the TTBR1 quirk is enabled
+*/
+   if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1) &&
+   (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
+   pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
+
+   return 0;
+}
+
 static const struct of_device_id qcom_smmu_client_of_match[] = {
{ .compatible = "qcom,adreno" },
{ .compatible = "qcom,mdp4" },
@@ -65,7 +88,15 @@ static const struct arm_smmu_impl qcom_smmu_impl = {
.reset = qcom_smmu500_reset,
 };
 
-struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
+static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
+   .init_context = qcom_adreno_smmu_init_context,
+   .def_domain_type = qcom_smmu_def_domain_type,
+   .reset = qcom_smmu500_reset,
+};
+
+
+static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
+   const struct arm_smmu_impl *impl)
 {
struct qcom_smmu *qsmmu;
 
@@ -75,8 +106,18 @@ struct arm_smmu_device *qcom_smmu_impl_init(struct 
arm_smmu_device *smmu)
 
qsmmu->smmu = *smmu;
 
-   qsmmu->smmu.impl = &qcom_smmu_impl;
+   qsmmu->smmu.impl = impl;
devm_kfree(smmu->dev, smmu);
 
return &qsmmu->smmu;
 }
+
+struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
+{
+   return qcom_smmu_create(smmu, &qcom_smmu_impl);
+}
+
+struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device 
*smmu)
+{
+   return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
+}
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index d33cfe26b2f5..c417814f1d98 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -466,6 +466,7 @@ static inline void arm_smmu_writeq(struct arm_smmu_device 
*smmu, int page,
 
 struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
 struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
+struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device 
*smmu);
 
 int arm_mmu500_reset(struct arm_smmu_device *smmu);
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 0/7] iommu/arm-smmu: Enable split pagetable support

2020-06-26 Thread Jordan Crouse
Another iteration of the split-pagetable support for arm-smmu and the Adreno GPU
SMMU. After email discussions [1] we opted to make a arm-smmu implementation for
specifically for the Adreno GPU and use that to enable split pagetable support
and later other implementation specific bits that we need.

On the hardware side this is very close to the same code from before [2] only
the TTBR1 quirk is turned on by the implementation and not a domain attribute.
In drm/msm we use the returned size of the aperture as a clue to let us know
which virtual address space we should use for global memory objects.

There are two open items that you should be aware of. First, in the
implementation specific code we have to check the compatible string of the
device so that we only enable TTBR1 for the GPU (SID 0) and not the GMU (SID 4).
I went back and forth trying to decide if I wanted to use the compatible string
or the SID as the filter and settled on the compatible string but I could be
talked out of it.

The other open item is that in drm/msm the hardware only uses 49 bits of the
address space but arm-smmu expects the address to be sign extended all the way
to 64 bits. This isn't a problem normally unless you look at the hardware
registers that contain a IOVA and then the upper bits will be zero. I opted to
restrict the internal drm/msm IOVA range to only 49 bits and then sign extend
right before calling iommu_map / iommu_unmap. This is a bit wonky but I thought
that matching the hardware would be less confusing when debugging a hang.

v9: Fix bot-detected merge conflict
v7: Add attached device to smmu_domain to pass to implementation specific
functions

[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-May/044537.html
[2] https://patchwork.kernel.org/patch/11482591/


Jordan Crouse (7):
  iommu/arm-smmu: Pass io-pgtable config to implementation specific
function
  iommu/arm-smmu: Add support for split pagetables
  dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
  iommu/arm-smmu: Add a pointer to the attached device to smmu_domain
  iommu/arm-smmu: Add implementation for the adreno GPU SMMU
  drm/msm: Set the global virtual address range from the IOMMU domain
  arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

 .../devicetree/bindings/iommu/arm,smmu.yaml   |  4 ++
 arch/arm64/boot/dts/qcom/sdm845.dtsi  |  2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   | 13 +-
 drivers/gpu/drm/msm/msm_iommu.c   |  7 +++
 drivers/iommu/arm-smmu-impl.c |  6 ++-
 drivers/iommu/arm-smmu-qcom.c | 45 ++-
 drivers/iommu/arm-smmu.c  | 38 +++-
 drivers/iommu/arm-smmu.h  | 30 ++---
 8 files changed, 120 insertions(+), 25 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 2/7] iommu/arm-smmu: Add support for split pagetables

2020-06-26 Thread Jordan Crouse
Enable TTBR1 for a context bank if IO_PGTABLE_QUIRK_ARM_TTBR1 is selected
by the io-pgtable configuration.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu.c | 21 -
 drivers/iommu/arm-smmu.h | 25 +++--
 2 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8a3a6c8c887a..048de2681670 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -555,11 +555,15 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
cb->ttbr[1] = 0;
} else {
-   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-   cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID,
- cfg->asid);
+   cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
+   cfg->asid);
cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
-cfg->asid);
+   cfg->asid);
+
+   if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+   cb->ttbr[1] |= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   else
+   cb->ttbr[0] |= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
}
} else {
cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
@@ -824,7 +828,14 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
 
/* Update the domain's page sizes to reflect the page table format */
domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
-   domain->geometry.aperture_end = (1UL << ias) - 1;
+
+   if (pgtbl_cfg.quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
+   domain->geometry.aperture_start = ~0UL << ias;
+   domain->geometry.aperture_end = ~0UL;
+   } else {
+   domain->geometry.aperture_end = (1UL << ias) - 1;
+   }
+
domain->geometry.force_aperture = true;
 
/* Initialise the context bank with our page table cfg */
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 38b041530a4f..5f2de20e883b 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -168,10 +168,12 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_TCR0x30
 #define ARM_SMMU_TCR_EAE   BIT(31)
 #define ARM_SMMU_TCR_EPD1  BIT(23)
+#define ARM_SMMU_TCR_A1BIT(22)
 #define ARM_SMMU_TCR_TG0   GENMASK(15, 14)
 #define ARM_SMMU_TCR_SH0   GENMASK(13, 12)
 #define ARM_SMMU_TCR_ORGN0 GENMASK(11, 10)
 #define ARM_SMMU_TCR_IRGN0 GENMASK(9, 8)
+#define ARM_SMMU_TCR_EPD0  BIT(7)
 #define ARM_SMMU_TCR_T0SZ  GENMASK(5, 0)
 
 #define ARM_SMMU_VTCR_RES1 BIT(31)
@@ -347,12 +349,23 @@ struct arm_smmu_domain {
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
 {
-   return ARM_SMMU_TCR_EPD1 |
-  FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
-  FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
-  FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
-  FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
-  FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+   u32 tcr = FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
+   FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
+   FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
+   FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
+   FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+
+   /*
+   * When TTBR1 is selected shift the TCR fields by 16 bits and disable
+   * translation in TTBR0
+   */
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
+   tcr = (tcr << 16) & ~ARM_SMMU_TCR_A1;
+   tcr |= ARM_SMMU_TCR_EPD0;
+   } else
+   tcr |= ARM_SMMU_TCR_EPD1;
+
+   return tcr;
 }
 
 static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 7/7] arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

2020-06-26 Thread Jordan Crouse
Set the qcom,adreno-smmu compatible string for the GPU SMMU to enable
split pagetables.

Signed-off-by: Jordan Crouse 
---

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 8eb5a31346d2..8b15cd74e9ba 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -3556,7 +3556,7 @@
};
 
adreno_smmu: iommu@504 {
-   compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+   compatible = "qcom,adreno-smmu", "qcom,smmu-v2";
reg = <0 0x504 0 0x1>;
#iommu-cells = <1>;
#global-interrupts = <2>;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 1/7] iommu/arm-smmu: Pass io-pgtable config to implementation specific function

2020-06-26 Thread Jordan Crouse
Construct the io-pgtable config before calling the implementation specific
init_context function and pass it so the implementation specific function
can get a chance to change it before the io-pgtable is created.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu-impl.c |  3 ++-
 drivers/iommu/arm-smmu.c  | 11 ++-
 drivers/iommu/arm-smmu.h  |  3 ++-
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index c75b9d957b70..a20e426d81ac 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -68,7 +68,8 @@ static int cavium_cfg_probe(struct arm_smmu_device *smmu)
return 0;
 }
 
-static int cavium_init_context(struct arm_smmu_domain *smmu_domain)
+static int cavium_init_context(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *pgtbl_cfg)
 {
struct cavium_smmu *cs = container_of(smmu_domain->smmu,
  struct cavium_smmu, smmu);
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 243bc4cb2705..8a3a6c8c887a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -797,11 +797,6 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
cfg->asid = cfg->cbndx;
 
smmu_domain->smmu = smmu;
-   if (smmu->impl && smmu->impl->init_context) {
-   ret = smmu->impl->init_context(smmu_domain);
-   if (ret)
-   goto out_unlock;
-   }
 
pgtbl_cfg = (struct io_pgtable_cfg) {
.pgsize_bitmap  = smmu->pgsize_bitmap,
@@ -812,6 +807,12 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
.iommu_dev  = smmu->dev,
};
 
+   if (smmu->impl && smmu->impl->init_context) {
+   ret = smmu->impl->init_context(smmu_domain, &pgtbl_cfg);
+   if (ret)
+   goto out_unlock;
+   }
+
if (smmu_domain->non_strict)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index d172c024be61..38b041530a4f 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -383,7 +383,8 @@ struct arm_smmu_impl {
u64 val);
int (*cfg_probe)(struct arm_smmu_device *smmu);
int (*reset)(struct arm_smmu_device *smmu);
-   int (*init_context)(struct arm_smmu_domain *smmu_domain);
+   int (*init_context)(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *cfg);
void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
 int status);
int (*def_domain_type)(struct device *dev);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [git pull] IOMMU Fixes for Linux v5.8-rc2

2020-06-26 Thread pr-tracker-bot
The pull request you sent on Fri, 26 Jun 2020 16:47:01 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
> tags/iommu-fixes-v5.8-rc2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/bd37cdf8ba1b56a968173560314398d5d3b2d37a

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/arm-smmu: Mark qcom_smmu_client_of_match as possibly unused

2020-06-26 Thread Jordan Crouse
On Mon, Jun 08, 2020 at 04:13:08PM +0100, Will Deacon wrote:
> On Thu, Jun 04, 2020 at 02:39:04PM -0600, Jordan Crouse wrote:
> > When CONFIG_OF=n of_match_device() gets pre-processed out of existence
> > leaving qcom-smmu_client_of_match unused. Mark it as possibly unused to
> > keep the compiler from warning in that case.
> > 
> > Fixes: 0e764a01015d ("iommu/arm-smmu: Allow client devices to select direct 
> > mapping")
> > Reported-by: kbuild test robot 
> > Acked-by: Will Deacon 
> > Signed-off-by: Jordan Crouse 
> > ---
> > 
> >  drivers/iommu/arm-smmu-qcom.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c
> > index cf01d0215a39..be4318044f96 100644
> > --- a/drivers/iommu/arm-smmu-qcom.c
> > +++ b/drivers/iommu/arm-smmu-qcom.c
> > @@ -12,7 +12,7 @@ struct qcom_smmu {
> > struct arm_smmu_device smmu;
> >  };
> >  
> > -static const struct of_device_id qcom_smmu_client_of_match[] = {
> > +static const struct of_device_id qcom_smmu_client_of_match[] 
> > __maybe_unused = {
> > { .compatible = "qcom,adreno" },
> > { .compatible = "qcom,mdp4" },
> > { .compatible = "qcom,mdss" },
> 
> Thanks. Joerg -- can you pick this one up, please? I don't have any other
> SMMU fixes pending at the moment.
> 
> Cheers,
> 
> Will

Quick ping to pick this up for 5.8 fixes.

Thanks,
Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] pci: Add pci device even if the driver failed to attach

2020-06-26 Thread Rajat Jain via iommu
On Fri, Jun 26, 2020 at 8:39 AM Bjorn Helgaas  wrote:
>
> Nit: when you update these patches, can you run "git log --oneline
> drivers/pci/bus.c" and make your subject lines match the convention?

Sorry, will do.

> E.g.,
>
>   PCI: Add device even if driver attach failed
>
> On Thu, Jun 25, 2020 at 05:27:09PM -0700, Rajat Jain wrote:
> > device_attach() returning failure indicates a driver error
> > while trying to probe the device. In such a scenario, the PCI
> > device should still be added in the system and be visible to
> > the user.
>
> Nit: please wrap logs to fill 75 characters.  "git log" adds 4 spaces
> at the beginning, so 75+4 still fits nicely in 80 columns without
> wrapping.

Sorry, will do.

>
> > This patch partially reverts:
> > commit ab1a187bba5c ("PCI: Check device_attach() return value always")
> >
> > Signed-off-by: Rajat Jain 
> > ---
> >  drivers/pci/bus.c | 6 +-
> >  1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> > index 8e40b3e6da77d..3cef835b375fd 100644
> > --- a/drivers/pci/bus.c
> > +++ b/drivers/pci/bus.c
> > @@ -322,12 +322,8 @@ void pci_bus_add_device(struct pci_dev *dev)
> >
> >   dev->match_driver = true;
> >   retval = device_attach(&dev->dev);
> > - if (retval < 0 && retval != -EPROBE_DEFER) {
> > + if (retval < 0 && retval != -EPROBE_DEFER)
> >   pci_warn(dev, "device attach failed (%d)\n", retval);
> > - pci_proc_detach_device(dev);
> > - pci_remove_sysfs_dev_files(dev);
>
> Thanks for catching my bug!
>
> > - return;
> > - }
> >
> >   pci_dev_assign_added(dev, true);
> >  }
> > --
> > 2.27.0.212.ge8ba1cc988-goog
> >
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] pci: Add parameter to disable attaching untrusted devices

2020-06-26 Thread Rajat Jain via iommu
Hello,

Thanks for taking a look.

On Fri, Jun 26, 2020 at 7:18 AM Greg Kroah-Hartman
 wrote:
>
> On Thu, Jun 25, 2020 at 05:27:10PM -0700, Rajat Jain wrote:
> > Introduce a PCI parameter that disables the automatic attachment of
> > untrusted devices to their drivers.
> >
> > Signed-off-by: Rajat Jain 
> > ---
> > Context:
> >
> >   I set out to implement the approach outlined in
> > https://lkml.org/lkml/2020/6/9/1331
> > https://lkml.org/lkml/2020/6/15/1453
> >
> >   But to my surprise, I found that the new hotplugged PCI devices
> >   were getting automatically attached to drivers even though
> >   /sys/bus/pci/drivers_autoprobe was set to 0.
> >
> >   I realized that the device core's "drivers_autoprobe":
> >
> >   * only disables the *initial* probe of the device (i.e. from
> > device_add()). If a subsystem calls device_attach() explicitly
> > for its devices like PCI subsystem does, the drivers_autoprobe
> > setting does not matter. The core will attach device to the driver.
> > This looks like correct semantic behavior to me because PCI is
> > explicitly calling device_attach(), which is a way to explicitly
> > ask the core to find and attach a driver for a device.
> >
> >   * "drivers_autoprobe" cannot be controlled at boot time (to restrict
> > any drivers before userspace comes up).
> >
> >   The options I considered were:
> >
> >   1) Change device_attach() so that it takes into consideration the
> >  drivers_autoprobe property. Not sure if this is semantically correct
> >  thing to do though. If I do this, then the only way a driver can
> >  be attached to the drivers would be via userspace
> >  (/sys/bus/pci/drivers/bind) (Good for our use case though!).
>
> This is the correct thing to do here, haven't I been asking you do move
> this logic into the driver core so that all busses can use it?

(please see below)

>
> >   2) Make the drivers_autoprobe property available to PCI to use
> >  (currently it is private to device core). The PCI could use this
> >  to determine whether or not to call device_attach(). This still
> >  leaves the other problem (of not being able to set
> >  drivers_autoprobe via command line open).
>
> Ick, command lines are horrible, don't do that if at all possible.  On
> some systems they are not able to be changed which can be good or bad...

(please see below)

>
> >   3) I found the pci_dev->match_driver, which seemed similar to what I
> >  am trying to do, but can't be controlled from userspace. I considered
> >  populating that field based on drivers_autoprobe (still need (2)).
> >  But the problem is that there is the AMD IOMMU driver which is setting
> >  this independently, so setting the match_driver based on
> >  drivers_autoprobe may not be a good idea. May be we can populate it
> >  for untrusted devicesi, based on the parameter that I'm introducing?
> >
> >   4) This patch was my option 4 that helps fix both the problems for me.
>
> I suggest putting some of the above text in the changelog, as it has a
> lot of good context, while your existing changelog is pretty sparse and
> does not explain anything...

Will do.

>
>
> >
> >  drivers/pci/bus.c | 11 ---
> >  drivers/pci/pci.c |  9 +
> >  drivers/pci/pci.h |  1 +
> >  3 files changed, 18 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> > index 3cef835b375fd..336aeeb4c4ebf 100644
> > --- a/drivers/pci/bus.c
> > +++ b/drivers/pci/bus.c
> > @@ -321,9 +321,14 @@ void pci_bus_add_device(struct pci_dev *dev)
> >   pci_bridge_d3_update(dev);
> >
> >   dev->match_driver = true;
> > - retval = device_attach(&dev->dev);
> > - if (retval < 0 && retval != -EPROBE_DEFER)
> > - pci_warn(dev, "device attach failed (%d)\n", retval);
> > +
> > + if (dev->untrusted && pci_dont_attach_untrusted_devs) {
> > + pci_info(dev, "not attaching untrusted device\n");
> > + } else {
> > + retval = device_attach(&dev->dev);
> > + if (retval < 0 && retval != -EPROBE_DEFER)
> > + pci_warn(dev, "device attach failed (%d)\n", retval);
> > + }
> >
> >   pci_dev_assign_added(dev, true);
> >  }
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index ce096272f52b1..dec1f9ef27d71 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -127,6 +127,13 @@ static bool pcie_ats_disabled;
> >  /* If set, the PCI config space of each device is printed during boot. */
> >  bool pci_early_dump;
> >
> > +/*
> > + * If set, the devices with "untrusted" flag shall not be attached 
> > automatically
> > + * Userspace will need to attach them manually:
> > + * echo   > /sys/bus/pci/drivers//bind
> > + */
> > +bool pci_dont_attach_untrusted_devs;
> > +
> >  bool pci_ats_disabled(void)
> >  {
> >   return pcie_ats_disabled;
> > @@ -6522,6 +6529,8 @@ static int __init pci_setup(char *str

Re: [PATCH v4 12/12] x86/traps: Fix up invalid PASID

2020-06-26 Thread Fenghua Yu
Hi, Dave,

On Fri, Jun 26, 2020 at 11:23:12AM -0700, Dave Hansen wrote:
> On 6/26/20 11:10 AM, Luck, Tony wrote:
> > On Fri, Jun 26, 2020 at 11:44:50AM +0200, Peter Zijlstra wrote:
> >> On Thu, Jun 25, 2020 at 01:17:22PM -0700, Fenghua Yu wrote:
> >>
> >>> +static bool fixup_pasid_exception(void)
> >>> +{
> >>> + if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> >>> + return false;
> >>> + if (!static_cpu_has(X86_FEATURE_ENQCMD))
> >>> + return false;
> >>
> >> elsewhere you had another variation:
> >>
> >> +   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> >> +   return;
> >> +
> >> +   if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
> >> +   return;
> >>
> >> Which is it, and why do we need the CONFIG thing when combined with the
> >> enabled thing?
> > 
> > Do we have a standard way of coding for a feature that depends on multiple
> > other features?  For this case the system needs to both suport the ENQCMD
> > instruction, and also have kernel code that programs the IOMMU.
> 
> Not really a standard one.
> 
> We could setup_clear_cpu_cap(X86_FEATURE_ENQCMD) during boot if we see
> that CONFIG_INTEL_IOMMU_SVM=n or if we don't have a detected IOMMU.
> That would at least get static value patching which would make some of
> the feature checks very cheap.
> 
> That means we can't use ENQCMD at all in the kernel, though.  Is that an
> issue?  Is the CPU feature truly dependent on IOMMU_SVM?

ENQCMD instruction needs bind_mm()/unbind_mm() defined in svm.c
which is only compiled by:
obj-$(CONFIG_INTEL_IOMMU_SVM) += intel/svm.o

So I think ENQCMD instruction is truly dependent on CONFIG_INTEL_IOMMU_SVM.

> 
> > And/or guidance on when to use each of the very somewhat simlar looking
> > boot_cpu_has()
> > static_cpu_has()
> > IS_ENABLED()
> > cpu_feature_enabled(X86_FEATURE_ENQCMD)
> > options?
> 
> cpu_feature_enabled() is by go-to for checking X86_FEATUREs.  It has
> compile-time checking along with static checking.
> 
> If you use cpu_feature_enabled(), and we added:
> 
> #ifdef CONFIG_INTEL_IOMMU_SVM
> # define DISABLE_ENQCMD 0
> #else
> # define DISABLE_ENQCMD   (1 << (X86_FEATURE_ENQCMD & ))
> #endif
> 
> to arch/x86/include/asm/disabled-features.h, then we could check only
> X86_FEATURE_ENQCMD, and we'd get that IS_ENABLED() check for "free".

This makes code simpler and cleaner.

Thanks.

-Fenghua
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 12/12] x86/traps: Fix up invalid PASID

2020-06-26 Thread Dave Hansen
On 6/26/20 11:10 AM, Luck, Tony wrote:
> On Fri, Jun 26, 2020 at 11:44:50AM +0200, Peter Zijlstra wrote:
>> On Thu, Jun 25, 2020 at 01:17:22PM -0700, Fenghua Yu wrote:
>>
>>> +static bool fixup_pasid_exception(void)
>>> +{
>>> +   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
>>> +   return false;
>>> +   if (!static_cpu_has(X86_FEATURE_ENQCMD))
>>> +   return false;
>>
>> elsewhere you had another variation:
>>
>> +   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
>> +   return;
>> +
>> +   if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
>> +   return;
>>
>> Which is it, and why do we need the CONFIG thing when combined with the
>> enabled thing?
> 
> Do we have a standard way of coding for a feature that depends on multiple
> other features?  For this case the system needs to both suport the ENQCMD
> instruction, and also have kernel code that programs the IOMMU.

Not really a standard one.

We could setup_clear_cpu_cap(X86_FEATURE_ENQCMD) during boot if we see
that CONFIG_INTEL_IOMMU_SVM=n or if we don't have a detected IOMMU.
That would at least get static value patching which would make some of
the feature checks very cheap.

That means we can't use ENQCMD at all in the kernel, though.  Is that an
issue?  Is the CPU feature truly dependent on IOMMU_SVM?

> And/or guidance on when to use each of the very somewhat simlar looking
>   boot_cpu_has()
>   static_cpu_has()
>   IS_ENABLED()
>   cpu_feature_enabled(X86_FEATURE_ENQCMD)
> options?

cpu_feature_enabled() is by go-to for checking X86_FEATUREs.  It has
compile-time checking along with static checking.

If you use cpu_feature_enabled(), and we added:

#ifdef CONFIG_INTEL_IOMMU_SVM
# define DISABLE_ENQCMD 0
#else
# define DISABLE_ENQCMD   (1 << (X86_FEATURE_ENQCMD & ))
#endif

to arch/x86/include/asm/disabled-features.h, then we could check only
X86_FEATURE_ENQCMD, and we'd get that IS_ENABLED() check for "free".
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 12/12] x86/traps: Fix up invalid PASID

2020-06-26 Thread Fenghua Yu
Hi, Peter,

On Fri, Jun 26, 2020 at 11:44:50AM +0200, Peter Zijlstra wrote:
> On Thu, Jun 25, 2020 at 01:17:22PM -0700, Fenghua Yu wrote:
> 
> > +static bool fixup_pasid_exception(void)
> > +{
> > +   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> > +   return false;
> > +   if (!static_cpu_has(X86_FEATURE_ENQCMD))
> > +   return false;
> 
> elsewhere you had another variation:
> 
> +   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> +   return;
> +
> +   if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
> +   return;
> 
> Which is it, and why do we need the CONFIG thing when combined with the
> enabled thing?
> 

I will use the second one with cpu_feature_enabled() for both cases.

The CONFIG thing is for compilation time optimization when
CONFIG_INTEL_IOMMU_SVM is not set.

If CONFIG_INTEL_IOMMU_SVM is not set, IS_ENABLED(CONFIG_INTEL_IOMMU_SVM) 
is "false" during compilation time. Then GCC will optimize 
fixup_pasid_execption() to empty and will not define
__fixup_pasid_exception() at all because no one calls it.

If CONFIG_INTEL_IOMMU_SVM is set, IS_ENABLED(...) is always true.
Depending on cpu_feature_enabled(X86_FEATURE_ENQCMD), __fixup_pasid_execption()
will be called or not during run time.

Does it make sense?

Do you want me to define a helper enqcmd_enabled()?

static inline bool enqcmd_enabled(void)
{
   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
   return false;
   if (!static_cpu_has(X86_FEATURE_ENQCMD))
   return false;
   return true;
}

Then both fixup_pasid_execption() and free_pasid() can call it.

static bool fixup_pasid_exception(void)
{
if (!enqcmd_enabled())
return false;

return __fixup_pasid_exception();
}

statis inline void free_pasid(struct m_struct *mm)
{
if (!enqcmd_enabled())
return;

__free_pasid(mm);
}

Please advice.

-Fenghua
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 12/12] x86/traps: Fix up invalid PASID

2020-06-26 Thread Borislav Petkov
On Fri, Jun 26, 2020 at 11:10:00AM -0700, Luck, Tony wrote:
> Do we have a standard way of coding for a feature that depends on multiple
> other features?  For this case the system needs to both suport the ENQCMD
> instruction, and also have kernel code that programs the IOMMU.

Yes, you need both. Consider distros enabling everything so that they
can ship a single kernel image.

> And/or guidance on when to use each of the very somewhat simlar looking
>   boot_cpu_has()
>   static_cpu_has()

See the comment over _static_cpu_has() in arch/x86/include/asm/cpufeature.h

In the exception path you should use the static_ variant.

>   IS_ENABLED()

This is testing CONFIG_ settings, i.e., build time.

>   cpu_feature_enabled()

This too, indirectly. See

arch/x86/include/asm/disabled-features.h

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 12/12] x86/traps: Fix up invalid PASID

2020-06-26 Thread Luck, Tony
On Fri, Jun 26, 2020 at 11:44:50AM +0200, Peter Zijlstra wrote:
> On Thu, Jun 25, 2020 at 01:17:22PM -0700, Fenghua Yu wrote:
> 
> > +static bool fixup_pasid_exception(void)
> > +{
> > +   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> > +   return false;
> > +   if (!static_cpu_has(X86_FEATURE_ENQCMD))
> > +   return false;
> 
> elsewhere you had another variation:
> 
> +   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> +   return;
> +
> +   if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
> +   return;
> 
> Which is it, and why do we need the CONFIG thing when combined with the
> enabled thing?

Do we have a standard way of coding for a feature that depends on multiple
other features?  For this case the system needs to both suport the ENQCMD
instruction, and also have kernel code that programs the IOMMU.

And/or guidance on when to use each of the very somewhat simlar looking
boot_cpu_has()
static_cpu_has()
IS_ENABLED()
cpu_feature_enabled()
options?

-Tony
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 00/15] Documentation fixes

2020-06-26 Thread Jonathan Corbet
On Tue, 23 Jun 2020 09:08:56 +0200
Mauro Carvalho Chehab  wrote:

> As requested, this is a rebase of a previous series posted on Jan, 15.
> 
> Since then, several patches got merged via other trees or became
> obsolete. There were also 2 patches before that fits better at the
> ReST conversion patchset. So, I'll be sending it on another patch
> series together with the remaining ReST conversions.
> 
> I also added reviews/acks received.
> 
> So, the series reduced from 29 to 15 patches.
> 
> Let's hope b4 would be able to properly handle this one.

Nope.  I don't know what it is about your patch series, but b4 is never
able to put them together.

I've applied the series except for #1, which already went through the -mm
tree.

Thanks,

jon
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 02/14] iommu: Report domain nesting info

2020-06-26 Thread Robin Murphy

On 2020-06-26 08:47, Jean-Philippe Brucker wrote:

On Wed, Jun 24, 2020 at 01:55:15AM -0700, Liu Yi L wrote:

IOMMUs that support nesting translation needs report the capability info
to userspace, e.g. the format of first level/stage paging structures.

This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
nesting info after setting DOMAIN_ATTR_NESTING.

v2 -> v3:
*) remvoe cap/ecap_mask in iommu_nesting_info.
*) reuse DOMAIN_ATTR_NESTING to get nesting info.
*) return an empty iommu_nesting_info for SMMU drivers per Jean'
suggestion.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
  drivers/iommu/arm-smmu-v3.c | 29 --
  drivers/iommu/arm-smmu.c| 29 --


Looks reasonable to me. Please move the SMMU changes to a separate patch
and Cc the SMMU maintainers:


Cheers Jean, I'll admit I've been skipping over a lot of these patches 
lately :)


A couple of comments below...



Cc: Will Deacon 
Cc: Robin Murphy 

Thanks,
Jean


  include/uapi/linux/iommu.h  | 59 +
  3 files changed, 113 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index f578677..0c45d4d 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -3019,6 +3019,32 @@ static struct iommu_group *arm_smmu_device_group(struct 
device *dev)
return group;
  }
  
+static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,

+   void *data)
+{
+   struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
+   u32 size;
+
+   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+   return -ENODEV;
+
+   size = sizeof(struct iommu_nesting_info);
+
+   /*
+* if provided buffer size is not equal to the size, should
+* return 0 and also the expected buffer size to caller.
+*/
+   if (info->size != size) {
+   info->size = size;
+   return 0;
+   }
+
+   /* report an empty iommu_nesting_info for now */
+   memset(info, 0x0, size);
+   info->size = size;
+   return 0;
+}
+
  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
  {
@@ -3028,8 +3054,7 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case IOMMU_DOMAIN_UNMANAGED:
switch (attr) {
case DOMAIN_ATTR_NESTING:
-   *(int *)data = (smmu_domain->stage == 
ARM_SMMU_DOMAIN_NESTED);
-   return 0;
+   return arm_smmu_domain_nesting_info(smmu_domain, data);
default:
return -ENODEV;
}
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 243bc4c..908607d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1506,6 +1506,32 @@ static struct iommu_group *arm_smmu_device_group(struct 
device *dev)
return group;
  }
  
+static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,

+   void *data)
+{
+   struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
+   u32 size;
+
+   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+   return -ENODEV;
+
+   size = sizeof(struct iommu_nesting_info);
+
+   /*
+* if provided buffer size is not equal to the size, should
+* return 0 and also the expected buffer size to caller.
+*/
+   if (info->size != size) {
+   info->size = size;
+   return 0;
+   }
+
+   /* report an empty iommu_nesting_info for now */
+   memset(info, 0x0, size);
+   info->size = size;
+   return 0;
+}
+
  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
  {
@@ -1515,8 +1541,7 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case IOMMU_DOMAIN_UNMANAGED:
switch (attr) {
case DOMAIN_ATTR_NESTING:
-   *(int *)data = (smmu_domain->stage == 
ARM_SMMU_DOMAIN_NESTED);
-   return 0;
+   return arm_smmu_domain_nesting_info(smmu_domain, data);
default:
return -ENODEV;
}
diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 1afc661..898c99a 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -332,4 +332,63 @@ struct iommu_gpasid_bind_data {
} vendor;
  };
  
+/*

+ * struct iommu_nesting_info - Information for nesting-capable IOMMU.
+ * 

Re: [PATCH 1/2] pci: Add pci device even if the driver failed to attach

2020-06-26 Thread Bjorn Helgaas
Nit: when you update these patches, can you run "git log --oneline
drivers/pci/bus.c" and make your subject lines match the convention?
E.g.,

  PCI: Add device even if driver attach failed

On Thu, Jun 25, 2020 at 05:27:09PM -0700, Rajat Jain wrote:
> device_attach() returning failure indicates a driver error
> while trying to probe the device. In such a scenario, the PCI
> device should still be added in the system and be visible to
> the user.

Nit: please wrap logs to fill 75 characters.  "git log" adds 4 spaces
at the beginning, so 75+4 still fits nicely in 80 columns without
wrapping.

> This patch partially reverts:
> commit ab1a187bba5c ("PCI: Check device_attach() return value always")
> 
> Signed-off-by: Rajat Jain 
> ---
>  drivers/pci/bus.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 8e40b3e6da77d..3cef835b375fd 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -322,12 +322,8 @@ void pci_bus_add_device(struct pci_dev *dev)
>  
>   dev->match_driver = true;
>   retval = device_attach(&dev->dev);
> - if (retval < 0 && retval != -EPROBE_DEFER) {
> + if (retval < 0 && retval != -EPROBE_DEFER)
>   pci_warn(dev, "device attach failed (%d)\n", retval);
> - pci_proc_detach_device(dev);
> - pci_remove_sysfs_dev_files(dev);

Thanks for catching my bug!

> - return;
> - }
>  
>   pci_dev_assign_added(dev, true);
>  }
> -- 
> 2.27.0.212.ge8ba1cc988-goog
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] xen: introduce xen_vring_use_dma

2020-06-26 Thread Michael S. Tsirkin
On Thu, Jun 25, 2020 at 10:31:27AM -0700, Stefano Stabellini wrote:
> On Wed, 24 Jun 2020, Michael S. Tsirkin wrote:
> > On Wed, Jun 24, 2020 at 02:53:54PM -0700, Stefano Stabellini wrote:
> > > On Wed, 24 Jun 2020, Michael S. Tsirkin wrote:
> > > > On Wed, Jun 24, 2020 at 10:59:47AM -0700, Stefano Stabellini wrote:
> > > > > On Wed, 24 Jun 2020, Michael S. Tsirkin wrote:
> > > > > > On Wed, Jun 24, 2020 at 05:17:32PM +0800, Peng Fan wrote:
> > > > > > > Export xen_swiotlb for all platforms using xen swiotlb
> > > > > > > 
> > > > > > > Use xen_swiotlb to determine when vring should use dma APIs to 
> > > > > > > map the
> > > > > > > ring: when xen_swiotlb is enabled the dma API is required. When 
> > > > > > > it is
> > > > > > > disabled, it is not required.
> > > > > > > 
> > > > > > > Signed-off-by: Peng Fan 
> > > > > > 
> > > > > > Isn't there some way to use VIRTIO_F_IOMMU_PLATFORM for this?
> > > > > > Xen was there first, but everyone else is using that now.
> > > > > 
> > > > > Unfortunately it is complicated and it is not related to
> > > > > VIRTIO_F_IOMMU_PLATFORM :-(
> > > > > 
> > > > > 
> > > > > The Xen subsystem in Linux uses dma_ops via swiotlb_xen to translate
> > > > > foreign mappings (memory coming from other VMs) to physical addresses.
> > > > > On x86, it also uses dma_ops to translate Linux's idea of a physical
> > > > > address into a real physical address (this is unneeded on ARM.)
> > > > > 
> > > > > 
> > > > > So regardless of VIRTIO_F_IOMMU_PLATFORM, dma_ops should be used on 
> > > > > Xen/x86
> > > > > always and on Xen/ARM if Linux is Dom0 (because it has foreign
> > > > > mappings.) That is why we have the if (xen_domain) return true; in
> > > > > vring_use_dma_api.
> > > > 
> > > > VIRTIO_F_IOMMU_PLATFORM makes guest always use DMA ops.
> > > > 
> > > > Xen hack predates VIRTIO_F_IOMMU_PLATFORM so it *also*
> > > > forces DMA ops even if VIRTIO_F_IOMMU_PLATFORM is clear.
> > > >
> > > > Unfortunately as a result Xen never got around to
> > > > properly setting VIRTIO_F_IOMMU_PLATFORM.
> > > 
> > > I don't think VIRTIO_F_IOMMU_PLATFORM would be correct for this because
> > > the usage of swiotlb_xen is not a property of virtio,
> > 
> > 
> > Basically any device without VIRTIO_F_ACCESS_PLATFORM
> > (that is it's name in latest virtio spec, VIRTIO_F_IOMMU_PLATFORM is
> > what linux calls it) is declared as "special, don't follow normal rules
> > for access".
> > 
> > So yes swiotlb_xen is not a property of virtio, but what *is* a property
> > of virtio is that it's not special, just a regular device from DMA POV.
> 
> I am trying to understand what you meant but I think I am missing
> something.
> 
> Are you saying that modern virtio should always have
> VIRTIO_F_ACCESS_PLATFORM, hence use normal dma_ops as any other devices?

I am saying it's a safe default. Clear VIRTIO_F_ACCESS_PLATFORM if you
have some special needs e.g. you are very sure it's ok to bypass DMA
ops, or you need to support a legacy guest (produced in the window
between virtio 1 support in 2014 and support for
VIRTIO_F_ACCESS_PLATFORM in 2016).


> If that is the case, how is it possible that virtio breaks on ARM using
> the default dma_ops? The breakage is not Xen related (except that Xen
> turns dma_ops on). The original message from Peng was:
> 
>   vring_map_one_sg -> vring_use_dma_api
>-> dma_map_page
>  -> __swiotlb_map_page
>   ->swiotlb_map_page
>   ->__dma_map_area(phys_to_virt(dma_to_phys(dev, 
> dev_addr)), size, dir);
>   However we are using per device dma area for rpmsg, phys_to_virt
>   could not return a correct virtual address for virtual address in
>   vmalloc area. Then kernel panic.
> 
> I must be missing something. Maybe it is because it has to do with RPMesg?

I think it's an RPMesg bug, yes.

> 
> > > > > You might have noticed that I missed one possible case above: Xen/ARM
> > > > > DomU :-)
> > > > > 
> > > > > Xen/ARM domUs don't need swiotlb_xen, it is not even initialized. So 
> > > > > if
> > > > > (xen_domain) return true; would give the wrong answer in that case.
> > > > > Linux would end up calling the "normal" dma_ops, not swiotlb-xen, and
> > > > > the "normal" dma_ops fail.
> > > > > 
> > > > > 
> > > > > The solution I suggested was to make the check in vring_use_dma_api 
> > > > > more
> > > > > flexible by returning true if the swiotlb_xen is supposed to be used,
> > > > > not in general for all Xen domains, because that is what the check was
> > > > > really meant to do.
> > > > 
> > > > Why not fix DMA ops so they DTRT (nop) on Xen/ARM DomU? What is wrong 
> > > > with that?
> > > 
> > > swiotlb-xen is not used on Xen/ARM DomU, the default dma_ops are the
> > > ones that are used. So you are saying, why don't we fix the default
> > > dma_ops to work with virtio?
> > > 
> > > It is bad that the default dma_ops crash with virtio, so yes I think it
> > > would be good to f

[git pull] IOMMU Fixes for Linux v5.8-rc2

2020-06-26 Thread Joerg Roedel
Hi Linus,

The following changes since commit 48778464bb7d346b47157d21ffde2af6b2d39110:

  Linux 5.8-rc2 (2020-06-21 15:45:29 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
tags/iommu-fixes-v5.8-rc2

for you to fetch changes up to 48f0bcfb7aad2c6eb4c1e66476b58475aa14393e:

  iommu/vt-d: Fix misuse of iommu_domain_identity_map() (2020-06-23 10:08:32 
+0200)


IOMMU Fixes for Linux v5.8-rc2:

A couple of Intel VT-d fixes:

- Make Intel SVM code 64bit only. The code uses pgd_t* and the
  IOMMU only supports long-mode page-table formats, so its
  broken on 32bit anyway.

- Make sure GFX quirks in for Intel VT-d are not applied to
  untrusted devices. Those devices might gain full memory access
  otherwise.

- Identity mapping setup fix.

- Fix ACS enabling when Intel IOMMU is off and untrusted devices
  are detected.

- Two smaller fixes for coherency and IO page-table setup


Lu Baolu (5):
  iommu/vt-d: Make Intel SVM code 64-bit only
  iommu/vt-d: Set U/S bit in first level page table by default
  iommu/vt-d: Enable PCI ACS for platform opt in hint
  iommu/vt-d: Update scalable mode paging structure coherency
  iommu/vt-d: Fix misuse of iommu_domain_identity_map()

Rajat Jain (1):
  iommu/vt-d: Don't apply gfx quirks to untrusted devices

 drivers/iommu/Kconfig   |  2 +-
 drivers/iommu/intel/dmar.c  |  3 ++-
 drivers/iommu/intel/iommu.c | 59 +++--
 include/linux/intel-iommu.h |  1 +
 4 files changed, 56 insertions(+), 9 deletions(-)

Please pull.

Thanks,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] pci: Add parameter to disable attaching untrusted devices

2020-06-26 Thread Greg Kroah-Hartman
On Thu, Jun 25, 2020 at 05:27:10PM -0700, Rajat Jain wrote:
> Introduce a PCI parameter that disables the automatic attachment of
> untrusted devices to their drivers.
> 
> Signed-off-by: Rajat Jain 
> ---
> Context:
> 
>   I set out to implement the approach outlined in
> https://lkml.org/lkml/2020/6/9/1331
> https://lkml.org/lkml/2020/6/15/1453
> 
>   But to my surprise, I found that the new hotplugged PCI devices
>   were getting automatically attached to drivers even though
>   /sys/bus/pci/drivers_autoprobe was set to 0.
> 
>   I realized that the device core's "drivers_autoprobe":
> 
>   * only disables the *initial* probe of the device (i.e. from
> device_add()). If a subsystem calls device_attach() explicitly
> for its devices like PCI subsystem does, the drivers_autoprobe
> setting does not matter. The core will attach device to the driver.
> This looks like correct semantic behavior to me because PCI is
> explicitly calling device_attach(), which is a way to explicitly
> ask the core to find and attach a driver for a device.
> 
>   * "drivers_autoprobe" cannot be controlled at boot time (to restrict
> any drivers before userspace comes up).
> 
>   The options I considered were:
> 
>   1) Change device_attach() so that it takes into consideration the
>  drivers_autoprobe property. Not sure if this is semantically correct
>  thing to do though. If I do this, then the only way a driver can
>  be attached to the drivers would be via userspace
>  (/sys/bus/pci/drivers/bind) (Good for our use case though!).

This is the correct thing to do here, haven't I been asking you do move
this logic into the driver core so that all busses can use it?

>   2) Make the drivers_autoprobe property available to PCI to use
>  (currently it is private to device core). The PCI could use this
>  to determine whether or not to call device_attach(). This still
>  leaves the other problem (of not being able to set
>  drivers_autoprobe via command line open).

Ick, command lines are horrible, don't do that if at all possible.  On
some systems they are not able to be changed which can be good or bad...

>   3) I found the pci_dev->match_driver, which seemed similar to what I
>  am trying to do, but can't be controlled from userspace. I considered
>  populating that field based on drivers_autoprobe (still need (2)).
>  But the problem is that there is the AMD IOMMU driver which is setting
>  this independently, so setting the match_driver based on
>  drivers_autoprobe may not be a good idea. May be we can populate it
>  for untrusted devicesi, based on the parameter that I'm introducing?
> 
>   4) This patch was my option 4 that helps fix both the problems for me.

I suggest putting some of the above text in the changelog, as it has a
lot of good context, while your existing changelog is pretty sparse and
does not explain anything...


> 
>  drivers/pci/bus.c | 11 ---
>  drivers/pci/pci.c |  9 +
>  drivers/pci/pci.h |  1 +
>  3 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 3cef835b375fd..336aeeb4c4ebf 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -321,9 +321,14 @@ void pci_bus_add_device(struct pci_dev *dev)
>   pci_bridge_d3_update(dev);
>  
>   dev->match_driver = true;
> - retval = device_attach(&dev->dev);
> - if (retval < 0 && retval != -EPROBE_DEFER)
> - pci_warn(dev, "device attach failed (%d)\n", retval);
> +
> + if (dev->untrusted && pci_dont_attach_untrusted_devs) {
> + pci_info(dev, "not attaching untrusted device\n");
> + } else {
> + retval = device_attach(&dev->dev);
> + if (retval < 0 && retval != -EPROBE_DEFER)
> + pci_warn(dev, "device attach failed (%d)\n", retval);
> + }
>  
>   pci_dev_assign_added(dev, true);
>  }
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ce096272f52b1..dec1f9ef27d71 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -127,6 +127,13 @@ static bool pcie_ats_disabled;
>  /* If set, the PCI config space of each device is printed during boot. */
>  bool pci_early_dump;
>  
> +/*
> + * If set, the devices with "untrusted" flag shall not be attached 
> automatically
> + * Userspace will need to attach them manually:
> + * echo   > /sys/bus/pci/drivers//bind
> + */
> +bool pci_dont_attach_untrusted_devs;
> +
>  bool pci_ats_disabled(void)
>  {
>   return pcie_ats_disabled;
> @@ -6522,6 +6529,8 @@ static int __init pci_setup(char *str)
>   pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
>   } else if (!strncmp(str, "disable_acs_redir=", 18)) {
>   disable_acs_redir_param = str + 18;
> + } else if (!strcmp(str, "dont_attach_untrusted_devs")) {
> + pci_dont_attach_u

Re: [PATCH 1/2] pci: Add pci device even if the driver failed to attach

2020-06-26 Thread Greg Kroah-Hartman
On Thu, Jun 25, 2020 at 05:27:09PM -0700, Rajat Jain wrote:
> device_attach() returning failure indicates a driver error
> while trying to probe the device. In such a scenario, the PCI
> device should still be added in the system and be visible to
> the user.
> 
> This patch partially reverts:
> commit ab1a187bba5c ("PCI: Check device_attach() return value always")
> 
> Signed-off-by: Rajat Jain 
> ---
>  drivers/pci/bus.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 8e40b3e6da77d..3cef835b375fd 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -322,12 +322,8 @@ void pci_bus_add_device(struct pci_dev *dev)
>  
>   dev->match_driver = true;
>   retval = device_attach(&dev->dev);
> - if (retval < 0 && retval != -EPROBE_DEFER) {
> + if (retval < 0 && retval != -EPROBE_DEFER)
>   pci_warn(dev, "device attach failed (%d)\n", retval);
> - pci_proc_detach_device(dev);
> - pci_remove_sysfs_dev_files(dev);
> - return;
> - }

Nice catch, sysfs stuff shouldn't be dependant if a driver is bound to a
device or not.

Reviewed-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH net] xsk: remove cheap_dma optimization

2020-06-26 Thread Björn Töpel
From: Björn Töpel 

When the AF_XDP buffer allocation API was introduced it had an
optimization, "cheap_dma". The idea was that when the umem was DMA
mapped, the pool also checked whether the mapping required a
synchronization (CPU to device, and vice versa). If not, it would be
marked as "cheap_dma" and the synchronization would be elided.

In [1] Christoph points out that the optimization above breaks the DMA
API abstraction, and should be removed. Further, Christoph points out
that optimizations like this should be done within the DMA mapping
core, and not elsewhere.

Unfortunately this has implications for the packet rate
performance. The AF_XDP rxdrop scenario shows a 9% decrease in packets
per second.

[1] https://lore.kernel.org/netdev/20200626074725.ga21...@lst.de/

Cc: Christoph Hellwig 
Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API")
Signed-off-by: Björn Töpel 
---
 include/net/xsk_buff_pool.h | 16 +++--
 net/xdp/xsk_buff_pool.c | 69 ++---
 2 files changed, 6 insertions(+), 79 deletions(-)

diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index a4ff226505c9..3ea9b9654632 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -40,7 +40,6 @@ struct xsk_buff_pool {
u32 headroom;
u32 chunk_size;
u32 frame_len;
-   bool cheap_dma;
bool unaligned;
void *addrs;
struct device *dev;
@@ -77,24 +76,17 @@ static inline dma_addr_t xp_get_frame_dma(struct 
xdp_buff_xsk *xskb)
return xskb->frame_dma;
 }
 
-void xp_dma_sync_for_cpu_slow(struct xdp_buff_xsk *xskb);
 static inline void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb)
 {
-   if (xskb->pool->cheap_dma)
-   return;
-
-   xp_dma_sync_for_cpu_slow(xskb);
+   dma_sync_single_range_for_cpu(xskb->pool->dev, xskb->dma, 0,
+ xskb->pool->frame_len, DMA_BIDIRECTIONAL);
 }
 
-void xp_dma_sync_for_device_slow(struct xsk_buff_pool *pool, dma_addr_t dma,
-size_t size);
 static inline void xp_dma_sync_for_device(struct xsk_buff_pool *pool,
  dma_addr_t dma, size_t size)
 {
-   if (pool->cheap_dma)
-   return;
-
-   xp_dma_sync_for_device_slow(pool, dma, size);
+   dma_sync_single_range_for_device(pool->dev, dma, 0,
+size, DMA_BIDIRECTIONAL);
 }
 
 /* Masks for xdp_umem_page flags.
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 540ed75e4482..c330e5f3aadf 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -2,9 +2,6 @@
 
 #include 
 #include 
-#include 
-#include 
-#include 
 
 #include "xsk_queue.h"
 
@@ -55,7 +52,6 @@ struct xsk_buff_pool *xp_create(struct page **pages, u32 
nr_pages, u32 chunks,
pool->free_heads_cnt = chunks;
pool->headroom = headroom;
pool->chunk_size = chunk_size;
-   pool->cheap_dma = true;
pool->unaligned = unaligned;
pool->frame_len = chunk_size - headroom - XDP_PACKET_HEADROOM;
INIT_LIST_HEAD(&pool->free_list);
@@ -125,48 +121,6 @@ static void xp_check_dma_contiguity(struct xsk_buff_pool 
*pool)
}
 }
 
-static bool __maybe_unused xp_check_swiotlb_dma(struct xsk_buff_pool *pool)
-{
-#if defined(CONFIG_SWIOTLB)
-   phys_addr_t paddr;
-   u32 i;
-
-   for (i = 0; i < pool->dma_pages_cnt; i++) {
-   paddr = dma_to_phys(pool->dev, pool->dma_pages[i]);
-   if (is_swiotlb_buffer(paddr))
-   return false;
-   }
-#endif
-   return true;
-}
-
-static bool xp_check_cheap_dma(struct xsk_buff_pool *pool)
-{
-#if defined(CONFIG_HAS_DMA)
-   const struct dma_map_ops *ops = get_dma_ops(pool->dev);
-
-   if (ops) {
-   return !ops->sync_single_for_cpu &&
-   !ops->sync_single_for_device;
-   }
-
-   if (!dma_is_direct(ops))
-   return false;
-
-   if (!xp_check_swiotlb_dma(pool))
-   return false;
-
-   if (!dev_is_dma_coherent(pool->dev)) {
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) ||   \
-   defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) ||\
-   defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE)
-   return false;
-#endif
-   }
-#endif
-   return true;
-}
-
 int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev,
   unsigned long attrs, struct page **pages, u32 nr_pages)
 {
@@ -195,7 +149,6 @@ int xp_dma_map(struct xsk_buff_pool *pool, struct device 
*dev,
xp_check_dma_contiguity(pool);
 
pool->dev = dev;
-   pool->cheap_dma = xp_check_cheap_dma(pool);
return 0;
 }
 EXPORT_SYMBOL(xp_dma_map);
@@ -280,11 +233,8 @@ struct xdp_buff *xp_alloc(struct xsk_buff_pool *pool)
xskb->xdp.data = xskb->xdp.data_hard_start + XDP_PACKET_HEADROOM;
xskb->xdp.data_meta

Re: the XSK buffer pool needs be to reverted

2020-06-26 Thread Björn Töpel

On 2020-06-26 14:41, Christoph Hellwig wrote:

On Fri, Jun 26, 2020 at 02:22:41PM +0200, Björn Töpel wrote:

[...]


Understood. Wdyt about something in the lines of the diff below? It's
build tested only, but removes all non-dma API usage ("poking
internals"). Would that be a way forward, and then as a next step work
on a solution that would give similar benefits, but something that would
live in the dma mapping core?


Yes, that would solve the immediate issues.



Good. I'll cook a proper patch and post it.


Björn
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: the XSK buffer pool needs be to reverted

2020-06-26 Thread Christoph Hellwig
On Fri, Jun 26, 2020 at 02:22:41PM +0200, Björn Töpel wrote:
> Thanks for clarifying that. Let's work on a solution that can reside in
> the dma mapping core.
>
>> The commit seems to have a long dove tail of commits depending on it
>> despite only being a month old, so maybe you can do the revert for now?
>>
>
> Reverting the whole series sounds a bit too much. Let's focus on the
> part that breaks the dma api abstraction. I'm assuming that you're
> referring to the
>
>   static bool xp_check_cheap_dma(struct xsk_buff_pool *pool)
>
> function (and related functions called from that)?

Yes.

>
>> Note that this is somewhat urgent, as various of the APIs that the code
>> is abusing are slated to go away for Linux 5.9, so this addition comes
>> at a really bad time.
>>
>
> Understood. Wdyt about something in the lines of the diff below? It's
> build tested only, but removes all non-dma API usage ("poking
> internals"). Would that be a way forward, and then as a next step work
> on a solution that would give similar benefits, but something that would
> live in the dma mapping core?

Yes, that would solve the immediate issues.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/2] iommu/amd: Don't use atomic64_t for domain->pt_root

2020-06-26 Thread Qian Cai


> On Jun 26, 2020, at 4:05 AM, Joerg Roedel  wrote:
> 
> a previous discussion pointed out that using atomic64_t for that
> purpose is a bit of overkill. This patch-set replaces it with unsigned
> long and introduces some helpers first to make the change more easy.

BTW, from the previous discussion, Linus mentioned,
 
“
The thing is, the 64-bit atomic reads/writes are very expensive on
32-bit x86. If it was just a native pointer, it would be much cheaper
than an "atomic64_t".
“

However, here we have AMD_IOMMU depend on x86_64, so I am wondering if it makes 
any sense to run this code on 32-bit x86 at all?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: the XSK buffer pool needs be to reverted

2020-06-26 Thread Björn Töpel

On 2020-06-26 09:47, Christoph Hellwig wrote:

Hi Björn,

you addition of the xsk_buff_pool.c APIs in commit 2b43470add8c
("xsk: Introduce AF_XDP buffer allocation API") is unfortunately rather
broken by making lots of assumptions and poking into dma-direct and
swiotlb internals that are of no business to outside users and clearly
marked as such.   I'd be glad to work with your doing something proper
for pools, but that needs proper APIs and probably live in the dma
mapping core, but for that you'd actually need to contact the relevant
maintainers before poking into internals.



Christoph,

Thanks for clarifying that. Let's work on a solution that can reside in
the dma mapping core.


The commit seems to have a long dove tail of commits depending on it
despite only being a month old, so maybe you can do the revert for now?



Reverting the whole series sounds a bit too much. Let's focus on the
part that breaks the dma api abstraction. I'm assuming that you're
referring to the

  static bool xp_check_cheap_dma(struct xsk_buff_pool *pool)

function (and related functions called from that)?


Note that this is somewhat urgent, as various of the APIs that the code
is abusing are slated to go away for Linux 5.9, so this addition comes
at a really bad time.



Understood. Wdyt about something in the lines of the diff below? It's
build tested only, but removes all non-dma API usage ("poking
internals"). Would that be a way forward, and then as a next step work
on a solution that would give similar benefits, but something that would
live in the dma mapping core?

diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index a4ff226505c9..003b172ce0d2 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -40,7 +40,6 @@ struct xsk_buff_pool {
u32 headroom;
u32 chunk_size;
u32 frame_len;
-   bool cheap_dma;
bool unaligned;
void *addrs;
struct device *dev;
@@ -80,9 +79,6 @@ static inline dma_addr_t xp_get_frame_dma(struct 
xdp_buff_xsk *xskb)

 void xp_dma_sync_for_cpu_slow(struct xdp_buff_xsk *xskb);
 static inline void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb)
 {
-   if (xskb->pool->cheap_dma)
-   return;
-
xp_dma_sync_for_cpu_slow(xskb);
 }

@@ -91,9 +87,6 @@ void xp_dma_sync_for_device_slow(struct xsk_buff_pool 
*pool, dma_addr_t dma,

 static inline void xp_dma_sync_for_device(struct xsk_buff_pool *pool,
  dma_addr_t dma, size_t size)
 {
-   if (pool->cheap_dma)
-   return;
-
xp_dma_sync_for_device_slow(pool, dma, size);
 }

diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 540ed75e4482..5714f3711381 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -2,9 +2,6 @@

 #include 
 #include 
-#include 
-#include 
-#include 

 #include "xsk_queue.h"

@@ -55,7 +52,6 @@ struct xsk_buff_pool *xp_create(struct page **pages, 
u32 nr_pages, u32 chunks,

pool->free_heads_cnt = chunks;
pool->headroom = headroom;
pool->chunk_size = chunk_size;
-   pool->cheap_dma = true;
pool->unaligned = unaligned;
pool->frame_len = chunk_size - headroom - XDP_PACKET_HEADROOM;
INIT_LIST_HEAD(&pool->free_list);
@@ -125,48 +121,6 @@ static void xp_check_dma_contiguity(struct 
xsk_buff_pool *pool)

}
 }

-static bool __maybe_unused xp_check_swiotlb_dma(struct xsk_buff_pool *pool)
-{
-#if defined(CONFIG_SWIOTLB)
-   phys_addr_t paddr;
-   u32 i;
-
-   for (i = 0; i < pool->dma_pages_cnt; i++) {
-   paddr = dma_to_phys(pool->dev, pool->dma_pages[i]);
-   if (is_swiotlb_buffer(paddr))
-   return false;
-   }
-#endif
-   return true;
-}
-
-static bool xp_check_cheap_dma(struct xsk_buff_pool *pool)
-{
-#if defined(CONFIG_HAS_DMA)
-   const struct dma_map_ops *ops = get_dma_ops(pool->dev);
-
-   if (ops) {
-   return !ops->sync_single_for_cpu &&
-   !ops->sync_single_for_device;
-   }
-
-   if (!dma_is_direct(ops))
-   return false;
-
-   if (!xp_check_swiotlb_dma(pool))
-   return false;
-
-   if (!dev_is_dma_coherent(pool->dev)) {
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) ||   \
-   defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) ||\
-   defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE)
-   return false;
-#endif
-   }
-#endif
-   return true;
-}
-
 int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev,
   unsigned long attrs, struct page **pages, u32 nr_pages)
 {
@@ -195,7 +149,6 @@ int xp_dma_map(struct xsk_buff_pool *pool, struct 
device *dev,

xp_check_dma_contiguity(pool);

pool->dev = dev;
-   pool->cheap_dma = xp_check_cheap_dma(pool);
return 0;
 }
 EXPORT_SYMBOL(xp_dma_map);
@@ -280,11 +233,9 @@ struct xdp_buff *xp_alloc(str

RE: [PATCH v2 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-06-26 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org
> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Robin Murphy
> Sent: Thursday, June 25, 2020 11:11 PM
> To: Song Bao Hua (Barry Song) ; h...@lst.de;
> m.szyprow...@samsung.com; w...@kernel.org;
> ganapatrao.kulka...@cavium.com; catalin.mari...@arm.com
> Cc: iommu@lists.linux-foundation.org; Linuxarm ;
> linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org; Jonathan
> Cameron ; Nicolas Saenz Julienne
> ; Steve Capper ; Andrew
> Morton ; Mike Rapoport 
> Subject: Re: [PATCH v2 1/2] dma-direct: provide the ability to reserve
> per-numa CMA
> 
> On 2020-06-25 08:43, Barry Song wrote:
> > This is useful for at least two scenarios:
> > 1. ARM64 smmu will get memory from local numa node, it can save its
> > command queues and page tables locally. Tests show it can decrease
> > dma_unmap latency at lot. For example, without this patch, smmu on
> > node2 will get memory from node0 by calling dma_alloc_coherent(),
> > typically, it has to wait for more than 560ns for the completion of
> > CMD_SYNC in an empty command queue; with this patch, it needs 240ns
> > only.
> > 2. when we set iommu passthrough, drivers will get memory from CMA,
> > local memory means much less latency.
> >
> > Cc: Jonathan Cameron 
> > Cc: Christoph Hellwig 
> > Cc: Marek Szyprowski 
> > Cc: Will Deacon 
> > Cc: Robin Murphy 
> > Cc: Ganapatrao Kulkarni 
> > Cc: Catalin Marinas 
> > Cc: Nicolas Saenz Julienne 
> > Cc: Steve Capper 
> > Cc: Andrew Morton 
> > Cc: Mike Rapoport 
> > Signed-off-by: Barry Song 
> > ---
> >   include/linux/dma-contiguous.h |  4 ++
> >   kernel/dma/Kconfig | 10 
> >   kernel/dma/contiguous.c| 99
> ++
> >   3 files changed, 104 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/dma-contiguous.h
> b/include/linux/dma-contiguous.h
> > index 03f8e98e3bcc..278a80a40456 100644
> > --- a/include/linux/dma-contiguous.h
> > +++ b/include/linux/dma-contiguous.h
> > @@ -79,6 +79,8 @@ static inline void dma_contiguous_set_default(struct
> cma *cma)
> >
> >   void dma_contiguous_reserve(phys_addr_t addr_limit);
> >
> > +void dma_pernuma_cma_reserve(void);
> > +
> >   int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t
> base,
> >phys_addr_t limit, struct cma **res_cma,
> >bool fixed);
> > @@ -128,6 +130,8 @@ static inline void dma_contiguous_set_default(struct
> cma *cma) { }
> >
> >   static inline void dma_contiguous_reserve(phys_addr_t limit) { }
> >
> > +static inline void dma_pernuma_cma_reserve(void) { }
> > +
> >   static inline int dma_contiguous_reserve_area(phys_addr_t size,
> phys_addr_t base,
> >phys_addr_t limit, struct cma **res_cma,
> >bool fixed)
> > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> > index d006668c0027..aeb976b1d21c 100644
> > --- a/kernel/dma/Kconfig
> > +++ b/kernel/dma/Kconfig
> > @@ -104,6 +104,16 @@ config DMA_CMA
> >   if  DMA_CMA
> >   comment "Default contiguous memory area size:"
> >
> > +config CMA_PERNUMA_SIZE_MBYTES
> > +   int "Size in Mega Bytes for per-numa CMA areas"
> > +   depends on NUMA
> > +   default 16 if ARM64
> > +   default 0
> > +   help
> > + Defines the size (in MiB) of the per-numa memory area for Contiguous
> > + Memory Allocator. Every numa node will get a separate CMA with this
> > + size. If the size of 0 is selected, per-numa CMA is disabled.
> > +
> 
> I think this needs to be cleverer than just a static config option.
> Pretty much everything else CMA-related is runtime-configurable to some
> degree, and doing any per-node setup when booting on a single-node
> system would be wasted effort.

I agree some dynamic configuration should be supported to set the size of cma.
It could be a kernel parameter bootargs, or leverage an existing parameter.

For a system with NUMA enabled, but with only one node or actually non-numa, 
the current dma_pernuma_cma_reserve() won't do anything:

void __init dma_pernuma_cma_reserve(void)
{
int nid;

if (!pernuma_size_bytes || nr_online_nodes <= 1)
return;
}

> 
> Since this is conceptually very similar to the existing hugetlb_cma
> implementation I'm also wondering about inconsistency with respect to
> specifying per-node vs. total sizes.

For hugetlbcma, pernuma cma size is: total_size/the number of nodes
It is quite simple.

But for DMA, node0's CMA might support address limitation for old
devices which allocate memory by GFP_DMA(32).

> 
> Another thought, though, is that systems large enough to have multiple
> NUMA nodes tend not to be short on memory, so it might not be
> unreasonable to base this all on whatever size the default area is
> given, and simply have a binary on/off switch to control the per-node
> aspect.

I don't mind applying the def

Re: [PATCH 09/13] x86: Remove dev->archdata.iommu pointer

2020-06-26 Thread Borislav Petkov
On Thu, Jun 25, 2020 at 03:08:32PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> There are no users left, all drivers have been converted to use the
> per-device private pointer offered by IOMMU core.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  arch/x86/include/asm/device.h | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/device.h b/arch/x86/include/asm/device.h
> index 49bd6cf3eec9..7c0a52ca2f4d 100644
> --- a/arch/x86/include/asm/device.h
> +++ b/arch/x86/include/asm/device.h
> @@ -3,9 +3,6 @@
>  #define _ASM_X86_DEVICE_H
>  
>  struct dev_archdata {
> -#ifdef CONFIG_IOMMU_API
> - void *iommu; /* hook for IOMMU specific extension */
> -#endif
>  };
>  
>  struct pdev_archdata {
> -- 

Acked-by: Borislav Petkov 

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

2020-06-26 Thread John Garry

On 23/06/2020 14:55, Rikard Falkeborn wrote:
Den tis 23 juni 2020 12:21John Garry > skrev:


On 23/06/2020 10:35, Rikard Falkeborn wrote:
 >
 >     I'd say that GENMASK_INPUT_CHECK() should be able to handle a
l=0 and
 >     h=unsigned value, so I doubt this warn.
 >
 >     Using GENMASK((int)cmdq->q.llq.max_n_shift, 0) resolves it,
but it
 >     looks
 >     like GENMASK_INPUT_CHECK() could be improved.
 >
 >
 > Indeed it could, it is fixed in -next.

ok, thanks for the pointer, but I still see this on today's -next with
this patch:

make W=1 drivers/iommu/arm-smmu-v3.o


Oh, ok thanks for reporting. I guess different gcc versions have 
different behaviour. I guess we'll have to change the comparison to 
(!((h) == (l) || (h) > (l))) instead (not sure I got all parenthesis and 
logic correct but you get the idea).




Yeah, so this looks to fix it:

--- a/include/linux/bits.h
+++ b/include/linux/bits.h
@@ -23,7 +23,8 @@
#include 
#define GENMASK_INPUT_CHECK(h, l) \
   (BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
-   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
+   __builtin_constant_p(!((h) == (l) ||(h) > (l))), !((h) 
== (l) ||(h) > (l)), 0)))

+

We may be able to just use (h) == (l) as the const expr to make it more 
concise, but that may be confusing.


I only tested with my toolchain based on 7.5.0

Thanks,
John

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v4 12/12] x86/traps: Fix up invalid PASID

2020-06-26 Thread Peter Zijlstra
On Thu, Jun 25, 2020 at 01:17:22PM -0700, Fenghua Yu wrote:

> +static bool fixup_pasid_exception(void)
> +{
> + if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> + return false;
> + if (!static_cpu_has(X86_FEATURE_ENQCMD))
> + return false;

elsewhere you had another variation:

+   if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+   return;
+
+   if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
+   return;

Which is it, and why do we need the CONFIG thing when combined with the
enabled thing?

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] pci: Add parameter to disable attaching untrusted devices

2020-06-26 Thread Oliver O'Halloran
On Fri, Jun 26, 2020 at 10:27 AM Rajat Jain  wrote:
>
> Introduce a PCI parameter that disables the automatic attachment of
> untrusted devices to their drivers.
>
> Signed-off-by: Rajat Jain 
> ---
> Context:
>
>   I set out to implement the approach outlined in
> https://lkml.org/lkml/2020/6/9/1331
> https://lkml.org/lkml/2020/6/15/1453
>
>   But to my surprise, I found that the new hotplugged PCI devices
>   were getting automatically attached to drivers even though
>   /sys/bus/pci/drivers_autoprobe was set to 0.
>
>   I realized that the device core's "drivers_autoprobe":
>
>   * only disables the *initial* probe of the device (i.e. from
> device_add()). If a subsystem calls device_attach() explicitly
> for its devices like PCI subsystem does, the drivers_autoprobe
> setting does not matter. The core will attach device to the driver.
> This looks like correct semantic behavior to me because PCI is
> explicitly calling device_attach(), which is a way to explicitly
> ask the core to find and attach a driver for a device.

Right, but we're doing using device_attach() largely because the
driver core doesn't provide any mechanism for deferring the initial
probe. I didn't think there was any deeper reason for it, but while
looking I noticed that the initial probe can be async and
device_attach() forces probing to be synchronous. That has the side
effect of serialising all PCI device probing which might be
intentional to avoid device renaming due to the change in probe order.
Userspace is better at dealing with device names changing now days,
but you might still get some people mad at you for changing it.

>   2) Make the drivers_autoprobe property available to PCI to use
>  (currently it is private to device core). The PCI could use this
>  to determine whether or not to call device_attach(). This still
>  leaves the other problem (of not being able to set
>  drivers_autoprobe via command line open).
>
>   3) I found the pci_dev->match_driver, which seemed similar to what I
>  am trying to do, but can't be controlled from userspace. I considered
>  populating that field based on drivers_autoprobe (still need (2)).
>  But the problem is that there is the AMD IOMMU driver which is setting
>  this independently, so setting the match_driver based on
>  drivers_autoprobe may not be a good idea.

Huh, that's pretty weird. Even with that hack you should be able
trigger the bug they're working around by removing the IOMMU device in
sysfs and doing a rescan. I wouldn't worry much about making
match_device user controllable since you would need to work pretty
hard for it to be an issue.

Oliver
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] iommu/amd: Use 'unsigned long' for domain->pt_root

2020-06-26 Thread Joerg Roedel
Hi Qian,

On Thu, Jun 25, 2020 at 11:37:20AM -0400, Qian Cai wrote:
> On Thu, Jun 25, 2020 at 04:52:27PM +0200, Joerg Roedel wrote:
> > -   u64 pt_root = atomic64_read(&domain->pt_root);
> > +   unsigned long pt_root = domain->pt_root;
> 
> The pt_root might be reload later in case of register pressure where the
> compiler decides to not store it as a stack variable, so it needs
> smp_rmb() here to match to the smp_wmb() in
> amd_iommu_domain_set_pt_root() to make the load visiable to all CPUs.
> 
> Then, smp_rmb/wmb() wouldn't be able to deal with data races, so it
> needs,
> 
> unsigned long pt_root = READ_ONCE(domain->pt_root);
> 
> >  
> > pgtable->root = (u64 *)(pt_root & PAGE_MASK);
> > pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
> > @@ -164,7 +164,13 @@ static void amd_iommu_domain_get_pgtable(struct 
> > protection_domain *domain,
> >  
> >  static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, 
> > u64 root)
> >  {
> > -   atomic64_set(&domain->pt_root, root);
> > +   domain->pt_root = root;
> 
> WRITE_ONCE(domain->pt_root, root);

Thanks for your review. I addressed your comments and will send an
updated version shortly.


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 2/2] iommu/amd: Use 'unsigned long' for domain->pt_root

2020-06-26 Thread Joerg Roedel
From: Joerg Roedel 

Using atomic64_t can be quite expensive, so use unsigned long instead.
This is safe because the write becomes visible atomically.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/iommu.c   | 15 +--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..f6f102282dda 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -468,7 +468,7 @@ struct protection_domain {
   iommu core code */
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
-   atomic64_t pt_root; /* pgtable root and pgtable mode */
+   unsigned long pt_root;  /* pgtable root and pgtable mode */
int glx;/* Number of levels for GCR3 table */
u64 *gcr3_tbl;  /* Guest CR3 table */
unsigned long flags;/* flags to find out type of domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5286ddcfc2f9..aec585f47646 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -156,7 +156,12 @@ static struct protection_domain *to_pdomain(struct 
iommu_domain *dom)
 static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 struct domain_pgtable *pgtable)
 {
-   u64 pt_root = atomic64_read(&domain->pt_root);
+   unsigned long pt_root;
+
+   /* Match the barrier in amd_iommu_domain_set_pt_root() */
+   smp_rmb();
+
+   pt_root = READ_ONCE(domain->pt_root);
 
pgtable->root = (u64 *)(pt_root & PAGE_MASK);
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
@@ -164,7 +169,13 @@ static void amd_iommu_domain_get_pgtable(struct 
protection_domain *domain,
 
 static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 
root)
 {
-   atomic64_set(&domain->pt_root, root);
+   WRITE_ONCE(domain->pt_root, root);
+
+   /*
+* The new value needs to be gobally visible in case pt_root gets
+* cleared, so that the page-table can be safely freed.
+*/
+   smp_wmb();
 }
 
 static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
-- 
2.27.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/2] iommu/amd: Add helper functions to update domain->pt_root

2020-06-26 Thread Joerg Roedel
From: Joerg Roedel 

Do not call atomic64_set() directly to update the domain page-table
root and use two new helper functions.

This makes it easier to implement additional work necessary when
the page-table is updated.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 74cca1757172..5286ddcfc2f9 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -162,7 +162,18 @@ static void amd_iommu_domain_get_pgtable(struct 
protection_domain *domain,
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
-static u64 amd_iommu_domain_encode_pgtable(u64 *root, int mode)
+static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 
root)
+{
+   atomic64_set(&domain->pt_root, root);
+}
+
+static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
+{
+   amd_iommu_domain_set_pt_root(domain, 0);
+}
+
+static void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
+u64 *root, int mode)
 {
u64 pt_root;
 
@@ -170,7 +181,7 @@ static u64 amd_iommu_domain_encode_pgtable(u64 *root, int 
mode)
pt_root = mode & 7;
pt_root |= (u64)root;
 
-   return pt_root;
+   amd_iommu_domain_set_pt_root(domain, pt_root);
 }
 
 static struct iommu_dev_data *alloc_dev_data(u16 devid)
@@ -1410,7 +1421,7 @@ static bool increase_address_space(struct 
protection_domain *domain,
struct domain_pgtable pgtable;
unsigned long flags;
bool ret = true;
-   u64 *pte, root;
+   u64 *pte;
 
spin_lock_irqsave(&domain->lock, flags);
 
@@ -1438,8 +1449,7 @@ static bool increase_address_space(struct 
protection_domain *domain,
 * Device Table needs to be updated and flushed before the new root can
 * be published.
 */
-   root = amd_iommu_domain_encode_pgtable(pte, pgtable.mode);
-   atomic64_set(&domain->pt_root, root);
+   amd_iommu_domain_set_pgtable(domain, pte, pgtable.mode);
 
ret = true;
 
@@ -2319,7 +2329,7 @@ static void protection_domain_free(struct 
protection_domain *domain)
domain_id_free(domain->id);
 
amd_iommu_domain_get_pgtable(domain, &pgtable);
-   atomic64_set(&domain->pt_root, 0);
+   amd_iommu_domain_clr_pt_root(domain);
free_pagetable(&pgtable);
 
kfree(domain);
@@ -2327,7 +2337,7 @@ static void protection_domain_free(struct 
protection_domain *domain)
 
 static int protection_domain_init(struct protection_domain *domain, int mode)
 {
-   u64 *pt_root = NULL, root;
+   u64 *pt_root = NULL;
 
BUG_ON(mode < PAGE_MODE_NONE || mode > PAGE_MODE_6_LEVEL);
 
@@ -2343,8 +2353,7 @@ static int protection_domain_init(struct 
protection_domain *domain, int mode)
return -ENOMEM;
}
 
-   root = amd_iommu_domain_encode_pgtable(pt_root, mode);
-   atomic64_set(&domain->pt_root, root);
+   amd_iommu_domain_set_pgtable(domain, pt_root, mode);
 
return 0;
 }
@@ -2713,8 +2722,8 @@ void amd_iommu_domain_direct_map(struct iommu_domain *dom)
/* First save pgtable configuration*/
amd_iommu_domain_get_pgtable(domain, &pgtable);
 
-   /* Update data structure */
-   atomic64_set(&domain->pt_root, 0);
+   /* Remove page-table from domain */
+   amd_iommu_domain_clr_pt_root(domain);
 
/* Make changes visible to IOMMUs */
update_domain(domain);
-- 
2.27.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/2] iommu/amd: Don't use atomic64_t for domain->pt_root

2020-06-26 Thread Joerg Roedel
From: Joerg Roedel 

Hi,

a previous discussion pointed out that using atomic64_t for that
purpose is a bit of overkill. This patch-set replaces it with unsigned
long and introduces some helpers first to make the change more easy.

Qian, can you please test these patches in your environment? You can
trigger any race-condition there pretty reliably :)

Other than that, please review and test.

Regards,

Joerg

Changed to v1:

- Addressed review comments from Qian.

Joerg Roedel (2):
  iommu/amd: Add helper functions to update domain->pt_root
  iommu/amd: Use 'unsigned long' for domain->pt_root

 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/iommu.c   | 44 +
 2 files changed, 33 insertions(+), 13 deletions(-)

-- 
2.27.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 02/14] iommu: Report domain nesting info

2020-06-26 Thread Jean-Philippe Brucker
On Wed, Jun 24, 2020 at 01:55:15AM -0700, Liu Yi L wrote:
> IOMMUs that support nesting translation needs report the capability info
> to userspace, e.g. the format of first level/stage paging structures.
> 
> This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
> nesting info after setting DOMAIN_ATTR_NESTING.
> 
> v2 -> v3:
> *) remvoe cap/ecap_mask in iommu_nesting_info.
> *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> *) return an empty iommu_nesting_info for SMMU drivers per Jean'
>suggestion.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/arm-smmu-v3.c | 29 --
>  drivers/iommu/arm-smmu.c| 29 --

Looks reasonable to me. Please move the SMMU changes to a separate patch
and Cc the SMMU maintainers:

Cc: Will Deacon 
Cc: Robin Murphy 

Thanks,
Jean

>  include/uapi/linux/iommu.h  | 59 
> +
>  3 files changed, 113 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index f578677..0c45d4d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -3019,6 +3019,32 @@ static struct iommu_group 
> *arm_smmu_device_group(struct device *dev)
>   return group;
>  }
>  
> +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,
> + void *data)
> +{
> + struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
> + u32 size;
> +
> + if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> + return -ENODEV;
> +
> + size = sizeof(struct iommu_nesting_info);
> +
> + /*
> +  * if provided buffer size is not equal to the size, should
> +  * return 0 and also the expected buffer size to caller.
> +  */
> + if (info->size != size) {
> + info->size = size;
> + return 0;
> + }
> +
> + /* report an empty iommu_nesting_info for now */
> + memset(info, 0x0, size);
> + info->size = size;
> + return 0;
> +}
> +
>  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>   enum iommu_attr attr, void *data)
>  {
> @@ -3028,8 +3054,7 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>   case IOMMU_DOMAIN_UNMANAGED:
>   switch (attr) {
>   case DOMAIN_ATTR_NESTING:
> - *(int *)data = (smmu_domain->stage == 
> ARM_SMMU_DOMAIN_NESTED);
> - return 0;
> + return arm_smmu_domain_nesting_info(smmu_domain, data);
>   default:
>   return -ENODEV;
>   }
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 243bc4c..908607d 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1506,6 +1506,32 @@ static struct iommu_group 
> *arm_smmu_device_group(struct device *dev)
>   return group;
>  }
>  
> +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,
> + void *data)
> +{
> + struct iommu_nesting_info *info = (struct iommu_nesting_info *) data;
> + u32 size;
> +
> + if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> + return -ENODEV;
> +
> + size = sizeof(struct iommu_nesting_info);
> +
> + /*
> +  * if provided buffer size is not equal to the size, should
> +  * return 0 and also the expected buffer size to caller.
> +  */
> + if (info->size != size) {
> + info->size = size;
> + return 0;
> + }
> +
> + /* report an empty iommu_nesting_info for now */
> + memset(info, 0x0, size);
> + info->size = size;
> + return 0;
> +}
> +
>  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>   enum iommu_attr attr, void *data)
>  {
> @@ -1515,8 +1541,7 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>   case IOMMU_DOMAIN_UNMANAGED:
>   switch (attr) {
>   case DOMAIN_ATTR_NESTING:
> - *(int *)data = (smmu_domain->stage == 
> ARM_SMMU_DOMAIN_NESTED);
> - return 0;
> + return arm_smmu_domain_nesting_info(smmu_domain, data);
>   default:
>   return -ENODEV;
>   }
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 1afc661..898c99a 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -332,4 +332,63 @@ struct iommu_gpasid_bind_data {
>   } vendor;
>  };
>  
> +/*
> + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> + *   user space s

Re: [PATCH v2 3/5] irqchip: Allow QCOM_PDC to be loadable as a permanent module

2020-06-26 Thread Stephen Boyd
Quoting John Stultz (2020-06-24 17:10:37)
> diff --git a/drivers/irqchip/qcom-pdc.c b/drivers/irqchip/qcom-pdc.c
> index 6ae9e1f0819d..3fee8b655da1 100644
> --- a/drivers/irqchip/qcom-pdc.c
> +++ b/drivers/irqchip/qcom-pdc.c
> @@ -430,4 +432,33 @@ static int qcom_pdc_init(struct device_node *node, 
> struct device_node *parent)
> return ret;
>  }
>  
> +#ifdef MODULE
> +static int qcom_pdc_probe(struct platform_device *pdev)
> +{
> +   struct device_node *np = pdev->dev.of_node;
> +   struct device_node *parent = of_irq_find_parent(np);
> +
> +   return qcom_pdc_init(np, parent);
> +}
> +
> +static const struct of_device_id qcom_pdc_match_table[] = {
> +   { .compatible = "qcom,pdc" },
> +   {}
> +};
> +MODULE_DEVICE_TABLE(of, qcom_pdc_match_table);
> +
> +static struct platform_driver qcom_pdc_driver = {
> +   .probe = qcom_pdc_probe,
> +   .driver = {
> +   .name = "qcom-pdc",
> +   .of_match_table = qcom_pdc_match_table,
> +   .suppress_bind_attrs = true,
> +   },
> +};
> +module_platform_driver(qcom_pdc_driver);
> +#else
>  IRQCHIP_DECLARE(qcom_pdc, "qcom,pdc", qcom_pdc_init);

Is there any reason to use IRQCHIP_DECLARE if this can work as a
platform device driver?

> +#endif
> +
> +MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Power Domain Controller");
> +MODULE_LICENSE("GPL v2");
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


the XSK buffer pool needs be to reverted

2020-06-26 Thread Christoph Hellwig
Hi Björn,

you addition of the xsk_buff_pool.c APIs in commit 2b43470add8c
("xsk: Introduce AF_XDP buffer allocation API") is unfortunately rather
broken by making lots of assumptions and poking into dma-direct and
swiotlb internals that are of no business to outside users and clearly
marked as such.   I'd be glad to work with your doing something proper
for pools, but that needs proper APIs and probably live in the dma
mapping core, but for that you'd actually need to contact the relevant
maintainers before poking into internals.

The commit seems to have a long dove tail of commits depending on it
despite only being a month old, so maybe you can do the revert for now?

Note that this is somewhat urgent, as various of the APIs that the code
is abusing are slated to go away for Linux 5.9, so this addition comes
at a really bad time.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu